
What is Structured Output?
Structured output is a capability that forces a language model to produce output in a specific, machine-readable format, typically JSON conforming to a predefined schema. The terms JSON mode and constrained generation are often used for the same idea, although JSON mode usually guarantees only syntactically valid JSON rather than conformance to a particular schema. Instead of free-form text, the model returns data that applications can reliably parse and use.
Why It Matters
LLMs naturally produce free-form text, but applications need structured data: function parameters, database entries, API payloads, form fields. Structured output bridges this gap. Without it, developers must write brittle parsers to extract data from prose, and those parsers fail unpredictably whenever the model changes its wording. With it, LLMs become usable as dependable components in software systems.
How It Works
Structured output can be achieved through several mechanisms:
1. API-level enforcement (most reliable):
- The API accepts a JSON Schema alongside the prompt
- The model's token sampling is constrained to only produce tokens that form valid JSON matching the schema
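- When decoding is constrained this way, a completed response parses and matches the schema; a response cut off by the token limit or a refusal can still fall outside it, and schema validity does not make the extracted values correct
- Offered in some form by OpenAI (structured outputs), Anthropic (tool use), and Google (response schema); how strictly the schema is enforced varies by provider and mode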
2. Guided generation (open-source):
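- Libraries like Outlines or llama.cpp (via its grammar support) apply grammar-based constraints during generation; Instructor is often grouped with them but instead validates responses against a schema and retries on failure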
- Token probabilities are masked to enforce the desired format
- Works with any model that exposes logits
3. Prompt-based (least reliable):
- Instruct the model via prompt: "Respond in JSON with these fields..."
- No enforcement — model may deviate, add commentary, or produce invalid JSON
- Useful when API enforcement isn't available
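The token-masking idea behind the first two mechanisms can be sketched with a toy decoder. This is an illustration only, not how any particular API or library implements it: `fake_logits`, `ALLOWED_OUTPUTS`, and the character-level vocabulary are hypothetical stand-ins for a real model, a real schema, and a real tokenizer.

```python
import math

# Hypothetical constraint: the output must be exactly one of these JSON strings.
# A real system would derive the allowed tokens from a JSON Schema or grammar.
ALLOWED_OUTPUTS = ['{"ok": true}', '{"ok": false}']
VOCAB = sorted(set("".join(ALLOWED_OUTPUTS)) | set("Sure!"))


def fake_logits(prefix: str) -> dict[str, float]:
    """Stand-in for a model: left alone, it would start with the prose "Sure!"."""
    chatty = "Sure!"
    logits = {tok: 0.0 for tok in VOCAB}
    logits["t"] = 1.0  # mild preference for "true" over "false"
    if len(prefix) < len(chatty):
        logits[chatty[len(prefix)]] = 5.0
    return logits


def allowed_next_tokens(prefix: str) -> set[str]:
    """Tokens that keep the prefix extendable to some allowed output."""
    return {
        out[len(prefix)]
        for out in ALLOWED_OUTPUTS
        if out.startswith(prefix) and len(out) > len(prefix)
    }


def constrained_decode() -> str:
    prefix = ""
    while prefix not in ALLOWED_OUTPUTS:
        logits = fake_logits(prefix)
        allowed = allowed_next_tokens(prefix)
        # Mask step: disallowed tokens get -inf, so they can never be selected.
        masked = {t: (s if t in allowed else -math.inf) for t, s in logits.items()}
        prefix += max(masked, key=masked.get)  # greedy choice among allowed tokens
    return prefix


result = constrained_decode()
```

Even though the toy model scores the characters of "Sure!" highest at the first steps, the mask leaves only tokens that extend a valid output, so decoding can only finish on one of the allowed JSON strings. The prompt-based approach has no such mask, which is why it can drift into commentary.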
Common use cases:
- Extracting structured data from documents (invoices, emails, reports)
- Generating function call parameters
- Creating database records from natural language
- Building data pipelines with LLM-powered extraction
- Classification with confidence scores
Example
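A developer building an expense tracker sends receipt photos to a vision-capable model through the API, along with a JSON Schema requiring {vendor: string, amount: number, currency: string, date: string, category: string}. The model extracts the information and returns JSON that conforms to the schema, so the application can parse it without custom text handling and insert it into the database. The schema constrains the shape of the data, not its accuracy, so values such as the amount or date may still deserve a sanity check.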
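The schema in this example might be written out and checked roughly as follows. This is a minimal sketch using only the standard library: the `required` and `additionalProperties` settings, the `check_expense` helper, and the hard-coded response are assumptions for illustration, and the API call itself is omitted. In practice a full JSON Schema validator would replace the hand-rolled check.

```python
import json

# JSON Schema for the expense record. The source names only the fields and
# their types; "required" and "additionalProperties" are assumptions.
EXPENSE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "amount": {"type": "number"},
        "currency": {"type": "string"},
        "date": {"type": "string"},
        "category": {"type": "string"},
    },
    "required": ["vendor", "amount", "currency", "date", "category"],
    "additionalProperties": False,
}

# Maps the two JSON Schema types used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float)}


def check_expense(raw: str) -> dict:
    """Parse a model response and check it against EXPENSE_SCHEMA.

    Hypothetical helper: covers only the schema keywords used above.
    """
    record = json.loads(raw)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    missing = set(EXPENSE_SCHEMA["required"]) - set(record)
    extra = set(record) - set(EXPENSE_SCHEMA["properties"])
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for name, spec in EXPENSE_SCHEMA["properties"].items():
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            raise ValueError(f"{name!r} should be a {spec['type']}")
    return record


# Hypothetical model response, hard-coded for illustration.
response = (
    '{"vendor": "Cafe Uno", "amount": 12.5, "currency": "EUR", '
    '"date": "2024-03-01", "category": "meals"}'
)
expense = check_expense(response)
```

With API-level enforcement this check should rarely fail, but keeping it is cheap and also covers the prompt-based fallback, where nothing upstream enforces the format.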