
Few-shot prompting is a technique where you include a small number of worked examples (typically 2-10) directly in the prompt to demonstrate the desired task, format, and reasoning pattern to a large language model (LLM). Instead of fine-tuning the model or relying solely on instructions, you show the model what you want by providing input-output pairs; the LLM learns from these examples in-context, without any weight updates. Few-shot prompting often improves accuracy by 20-30 percentage points over zero-shot prompting (instructions alone), making it one of the highest-ROI prompt engineering techniques available. The quality of examples matters more than quantity: three carefully chosen, representative examples consistently outperform ten mediocre ones.
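To make the contrast concrete, here is a minimal sketch of the same classification task phrased zero-shot and few-shot. The task, labels, and review texts are invented for illustration; only the structure (worked input-output pairs followed by the unsolved instance) is the point.

```python
# Zero-shot: instructions only, no worked examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Few-shot: the same instruction plus three solved input-output pairs.
examples = [
    ("Absolutely love it, works perfectly.", "positive"),
    ("Broke within a week, very disappointed.", "negative"),
    ("Great value for the price.", "positive"),
]

few_shot = "Classify the sentiment of each review as positive or negative.\n\n"
for review, label in examples:
    few_shot += f"Review: {review}\nSentiment: {label}\n\n"
# The unsolved instance goes last; the model continues the pattern.
few_shot += "Review: The battery died after two days.\nSentiment:"

print(few_shot)
```

Note that every example uses the identical "Review: / Sentiment:" template, and the prompt ends mid-pattern so the model's most natural continuation is the answer itself.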
Why it matters
Few-shot prompting is the most cost-effective way to substantially improve LLM output quality without any model training. For tasks requiring structured output (entity extraction, classification, data transformation, code generation in specific patterns), few-shot prompting closes much of the gap between a generic instruction ("extract entities from this text") and a fine-tuned model. Reported accuracy improvements are substantial: entity extraction jumping from 60% to 85%, classification from 72% to 88%, and mathematical reasoning from 30% to 60%. For businesses, this means that spending 30 minutes crafting good examples can save months of fine-tuning work and thousands of euros in training compute, while remaining instantly updatable: you swap examples rather than retrain a model.
How it works
When an LLM receives a prompt containing examples, it identifies the patterns across those examples — input format, output structure, reasoning approach, edge case handling — and applies those patterns to the new input. This works because LLMs are fundamentally pattern-completion machines: they predict what should come next based on the preceding context. By structuring the context as a series of solved examples followed by an unsolved instance, the model continues the established pattern. The effectiveness depends on several factors: example diversity (covering different cases), example quality (correct and clearly formatted), ordering (strongest examples first), and format consistency (identical structure across examples). Few-shot prompting consumes more tokens than zero-shot — each example adds to the input cost — creating a natural tension between accuracy and cost that teams must optimize for their use case.
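The mechanics described above can be sketched as a small prompt builder. The function names, labels, and the rough 4-characters-per-token heuristic are assumptions for illustration, not a standard API; the sketch shows format consistency across examples, solved pairs first, and the unsolved instance last, plus the token cost each added example incurs.

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Input", output_label="Output"):
    """Assemble a few-shot prompt with one consistent example template."""
    parts = [instruction, ""]
    for inp, out in examples:                 # solved examples, in chosen order
        parts.append(f"{input_label}: {inp}")
        parts.append(f"{output_label}: {out}")
        parts.append("")
    parts.append(f"{input_label}: {query}")   # unsolved instance goes last
    parts.append(f"{output_label}:")          # model completes the pattern
    return "\n".join(parts)

def estimate_tokens(text):
    """Very rough token estimate (~4 characters per token for English)."""
    return len(text) // 4

# Hypothetical date-normalization task: two worked examples, one query.
prompt = build_few_shot_prompt(
    "Normalize each date to ISO 8601 (YYYY-MM-DD).",
    [("March 5, 2024", "2024-03-05"), ("5/3/24", "2024-03-05")],
    "Dec 1st, 2023",
)
print(prompt)
print("approx. tokens:", estimate_tokens(prompt))
```

Each example pair you append raises the estimated token count, which is exactly the accuracy-versus-cost tension noted above: more examples usually help, but every one of them is billed on every request.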
Example
An e-commerce company needs to extract product attributes from unstructured supplier descriptions. Their zero-shot prompt ("Extract brand, color, material, and size from this product description") achieves 65% accuracy — frequently missing attributes mentioned indirectly or using unusual formatting. They add three few-shot examples covering typical patterns: a straightforward description with explicit attributes, one where attributes are implied ("Italian leather" → material: leather, origin: Italy), and one with multiple products in a single description. Accuracy jumps to 89%. They add two edge-case examples — descriptions in mixed languages and abbreviated specifications — pushing accuracy to 93%. The total prompt grows from 80 tokens to 450 tokens, increasing per-request cost by 5×, but the 93% accuracy eliminates the manual review step that previously cost €8,000 per month in human labor.
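A sketch of what the company's prompt might look like. The three examples mirror the cases described (explicit attributes, implied attributes, multiple products in one description), but the actual descriptions, brand names, and JSON schema are invented for illustration.

```python
import json

# Hypothetical few-shot examples: (description, extracted attributes).
EXAMPLES = [
    # 1. Straightforward description with explicit attributes.
    ("Nordic Trek hiking jacket, blue, 100% polyester, size L",
     [{"brand": "Nordic Trek", "color": "blue",
       "material": "polyester", "size": "L"}]),
    # 2. Implied attributes: "Italian leather" -> material and origin.
    ("Classic briefcase in Italian leather",
     [{"brand": None, "color": None, "material": "leather",
       "size": None, "origin": "Italy"}]),
    # 3. Multiple products in a single description.
    ("Set: Acme mug (red, ceramic) and Acme coaster (cork)",
     [{"brand": "Acme", "color": "red", "material": "ceramic", "size": None},
      {"brand": "Acme", "color": None, "material": "cork", "size": None}]),
]

def build_extraction_prompt(description):
    """Build the extraction prompt: instruction, worked examples, new input."""
    lines = ["Extract brand, color, material, and size from each product "
             "description. Return a JSON list with one object per product; "
             "use null for attributes that are not stated or implied.", ""]
    for text, attrs in EXAMPLES:
        lines.append(f"Description: {text}")
        lines.append(f"Attributes: {json.dumps(attrs)}")
        lines.append("")
    lines.append(f"Description: {description}")
    lines.append("Attributes:")
    return "\n".join(lines)

prompt = build_extraction_prompt("Zenith steel water bottle, 750 ml")
print(prompt)
```

Serializing the expected output as JSON in every example teaches both the schema and the null-handling convention, so the model's completions can be parsed directly by the downstream pipeline.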