Overview
All synthesizers extend BaseSynthesizer and implement the synthesize() method. They differ in how they combine multiple text chunks into a coherent response.
Synthesis Modes
LlamaIndex provides four built-in synthesis strategies:
Compact (Default)
Best for: Most use cases; balances quality and efficiency.
Compacts text chunks to fit within the context window, then refines the response:
- Combines chunks to maximize context window usage
- Generates initial response from first compact chunk
- Refines response with subsequent chunks
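The packing step above can be sketched as follows. This is a minimal illustration, not the library's implementation: compactChunks, estimateTokens, and the four-characters-per-token estimate are assumptions made for this example. A real synthesizer would use the model's tokenizer and reserve room for the prompt template.

```typescript
// Rough token estimate (assumption for this sketch): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily merge consecutive chunks so each merged chunk stays within maxTokens.
// A single chunk that is already over budget is kept as-is rather than split.
function compactChunks(chunks: string[], maxTokens: number): string[] {
  const compacted: string[] = [];
  let current = "";
  for (const chunk of chunks) {
    const candidate = current === "" ? chunk : current + "\n\n" + chunk;
    if (current !== "" && estimateTokens(candidate) > maxTokens) {
      // Adding this chunk would overflow the budget: close the current group.
      compacted.push(current);
      current = chunk;
    } else {
      current = candidate;
    }
  }
  if (current !== "") {
    compacted.push(current);
  }
  return compacted;
}
```

Fewer, fuller chunks mean fewer refinement passes afterwards, which is where the efficiency of this mode comes from.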
Refine
Best for: Comprehensive answers requiring all context.
Builds response iteratively, refining with each chunk:
- Generate initial answer from first chunk
- For each subsequent chunk:
  - Present existing answer + new chunk
  - Ask LLM to refine the answer
- Return final refined answer
Pros:
- Most comprehensive, considers all context
- Good for complex queries
Cons:
- Requires multiple LLM calls (one per chunk)
- Slower and more expensive
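The refine loop described above might look roughly like this. It is a sketch under stated assumptions: RefineLlm is a hypothetical stand-in for whatever LLM client you use, and the prompt wording is invented for illustration rather than taken from the library.

```typescript
// Hypothetical LLM interface for this sketch: takes a prompt, resolves to text.
type RefineLlm = (prompt: string) => Promise<string>;

async function refineAnswer(
  query: string,
  chunks: string[],
  llm: RefineLlm,
): Promise<string> {
  if (chunks.length === 0) {
    return "";
  }
  // Initial answer from the first chunk.
  let answer = await llm(
    `Context:\n${chunks[0]}\n\nQuestion: ${query}\nAnswer:`,
  );
  // One additional call per remaining chunk; each call sees the previous answer.
  for (const chunk of chunks.slice(1)) {
    answer = await llm(
      `Question: ${query}\nExisting answer: ${answer}\n` +
        `New context:\n${chunk}\n\n` +
        `Refine the existing answer if the new context helps; otherwise repeat it.`,
    );
  }
  return answer;
}
```

Because each call depends on the previous answer, the calls cannot run in parallel, which is why this mode is the slowest.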
Tree Summarize
Best for: Summarization tasks, parallel processing.
Recursively summarizes chunks in a tree structure:
- Pack chunks to fit context window
- If single chunk: generate answer directly
- If multiple chunks:
  - Summarize each chunk in parallel
  - Recursively summarize summaries
- Return final summary
Pros:
- Parallelizable (faster for many chunks)
- Good for summarization
Cons:
- May lose details in recursive summarization
- Not ideal for precise Q&A
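The recursion above can be sketched as follows. Everything here is illustrative rather than the library's code: SummarizeLlm is a hypothetical LLM function, packTexts uses a character budget as a crude stand-in for a token budget, and the depth limit is an added safeguard in case summaries fail to shrink.

```typescript
// Hypothetical LLM interface for this sketch.
type SummarizeLlm = (prompt: string) => Promise<string>;

// Merge consecutive texts into groups of at most maxChars characters.
function packTexts(texts: string[], maxChars: number): string[] {
  const packed: string[] = [];
  let current = "";
  for (const text of texts) {
    const candidate = current === "" ? text : current + "\n\n" + text;
    if (current !== "" && candidate.length > maxChars) {
      packed.push(current);
      current = text;
    } else {
      current = candidate;
    }
  }
  if (current !== "") packed.push(current);
  return packed;
}

const MAX_TREE_DEPTH = 8; // arbitrary safety limit (assumption)

async function treeSummarize(
  query: string,
  chunks: string[],
  llm: SummarizeLlm,
  maxChars: number,
  depth = 0,
): Promise<string> {
  const packed = packTexts(chunks, maxChars);
  if (packed.length === 0) return "";
  const ask = (context: string) =>
    llm(`Context:\n${context}\n\nQuestion: ${query}\nAnswer:`);
  // Single chunk (or depth limit reached): answer directly in one call.
  if (packed.length === 1 || depth >= MAX_TREE_DEPTH) {
    return ask(packed.join("\n\n"));
  }
  // Multiple chunks: summarize each in parallel, then recurse on the summaries.
  const summaries = await Promise.all(packed.map((text) => ask(text)));
  return treeSummarize(query, summaries, llm, maxChars, depth + 1);
}
```

The calls within one level are independent, so they can run concurrently; only the levels themselves are sequential.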
Multi-Modal
Best for: Images and multi-modal content.
Handles images and other non-text content:
- Preserves multi-modal content (text + images)
- Formats prompt with all content types
- Sends to multi-modal LLM
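As a rough illustration of the formatting step above, a prompt can be built as an ordered list of typed parts so that images are passed through rather than flattened to text. The ContentPart types and buildMultiModalPrompt are hypothetical names for this sketch; the actual message format depends on the multi-modal LLM you send it to.

```typescript
// Hypothetical content types for this sketch; real libraries define their own.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; url: string };
type ContentPart = TextPart | ImagePart;

// Build one ordered list of parts: an instruction, the retrieved content
// in its original order (text and images alike), then the question.
function buildMultiModalPrompt(
  query: string,
  retrieved: ContentPart[],
): ContentPart[] {
  const instruction: TextPart = {
    type: "text",
    text: "Answer the question using the context below.",
  };
  const question: TextPart = { type: "text", text: `Question: ${query}` };
  return [instruction, ...retrieved, question];
}
```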
Factory Function
Use getResponseSynthesizer() for simple cases. Each mode string maps to a synthesizer class:
- "compact" - CompactAndRefine
- "refine" - Refine
- "tree_summarize" - TreeSummarize
- "multi_modal" - MultiModal
Streaming Responses
All synthesizers support streaming.
Custom Prompts
Customize the prompts used by synthesizers.
Using with Query Engines
Integrate synthesizers into query engines.
Custom Synthesizers
Implement custom synthesis logic.
Choosing a Synthesizer
| Synthesizer | Speed | Quality | Cost | Best For |
|---|---|---|---|---|
| Compact | Fast | Good | Low | General Q&A |
| Refine | Slow | Best | High | Complex queries |
| Tree Summarize | Medium | Good | Medium | Summarization |
| Multi-Modal | Fast | Good | Low | Images + text |
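One way to encode the table above is a small selection helper that returns a mode string. This is only a sketch: SelectionInput, chooseMode, and the order in which the conditions are checked are assumptions for illustration, not rules from the library.

```typescript
type SynthesisMode = "compact" | "refine" | "tree_summarize" | "multi_modal";

// Hypothetical description of a request, used only by this sketch.
interface SelectionInput {
  hasImages: boolean;
  isSummarization: boolean;
  needsMaxQuality: boolean;
}

function chooseMode(input: SelectionInput): SynthesisMode {
  // Images require the multi-modal synthesizer regardless of other factors.
  if (input.hasImages) return "multi_modal";
  // Summarization tasks suit the tree strategy.
  if (input.isSummarization) return "tree_summarize";
  // Pay for refine only when answer quality outweighs speed and cost.
  if (input.needsMaxQuality) return "refine";
  // Default: the general-purpose balance of quality and efficiency.
  return "compact";
}
```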
Best Practices
Prompt Engineering:
- Customize prompts for your domain
- Include examples in prompts for better results
- Test prompts with different synthesizers
Synthesizer Selection:
- Use "compact" for most cases (good balance)
- Use "tree_summarize" when you have many chunks
- Avoid "refine" unless you need maximum quality
Retrieval and Token Usage:
- Retrieve more nodes than needed and let the synthesizer select the best ones
- Use postprocessors before synthesis to filter nodes
- Monitor token usage to avoid context window issues
Next Steps
- Postprocessors: Filter and rerank nodes before synthesis
- Evaluation: Measure and improve response quality