How it works
MMR iteratively selects documents by balancing query relevance against redundancy with already-selected documents.
MMR algorithm
Each candidate document d is scored with:

MMR(d) = λ · sim(d, query) − (1 − λ) · max_sim(d, selected)

where:
- λ (lambda_param) - Trade-off between relevance and diversity (0.0-1.0)
- sim(d, query) - Cosine similarity between document and query embeddings
- max_sim(d, selected) - Maximum similarity to any already-selected document
Selection process
- First document - Select most relevant to query
- Iterative selection - For remaining slots:
- Calculate relevance to query for each candidate
- Calculate redundancy (max similarity to selected docs)
- Compute MMR score with lambda weighting
- Select document with highest MMR score
- Repeat - Until k documents selected
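As an illustration, here is a single greedy step with made-up similarity values (all numbers are hypothetical), showing how a less relevant but less redundant document can win:

```python
# One MMR scoring step with hypothetical similarity values, lambda = 0.5.
# d1 is already selected; d2 is a near-duplicate of d1, d3 covers a new aspect.
lam = 0.5
relevance = {"d2": 0.80, "d3": 0.75}    # sim(d, query)
redundancy = {"d2": 0.90, "d3": 0.30}   # max_sim(d, selected)

# MMR(d) = lam * relevance - (1 - lam) * redundancy
scores = {d: lam * relevance[d] - (1 - lam) * redundancy[d] for d in relevance}
best = max(scores, key=scores.get)
print(best, scores)  # d3 wins with 0.225 vs -0.05 for the near-duplicate d2
```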
Lambda parameter guidelines
Controls the relevance-diversity trade-off:
- λ = 1.0 - Pure relevance ranking (no diversity penalty)
- λ = 0.7-0.8 - Emphasize relevance, mild diversity (recommended for precision)
- λ = 0.5 - Balanced relevance and diversity (good default)
- λ = 0.3-0.4 - Emphasize diversity (recommended for exploratory search)
- λ = 0.0 - Pure diversity (minimum redundancy, ignores relevance)
Key features
- Tune relevance vs diversity to fit the task
- Uses cosine similarity for both relevance and diversity scoring
- Particularly useful for summarization and exploratory search
- Greedy selection keeps runtime low (O(k × n) overall)
Implementation
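A minimal NumPy sketch of the greedy selection described above. Function and variable names here are illustrative, not the project's actual API:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr_select(query_emb, doc_embs, k, lambda_param=0.5):
    """Greedy MMR: return indices of k selected documents."""
    relevance = [cosine_sim(query_emb, d) for d in doc_embs]
    selected = [int(np.argmax(relevance))]          # first pick: most relevant
    candidates = set(range(len(doc_embs))) - set(selected)
    while len(selected) < k and candidates:
        best_idx, best_score = None, -np.inf
        for i in candidates:
            # Redundancy: max similarity to any already-selected document
            redundancy = max(cosine_sim(doc_embs[i], doc_embs[j]) for j in selected)
            score = lambda_param * relevance[i] - (1 - lambda_param) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected
```

With a high λ the second pick tracks raw relevance; with a low λ it jumps to the least redundant candidate, matching the lambda guidelines above.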
Use cases
Exploratory search
When users need to understand different aspects of a topic.
Multi-document summarization
Provide diverse context to LLMs.
Reducing near-duplicates
When search returns many similar results.
High-precision search
When relevance is critical.
Lambda parameter tuning
Task-specific recommendations
| Task type | Recommended λ | Rationale |
|---|---|---|
| Q&A systems | 0.7-0.8 | Prioritize relevant answers |
| Exploratory search | 0.3-0.4 | Show diverse perspectives |
| Summarization | 0.4-0.6 | Balance coverage and relevance |
| Deduplication | 0.2-0.4 | Maximize uniqueness |
| Fact verification | 0.6-0.7 | Relevant but diverse sources |
Tuning guidelines
Adjust based on metrics
- Too much redundancy? Decrease lambda (more diversity)
- Missing relevant results? Increase lambda (more relevance)
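One way to quantify redundancy when tuning (a sketch under the assumption that result embeddings are available; the helper name is hypothetical): compute the mean pairwise cosine similarity of the returned set, and lower lambda when it stays high.

```python
import numpy as np

def avg_pairwise_similarity(embs):
    """Mean cosine similarity across all result pairs.

    A persistently high value suggests decreasing lambda (more diversity);
    a low value with poor recall suggests increasing it.
    """
    embs = np.array([e / np.linalg.norm(e) for e in embs])
    sims = embs @ embs.T                       # cosine similarities (unit vectors)
    iu = np.triu_indices(len(embs), k=1)       # upper triangle: each pair once
    return float(sims[iu].mean())
```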
Example with full pipeline
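A self-contained sketch of the full pipeline, assuming embeddings are already computed (function names and parameters are illustrative): retrieve the top-n candidates by cosine similarity, then apply MMR to pick k diverse results from that pool.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def retrieve_then_mmr(query_emb, doc_embs, n=20, k=5, lam=0.5):
    """Initial retrieval (top-n by cosine) followed by MMR re-selection of k docs."""
    q = normalize(np.asarray(query_emb, dtype=float))
    D = np.array([normalize(np.asarray(d, dtype=float)) for d in doc_embs])
    relevance = D @ q                             # cosine similarity to query
    top_n = np.argsort(relevance)[::-1][:n]       # stage 1: semantic retrieval
    selected = [int(top_n[0])]                    # stage 2: greedy MMR
    pool = [int(i) for i in top_n[1:]]
    while len(selected) < k and pool:
        scores = [
            lam * relevance[i]
            - (1 - lam) * max(float(D[i] @ D[j]) for j in selected)
            for i in pool
        ]
        best = pool[int(np.argmax(scores))]
        selected.append(best)
        pool.remove(best)
    return selected
```

Restricting MMR to the top-n pool keeps the quadratic redundancy checks cheap while still diversifying what the user actually sees.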
Performance characteristics
Time complexity
- First selection: O(n) to find most relevant
- Subsequent selections: O((k-1) × n) for k selections from n candidates
- Overall: O(k × n)
Space complexity
- O(n × d) for storing embeddings (n docs, d dimensions)
- Cosine similarity computed on-demand
Optimization tips
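One common optimization (a sketch, not necessarily this project's implementation): precompute the document-document similarity matrix once and keep a running maximum of redundancy, so each greedy step is a single vectorized pass instead of nested Python loops.

```python
import numpy as np

def mmr_precomputed(relevance, doc_sim, k, lam=0.5):
    """MMR using a precomputed doc-doc cosine similarity matrix.

    relevance: (n,) array of sim(d, query); doc_sim: (n, n) similarity matrix.
    Keeps a running max of similarity to the selected set, so each step is O(n).
    """
    relevance = np.asarray(relevance, dtype=float)
    n = len(relevance)
    selected = [int(np.argmax(relevance))]
    mask = np.ones(n, dtype=bool)
    mask[selected[0]] = False
    max_red = doc_sim[selected[0]].copy()          # running redundancy per candidate
    while len(selected) < k and mask.any():
        scores = np.where(mask, lam * relevance - (1 - lam) * max_red, -np.inf)
        best = int(np.argmax(scores))
        selected.append(best)
        mask[best] = False
        max_red = np.maximum(max_red, doc_sim[best])
    return selected
```

The running-max trick is what keeps the overall cost at the O(k × n) similarity comparisons stated above, rather than O(k² × n) from recomputing redundancy against every selected document each round.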
Comparison with other diversity methods
| Method | Approach | Speed | Use case |
|---|---|---|---|
| MMR | Query-aware greedy selection | Fast | General diversity with relevance |
| Clustering | K-means + sampling | Moderate | Topic coverage |
| Threshold filtering | Similarity cutoff | Fastest | Simple deduplication |
| Graph-based | Community detection | Slow | Complex relationships |
Integration with diversity filtering
MMR is one of the diversity methods available in the diversity filtering pipeline.
Related features
Diversity filtering
Complete diversity pipeline with MMR and clustering
Semantic search
Initial retrieval before MMR
Reranking
Cross-encoder scoring alternative
Contextual compression
Reduce retrieved context