How it works
Hybrid search combines dense semantic embeddings with sparse lexical embeddings for enhanced retrieval quality.

Search process
- Dual embedding - Query is embedded using both dense and sparse embedders
- Parallel retrieval - Dense embedding captures semantic query intent, sparse embedding captures exact term matches
- Score fusion - Results are fused using RRF or weighted scoring: `alpha * dense_score + (1 - alpha) * sparse_score`
- Ranked results - Documents are ranked by the fused score
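The weighted-scoring step above can be sketched as follows (a minimal illustration, assuming `dense_scores` and `sparse_scores` are similarity scores keyed by document ID):

```python
def weighted_fusion(dense_scores, sparse_scores, alpha=0.5):
    """Fuse dense and sparse scores: alpha * dense + (1 - alpha) * sparse.

    A document missing from one source contributes 0 for that component.
    """
    doc_ids = set(dense_scores) | set(sparse_scores)
    fused = {
        doc_id: alpha * dense_scores.get(doc_id, 0.0)
        + (1 - alpha) * sparse_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Rank documents by fused score, highest first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Note that this assumes both sources produce scores on comparable scales; when they do not, RRF (below in the source material) sidesteps the problem by fusing ranks instead of raw scores.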
Fusion mechanisms
Reciprocal Rank Fusion (RRF)
RRF combines rankings from multiple sources by scoring each document as `sum(1 / (k + rank_i))` across the source rankings, where `rank_i` is the document's rank in source `i` and `k` is a smoothing constant (default: 60).
Alpha weighting
The `alpha` parameter controls the dense/sparse balance:
- `alpha = 1.0` - Pure dense/semantic search
- `alpha = 0.0` - Pure sparse/keyword search
- `alpha = 0.5` - Balanced hybrid (default)
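A minimal RRF sketch (assuming each source returns a list of document IDs ordered best-first):

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)) across sources.

    rankings: list of lists of doc IDs, best-first; ranks are 1-based.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because RRF only looks at ranks, it needs no score normalization; a larger `k` flattens the contribution of top ranks and tends to increase result diversity.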
Key features
- Dense + sparse fusion with configurable weights
- RRF handles score normalization automatically
- Built-in evaluation metrics: Recall@k, MRR, NDCG, Precision@k
- No FastEmbed dependency; uses native SentenceTransformers sparse encoders
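As an illustration of the built-in metrics listed above, MRR (mean reciprocal rank) can be computed as follows (a minimal sketch, not the library's own implementation):

```python
def mean_reciprocal_rank(results, relevant):
    """results: one ranked list of doc IDs per query, best-first.
    relevant: one set of relevant doc IDs per query.
    MRR = mean over queries of 1 / rank of the first relevant hit
    (a query contributes 0 if no relevant document is retrieved).
    """
    total = 0.0
    for ranking, rel in zip(results, relevant):
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(results)
```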
Implementation
Configuration
Required settings
- Pinecone API authentication key
- Target index name for hybrid search

Optional settings
- Fusion weight `alpha` (0.0 = sparse only, 1.0 = dense only, 0.5 = balanced hybrid)
- Namespace within the index for document isolation
- Dense embedder configuration for semantic vector generation
- Sparse embedder configuration for lexical vectors
Example configuration
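A hypothetical configuration sketch (all key names and model IDs below are illustrative, not a guaranteed schema; consult your integration's reference for the exact fields):

```yaml
pinecone:
  api_key: ${PINECONE_API_KEY}
  index_name: my-hybrid-index
  namespace: docs              # optional: isolate documents per namespace
  alpha: 0.5                   # 0.0 = sparse only, 1.0 = dense only
  dense_embedder:
    model: sentence-transformers/all-MiniLM-L6-v2
  sparse_embedder:
    model: naver/splade-cocondenser-ensembledistil
```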
Search parameters
- Query text, embedded with both the dense and sparse embedders
- Maximum number of results to return (`top_k`)
- Optional metadata filters for pre-filtering candidates
When to use hybrid search
Hybrid search excels when you need both:

Semantic understanding (dense)
- Understanding query intent
- Matching synonyms and paraphrases
- Conceptual similarity

Exact term matching (sparse)
- Product SKUs or model numbers
- Technical specifications
- Legal or medical terminology
- Proper nouns and acronyms
Sparse vector format
Sparse embeddings use SPLADE models to create sparse vectors that emphasize specific terms, similar to traditional BM25 but with learned term importance. Sparse embeddings are represented as dictionaries mapping token indices to importance weights.

Database-specific implementations
Pinecone
Pinecone requires a separate `sparse_values` field for sparse embeddings, distinct from the standard `values` field used for dense vectors. Native fusion is controlled with the `alpha` parameter.
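A sketch of preparing a hybrid query in this format, assuming the sparse embedder returns a dictionary mapping token indices to weights. The alpha handling here follows the common convention of scaling the dense vector by `alpha` and the sparse weights by `1 - alpha` before a dot-product search; verify against your Pinecone index configuration:

```python
def to_sparse_values(sparse_embedding):
    """Convert {token_index: weight} into the indices/values layout
    used by Pinecone's sparse_values / sparse_vector fields."""
    indices = sorted(sparse_embedding)
    return {"indices": indices,
            "values": [sparse_embedding[i] for i in indices]}

def hybrid_query_params(dense_vector, sparse_embedding, alpha=0.5):
    """Scale dense by alpha and sparse by (1 - alpha) so a dot-product
    index effectively scores alpha * dense + (1 - alpha) * sparse."""
    sparse = to_sparse_values(sparse_embedding)
    return {
        "vector": [alpha * v for v in dense_vector],
        "sparse_vector": {
            "indices": sparse["indices"],
            "values": [(1 - alpha) * v for v in sparse["values"]],
        },
    }
```

The returned dictionary matches the keyword names of Pinecone's query API (`vector`, `sparse_vector`), so it can be passed as `index.query(**params, top_k=10)`.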
Weaviate
Weaviate uses native BM25 for the sparse component, with no external sparse embeddings, and supports hybrid search with configurable fusion weights.

Milvus
Milvus supports hybrid search with partition-based isolation and configurable fusion strategies.

Qdrant
Qdrant uses payload-based filtering with optimized indexes for hybrid retrieval.

Chroma
Chroma provides flexible hybrid search with tenant and database scoping.

Fusion strategy comparison
| Strategy | Best for | Parameters |
|---|---|---|
| RRF | Default choice, score normalization | k (default: 60) |
| Alpha weighting | Known source reliability | alpha (0.0-1.0) |
| Weighted merge | Prior knowledge of source quality | weights per source |
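The weighted-merge strategy from the table generalizes alpha weighting to any number of sources; a minimal sketch (assuming one score map and one fixed weight per source):

```python
def weighted_merge(score_maps, weights):
    """Merge per-source score maps using a fixed weight per source.

    score_maps: list of {doc_id: score} dicts, one per source.
    weights: list of floats, same length as score_maps.
    """
    assert len(score_maps) == len(weights)
    merged = {}
    for scores, weight in zip(score_maps, weights):
        for doc_id, score in scores.items():
            merged[doc_id] = merged.get(doc_id, 0.0) + weight * score
    # Highest merged score first
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

With two sources and weights `[alpha, 1 - alpha]` this reduces to the alpha-weighting formula above.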
Performance tips
- Over-fetch candidates (2-3x top_k) before fusion for better quality
- Use metadata filters to reduce search space before hybrid scoring
- Cache sparse embeddings for repeated queries to reduce latency
- Monitor RRF k parameter impact on result diversity
Related features
Semantic search
Dense vector similarity search
Sparse search
Keyword/lexical matching with SPLADE/BM25
Reranking
Cross-encoder second-stage scoring
Diversity filtering
Post-retrieval redundancy reduction