Overview
Zvec provides a flexible reranking framework through the RerankFunction base class and built-in implementations:
- RrfReRanker: Reciprocal Rank Fusion (position-based)
- WeightedReRanker: Weighted score combination
- Custom rerankers: Model-based or domain-specific
RerankFunction Base Class
All rerankers inherit from RerankFunction.
Parameters
- topn (int): Number of documents to return after reranking
- rerank_field (str, optional): Field name for reranking input (model-based rerankers)
RrfReRanker (Reciprocal Rank Fusion)
Overview
RRF combines results from multiple vector queries using position-based scoring. It’s effective when relevance scores across different vector fields aren’t directly comparable.
How It Works
For each document at rank r in a result list, RRF assigns a score:
score(d) = sum over result lists of 1 / (k + r + 1)
where:
- k is the rank constant (smoothing parameter)
- r is the zero-based rank position
For example, with k = 60, a document ranked first (r = 0) in two result lists receives 1/61 + 1/61 ≈ 0.033.
Configuration
Parameters
topn (int)
Number of top documents to return after reranking.
Default: 10
rank_constant (int)
Smoothing constant k in the RRF formula.
- Higher values: Reduce impact of rank position differences
- Lower values: Emphasize early ranks more strongly
Default: 60
Range: typically 10-100
Example Usage
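The exact zvec invocation isn’t reproduced here, so the standalone sketch below implements the fusion math itself. The document IDs and result lists are hypothetical; in real code the RrfReRanker instance would be passed to a multi-vector query rather than called directly.

```python
# Standalone sketch of Reciprocal Rank Fusion matching the formula above:
# each document accumulates 1 / (rank_constant + r + 1) per result list.

def rrf_fuse(result_lists, rank_constant=60, topn=10):
    scores = {}
    for results in result_lists:
        for r, doc_id in enumerate(results):  # r is the zero-based rank
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + r + 1)
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# Hypothetical results from two vector fields (e.g. text and image embeddings):
text_hits = ["doc_a", "doc_b", "doc_c"]
image_hits = ["doc_b", "doc_d", "doc_a"]
print(rrf_fuse([text_hits, image_hits], rank_constant=60, topn=3))
# → ['doc_b', 'doc_a', 'doc_d']
```

Note that doc_b outranks doc_a even though doc_a appears first in one list: RRF rewards consistently high positions across lists.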
When to Use RRF
- Multi-modal search: Combining text, image, audio embeddings
- Hybrid retrieval: Dense + sparse vectors
- Cross-lingual search: Different language embeddings
- Domain fusion: Combining domain-specific embedding spaces
Advantages
- Score-independent: Doesn’t require normalized or comparable scores
- Robust: Works well across different embedding types
- Simple: No tuning beyond rank_constant
Disadvantages
- Position-only: Ignores actual relevance scores
- Equal weighting: All fields contribute equally (unless using weighted variant)
WeightedReRanker
Overview
WeightedReRanker combines scores from multiple vector fields using configurable weights. Scores are normalized based on the distance metric, then weighted and summed.
Configuration
Parameters
topn (int)
Number of documents to return.
Default: 10
metric (MetricType)
Distance metric used for score normalization.
Supported:
- MetricType.L2: Euclidean distance
- MetricType.IP: Inner product
- MetricType.COSINE: Cosine similarity
Default: MetricType.L2
weights (dict[str, float])
Weight per vector field. Fields not specified default to weight 1.0.
Default: None (all fields weighted equally)
Score Normalization
WeightedReRanker normalizes scores to [0, 1] based on the metric type.
Example Usage
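The sketch below shows normalization followed by weighted summation. The normalization formulas are common choices, not necessarily the exact ones zvec applies, and the field names and scores are hypothetical.

```python
import math

# Sketch of weighted score fusion. Assumed normalizations (not confirmed
# zvec internals): L2 distance via 1/(1 + d), unbounded inner product via
# an arctan squash, cosine similarity via (1 + s)/2.

def normalize(score, metric):
    if metric == "L2":       # smaller distance -> higher normalized score
        return 1.0 / (1.0 + score)
    if metric == "IP":       # maps (-inf, inf) into (0, 1)
        return 0.5 + math.atan(score) / math.pi
    if metric == "COSINE":   # maps [-1, 1] onto [0, 1]
        return (1.0 + score) / 2.0
    raise ValueError(f"unknown metric: {metric}")

def weighted_fuse(field_results, metric, weights=None, topn=10):
    """field_results: {field: {doc_id: raw_score}}; weights: {field: float}."""
    weights = weights or {}
    fused = {}
    for field, hits in field_results.items():
        w = weights.get(field, 1.0)  # unspecified fields default to 1.0
        for doc_id, raw in hits.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + w * normalize(raw, metric)
    return sorted(fused, key=fused.get, reverse=True)[:topn]

results = {
    "title_vec":   {"doc_a": 0.9, "doc_b": 0.4},
    "content_vec": {"doc_b": 0.8, "doc_a": 0.2},
}
print(weighted_fuse(results, "COSINE", {"title_vec": 2.0, "content_vec": 1.0}, topn=2))
# → ['doc_a', 'doc_b']
```

Here the 2.0 weight on title_vec lets doc_a’s strong title match outweigh doc_b’s stronger content match, illustrating a Title > Content hierarchy.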
When to Use WeightedReRanker
- Known importance hierarchy: Title > Content > Metadata
- Score-based fusion: When raw scores are meaningful
- Fine-tuned weights: Domain-specific weighting strategies
Advantages
- Flexible weighting: Control contribution of each field
- Score-aware: Utilizes actual relevance scores
- Interpretable: Clear how each field contributes
Disadvantages
- Requires tuning: Optimal weights depend on data and use case
- Metric-dependent: Must specify correct normalization metric
- Score calibration: Assumes scores are reasonably calibrated
Model-Based Reranking
Custom Reranker Implementation
For advanced scenarios, implement a custom reranker using a neural model.
Usage
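A model-based reranker scores each (query, document) pair directly. In the sketch below, `score_pair` is a stand-in for a neural model call (e.g. a cross-encoder) so the example stays self-contained; the class name, method names, and documents are assumptions, and real code would subclass zvec’s RerankFunction.

```python
# Sketch of a model-based reranker. A production implementation would call
# a cross-encoder inside score_pair; the token-overlap stub below keeps the
# example runnable without model dependencies.

class ModelReranker:
    def __init__(self, topn=10, rerank_field="text"):
        self.topn = topn
        self.rerank_field = rerank_field

    def score_pair(self, query, passage):
        # Stand-in for a neural relevance model: fraction of query tokens
        # that also appear in the passage.
        q = set(query.lower().split())
        p = set(passage.lower().split())
        return len(q & p) / max(len(q), 1)

    def rerank(self, query, documents):
        scored = sorted(
            documents,
            key=lambda d: self.score_pair(query, d.get(self.rerank_field, "")),
            reverse=True,
        )
        return scored[: self.topn]

docs = [
    {"text": "reciprocal rank fusion"},
    {"text": "weighted score fusion for search"},
]
reranker = ModelReranker(topn=1)
print(reranker.rerank("weighted search fusion", docs))
```

Swapping the stub for a real model changes only `score_pair`; the topn truncation and field selection stay the same.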
Comparison Table
| Reranker | Use Case | Input | Complexity | Accuracy |
|---|---|---|---|---|
| RrfReRanker | Multi-modal, hybrid | Rank positions | Low | Medium |
| WeightedReRanker | Known field importance | Scores + weights | Medium | Medium-High |
| Model-Based | Maximum accuracy | Query + doc text | High | Very High |
Best Practices
1. Over-Fetch Then Rerank
Retrieve more candidates than the final topn (for example, fetch 100 candidates to return a top 10), then rerank.
2. Tune Weights with Validation Data
For WeightedReRanker, optimize weights on labeled data.
3. Use RRF as Baseline
Start with RrfReRanker, then explore weighted or model-based reranking.
4. Combine with Quantization
Use aggressive quantization for fast retrieval, then rerank.
See Also
- Multi-Vector Queries - Query multiple vector fields
- Index Types - Choose the right index
- Performance Tuning - Optimize reranking speed