# Using Re-ranking in the Agent Pipeline

## Configuration

### Threshold
The threshold (0-10) filters out low-relevance chunks:
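As a minimal sketch, the threshold might be set like this (the option name and call shape are assumptions; only the 0-10 scale comes from this page):

```elixir
# Assumed :threshold option; chunks the LLM scores below 7 are dropped.
reranked = Arcana.Agent.rerank(chunks, query, threshold: 7)
```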
### Custom Prompt

Customize how the LLM scores relevance:
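A custom scoring prompt might be supplied as an option; the `:prompt` option name and the placeholder syntax here are assumptions, not confirmed API:

```elixir
# Hypothetical :prompt option; placeholders are illustrative only.
reranked =
  Arcana.Agent.rerank(chunks, query,
    prompt: """
    Rate from 0 to 10 how well this chunk answers the question.
    Question: {query}
    Chunk: {chunk}
    Reply with a single number.
    """
  )
```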
## Built-in Rerankers

### Arcana.Agent.Reranker.LLM (Default)
Uses your LLM to score each chunk:
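A minimal illustrative call (variable names are placeholders):

```elixir
# Scores each chunk with your configured LLM and returns them reordered.
reranked = Arcana.Agent.rerank(chunks, query)
```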
This is the default when you call `Arcana.Agent.rerank/2`.
### Arcana.Agent.Reranker.ColBERT
ColBERT-style neural reranking using per-token embeddings and MaxSim scoring. It provides more nuanced relevance scoring than single-vector methods by matching individual query tokens to document tokens. Add the optional dependency to your `mix.exs`:
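A sketch of the dependency entry; the package name `:stephen` is inferred from the encoder option described below and the version requirement is a placeholder, so verify both on Hex:

```elixir
# mix.exs (illustrative; check Hex for the actual package name and release)
defp deps do
  [
    {:stephen, "~> 0.1"}
  ]
end
```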
Options:

- `:encoder` - Pre-loaded Stephen encoder (loads the default encoder on first use if not provided)
- `:threshold` - Minimum score to keep (default: `0.0`)
- `:top_k` - Maximum number of results to return
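Putting the options together, a hypothetical call might look like this (how the reranker module is selected is an assumption; the option names come from the list above):

```elixir
# Assumed wiring: reranker module passed via a :reranker option.
reranked =
  Arcana.Agent.rerank(chunks, query,
    reranker: Arcana.Agent.Reranker.ColBERT,
    threshold: 0.5,
    top_k: 5
  )
```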
ColBERT is a good fit when:

- You need high-quality reranking without LLM latency or cost
- Semantic nuance matters (e.g., technical documentation)
- You want deterministic, reproducible scores
| Aspect | ColBERT | LLM |
|---|---|---|
| Latency | Fast (local inference) | Slow (API call per chunk) |
| Cost | Free after model load | Per-token API cost |
| Quality | Excellent for semantic similarity | Can understand complex relevance |
| Customization | Fixed model behavior | Custom prompts |
## Custom Rerankers

### Implementing the Behaviour
Create a custom reranker by implementing the `Arcana.Agent.Reranker` behaviour:
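A sketch of a custom reranker; the callback name, signature, and chunk fields are assumptions, so check the `Arcana.Agent.Reranker` docs for the real behaviour:

```elixir
defmodule MyApp.KeywordReranker do
  # Hypothetical implementation: ranks chunks by how many query terms
  # appear in their text. The callback shape is assumed.
  @behaviour Arcana.Agent.Reranker

  @impl true
  def rerank(chunks, query, _opts) do
    terms = query |> String.downcase() |> String.split()

    Enum.sort_by(chunks, fn chunk ->
      text = String.downcase(chunk.text)
      # Negate so chunks matching more terms sort first.
      -Enum.count(terms, &String.contains?(text, &1))
    end)
  end
end
```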
### Inline Function
For simple cases, pass a function directly:
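For example, an anonymous function that drops low-scoring chunks; how the function is plugged into the pipeline, and the chunk's `:score` field, are assumptions:

```elixir
# Hypothetical inline reranker: keep only chunks scoring 0.5 or higher.
reranker = fn chunks, _query ->
  Enum.filter(chunks, &(&1.score >= 0.5))
end
```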
## When to Use Re-ranking

### When re-ranking helps
Re-ranking is most valuable when:
- Your initial search returns many marginally relevant results
- Answer quality suffers from irrelevant context
- You have compute budget for the extra LLM calls (one per chunk)
### When to skip re-ranking
Skip re-ranking when:
- Search already returns highly relevant results
- Latency is critical (adds one LLM call per chunk)
- You're using a very small result set (`limit: 3` or less)
## Telemetry
Re-ranking emits telemetry events:
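For illustration, a handler could be attached like this; the event name below is an assumption, so consult the Telemetry guide for the events Arcana actually emits:

```elixir
# :telemetry.attach/4 from the standard telemetry library; event name assumed.
:telemetry.attach(
  "rerank-logger",
  [:arcana, :agent, :rerank, :stop],
  fn _event, measurements, _metadata, _config ->
    IO.inspect(measurements, label: "rerank measurements")
  end,
  nil
)
```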
## Next Steps

- **Agentic RAG** - Build sophisticated pipelines with re-ranking
- **Evaluation** - Measure how re-ranking improves retrieval quality
- **Search Algorithms** - Understand the initial retrieval stage
- **Telemetry** - Monitor re-ranking performance