Overview
Graph search in GraphRAG combines vector similarity and graph traversal to retrieve more relevant context than traditional RAG. The core technique is Reciprocal Rank Fusion (RRF), a method for merging ranked lists from multiple sources.
How Graph Search Works
Complete Pipeline
lib/arcana/graph/fusion_search.ex:129
Graph Search (Without Vector)
Pure graph-based retrieval without vector search:
How it works:
- Find matching entities in the graph by name
- Traverse relationships up to :depth hops
- Collect chunks connected to discovered entities
- Return unique chunks containing relevant entities
lib/arcana/graph/fusion_search.ex:100
Example: 2-hop traversal
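The four steps listed under "How it works" can be sketched on a toy in-memory graph. This is a minimal illustration, not the code in fusion_search.ex: the module name GraphSearchSketch, the entity names, the edges, and the chunk ids are all invented for the example, and name matching is assumed to be a simple case-insensitive substring check.

```elixir
defmodule GraphSearchSketch do
  # Hypothetical toy data: entity => neighbors, and entity => chunks that mention it.
  @edges %{
    "OpenAI" => ["Sam Altman", "Microsoft"],
    "Sam Altman" => ["OpenAI", "Y Combinator"],
    "Microsoft" => ["OpenAI"],
    "Y Combinator" => ["Sam Altman"]
  }

  @chunks_by_entity %{
    "OpenAI" => [:chunk_1, :chunk_2],
    "Sam Altman" => [:chunk_2, :chunk_3],
    "Microsoft" => [:chunk_4],
    "Y Combinator" => [:chunk_5]
  }

  def search(query, depth) do
    query
    |> find_entities_by_name()
    |> expand(depth)
    |> Enum.flat_map(&Map.get(@chunks_by_entity, &1, []))
    |> Enum.uniq()
  end

  # Step 1: keep entities whose name appears in the query (case-insensitive).
  defp find_entities_by_name(query) do
    downcased = String.downcase(query)

    @edges
    |> Map.keys()
    |> Enum.filter(&String.contains?(downcased, String.downcase(&1)))
  end

  # Step 2: add direct neighbors once per hop, `depth` times.
  defp expand(entities, depth) when depth <= 0, do: entities

  defp expand(entities, depth) do
    neighbors = Enum.flat_map(entities, &Map.get(@edges, &1, []))
    expand(Enum.uniq(entities ++ neighbors), depth - 1)
  end
end

# Steps 3 and 4 happen in search/2: collect chunks, then de-duplicate.
IO.inspect(GraphSearchSketch.search("What is Microsoft connected to?", 2))
```

With depth 2 the traversal reaches Microsoft, OpenAI, and Sam Altman, so chunks 4, 1, 2, and 3 are returned. Y Combinator is three hops away and its chunk is left out.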
Fusion Search (Vector + Graph)
Combines vector search and graph search using RRF:
lib/arcana/graph/fusion_search.ex:142
Options
:depth (integer, default: 1)
- How many relationship hops to traverse
- Higher depth = more entities, broader context
- Typical range: 1-3
:limit (integer, default: 10)
- Maximum number of results to return
- Final results after RRF fusion
:k (integer, default: 60)
- RRF constant that dampens the advantage of top-ranked items
- Higher k = more balanced fusion
- Lower k = favor top-ranked items
- Typical range: 10-100
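To make the effect of :k concrete, the sketch below compares how much a rank-1 hit contributes relative to a rank-10 hit for a few values of k. It assumes the standard RRF contribution of 1 / (k + rank) per list; the module and function names are hypothetical and not part of Arcana.

```elixir
defmodule RRFConstantSketch do
  # Contribution of one ranked list to a document's RRF score.
  def contribution(rank, k), do: 1 / (k + rank)

  # How many times more a rank-1 hit contributes than a rank-10 hit.
  def top_rank_advantage(k), do: contribution(1, k) / contribution(10, k)
end

for k <- [10, 60, 100] do
  advantage = Float.round(RRFConstantSketch.top_rank_advantage(k), 2)
  IO.puts("k=#{k}: rank 1 counts #{advantage}x as much as rank 10")
end
```

With k = 10 the top hit counts about 1.82 times as much as the tenth; with k = 60 that shrinks to about 1.15, and with k = 100 to about 1.09. This is the sense in which a higher k gives a more balanced fusion.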
Reciprocal Rank Fusion (RRF)
What is RRF?
RRF is a rank aggregation method that combines multiple ranked lists into a single ranking. Formula:
score(document) = Σ_i 1 / (k + rank(document, list_i))
- k is a constant (default: 60)
- rank(document, list_i) is the position in list i (1-based)
- Sum across all lists containing the document
Example Calculation
lib/arcana/graph/fusion_search.ex:42
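As a worked illustration, take two ranked lists, [a, b, c] from one search and [b, d, a] from another, with k = 60. Then b scores 1/62 + 1/61 ≈ 0.0325, a scores 1/61 + 1/63 ≈ 0.0323, d scores 1/62 ≈ 0.0161, and c scores 1/63 ≈ 0.0159, so the fused order is b, a, d, c. The sketch below computes the same thing; it is a minimal version written for this page under the standard RRF definition, not the code at the referenced source location, and the name RRFSketch.fuse/2 is hypothetical.

```elixir
defmodule RRFSketch do
  # Fuses ranked lists of ids; returns {id, score} pairs, best first.
  def fuse(ranked_lists, k \\ 60) do
    ranked_lists
    |> Enum.flat_map(fn list ->
      # Ranks are 1-based, so the first item contributes 1 / (k + 1).
      list
      |> Enum.with_index(1)
      |> Enum.map(fn {id, rank} -> {id, 1 / (k + rank)} end)
    end)
    |> Enum.reduce(%{}, fn {id, score}, acc ->
      # Sum contributions for ids that appear in more than one list.
      Map.update(acc, id, score, &(&1 + score))
    end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end
end

vector_results = [:a, :b, :c]
graph_results = [:b, :d, :a]

[vector_results, graph_results]
|> RRFSketch.fuse()
|> Enum.each(fn {id, score} -> IO.puts("#{id}: #{Float.round(score, 4)}") end)
```

Items that appear in both lists (a and b) end up ahead of items that appear in only one, even though d outranks a in the second list.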
Why RRF Works
✅ Promotes agreement - Documents in multiple lists score higher
✅ Robust to outliers - Bad ranking in one list doesn’t eliminate a document
✅ No score normalization - Works with any ranking method (no need to normalize scores)
✅ Simple & effective - Beats weighted averaging in most benchmarks
Graph Traversal
Find related entities by following relationships:
lib/arcana/graph/graph_query.ex:130
Traversal Algorithm
lib/arcana/graph/graph_query.ex:193
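A depth-limited breadth-first traversal can be sketched as follows. This is an illustration of the general technique rather than the code at the referenced location: it assumes the graph is available as an in-memory adjacency map (entity => list of neighbors), and the module name TraversalSketch is hypothetical.

```elixir
defmodule TraversalSketch do
  # Returns a map of entity => hop distance for everything within `max_depth`.
  def bfs(adjacency, seeds, max_depth) do
    visited = Map.new(seeds, &{&1, 0})
    walk(adjacency, seeds, visited, 1, max_depth)
  end

  # Stop when the frontier is empty or the depth limit is exceeded.
  defp walk(_adjacency, [], visited, _depth, _max_depth), do: visited

  defp walk(_adjacency, _frontier, visited, depth, max_depth) when depth > max_depth,
    do: visited

  defp walk(adjacency, frontier, visited, depth, max_depth) do
    # The next frontier is every unvisited neighbor of the current one.
    next =
      frontier
      |> Enum.flat_map(&Map.get(adjacency, &1, []))
      |> Enum.uniq()
      |> Enum.reject(&Map.has_key?(visited, &1))

    visited = Enum.reduce(next, visited, &Map.put(&2, &1, depth))
    walk(adjacency, next, visited, depth + 1, max_depth)
  end
end

adjacency = %{a: [:b, :c], b: [:d], c: [:d], d: [:e]}
IO.inspect(TraversalSketch.bfs(adjacency, [:a], 2))
```

The visited map stops cycles from being re-expanded and records the hop at which each entity was first reached. In the example, e is three hops from a and is therefore not returned at depth 2.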
Finding Entities
By Name
lib/arcana/graph/graph_query.ex:76
By Embedding
Find entities similar to a query embedding:
lib/arcana/graph/graph_query.ex:102
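The idea of matching entities by embedding can be sketched with plain cosine similarity over in-memory vectors. How Arcana actually stores and compares entity embeddings is not shown on this page, so everything here (the module name, the {name, embedding} tuple shape, the two-dimensional vectors) is an assumption made for illustration. The sketch also assumes no vector is all zeros.

```elixir
defmodule EmbeddingMatchSketch do
  # Cosine similarity between two equal-length lists of floats.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    dot / (norm(a) * norm(b))
  end

  defp norm(v), do: v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()

  # `entities` is a list of {name, embedding}; returns the top `limit`
  # {name, similarity} pairs, most similar first.
  def nearest(entities, query_embedding, limit) do
    entities
    |> Enum.map(fn {name, embedding} -> {name, cosine(embedding, query_embedding)} end)
    |> Enum.sort_by(fn {_name, score} -> score end, :desc)
    |> Enum.take(limit)
  end
end

entities = [
  {"OpenAI", [1.0, 0.0]},
  {"Microsoft", [0.6, 0.8]},
  {"Caching", [0.0, 1.0]}
]

IO.inspect(EmbeddingMatchSketch.nearest(entities, [1.0, 0.1], 2))
```

Unlike name matching, this can surface entities that the query never mentions literally, at the cost of computing or looking up a similarity for each candidate.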
Getting Chunks for Entities
Retrieve all chunks mentioning specific entities:
lib/arcana/graph/graph_query.ex:144
Real Examples from Source
Example 1: RRF Implementation
From lib/arcana/graph/fusion_search.ex:57:
Example 2: Graph Search
From lib/arcana/graph/fusion_search.ex:100:
Example 3: Fusion Search
From lib/arcana/graph/fusion_search.ex:142:
Example 4: BFS Traversal
From lib/arcana/graph/graph_query.ex:193:
Integration with Arcana
GraphRAG integrates seamlessly with standard Arcana operations:
Performance Considerations
Graph Search:
- Small graphs (< 100 entities): ~1-10ms
- Medium graphs (100-1000 entities): ~10-100ms
- Large graphs (1000-10000 entities): ~100-1000ms
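- Depth impact: each extra hop can multiply the frontier by the average degree, so worst-case work grows roughly as O(avg_degree^depth), capped by the size of the graph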
RRF Fusion:
- Very fast: O(n log n) where n = total unique documents
- Typical: ~1-5ms for 20-100 documents
Optimization Tips:
- Cache graph structure - Build once, query many times
- Index adjacency lists - Use maps for O(1) neighbor lookup
- Limit depth - Depth 1-2 is usually sufficient
- Early termination - Stop traversal when enough chunks found
- Parallel execution - Run vector and graph search concurrently
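The parallel execution tip can be sketched with Elixir's Task module. The two search functions passed in are hypothetical stand-ins (zero-arity functions that each return a ranked list of chunk ids), and the sleeps only simulate latency; nothing here reflects how Arcana schedules its searches internally.

```elixir
defmodule ParallelSearchSketch do
  # `vector_search` and `graph_search` are hypothetical zero-arity functions
  # that each return a ranked list of chunk ids.
  def run(vector_search, graph_search, timeout \\ 5_000) do
    vector_task = Task.async(vector_search)
    graph_task = Task.async(graph_search)

    # Both tasks are already running, so awaiting them in sequence takes
    # roughly as long as the slower search rather than the sum of both.
    {Task.await(vector_task, timeout), Task.await(graph_task, timeout)}
  end
end

{vector_ranked, graph_ranked} =
  ParallelSearchSketch.run(
    fn ->
      Process.sleep(50)
      [:a, :b, :c]
    end,
    fn ->
      Process.sleep(50)
      [:b, :d]
    end
  )

IO.inspect({vector_ranked, graph_ranked})
```

The two ranked lists can then be passed to an RRF fusion step. Because fusion needs both lists, the overall latency is bounded below by the slower of the two searches.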
When to Use Graph Search
Best Use Cases
✅ Multi-hop questions
- “Who works at companies funded by Y Combinator?”
- Requires traversing: Person → Company → Investor
- “Everything about Sam Altman”
- Traverse all relationships from one entity
- “How is OpenAI connected to Microsoft?”
- Find shortest path between entities
- Technical docs with many named components
- Research papers with authors, institutions, citations
When Vector-Only is Better
❌ Abstract/semantic queries
- “What are best practices for caching?”
- No specific entities to anchor on
- “How do I configure logging?”
- No entities extracted, graph search returns nothing
- Creative writing, narratives
- Few named entities or relationships
Next Steps
- Entity Extraction - Configure entity extractors
- Relationships - Build the graph structure
- Communities - Use community summaries for global queries
- GraphRAG Overview - Understand the complete pipeline