Search Architecture
Search in Vespa involves coordination between the stateless container layer and the content nodes.
The Two-Phase Process
Vespa uses a two-phase search process to handle large-scale queries efficiently.
Why Two Phases?
Efficiency
Avoid expensive ranking calculations on documents that don’t match
Scalability
Distribute work across content nodes
Flexibility
Use different ranking strategies per phase
Performance
Rank only the most promising candidates
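The benefits above can be illustrated with a small sketch. This is not Vespa's implementation; the function name `two_phase_search`, its parameters, and the sample documents are all assumptions made for illustration. It shows the general idea: match first, score every match cheaply, then apply an expensive score only to the best candidates.

```python
import heapq

def two_phase_search(docs, matches, cheap_score, expensive_score, rerank_count, hits):
    """Match first, rank only what matched, and rerank only the best few."""
    # Matching: discard documents that do not satisfy the query.
    matched = [d for d in docs if matches(d)]
    # First-phase ranking: a cheap score computed for every matched document.
    candidates = heapq.nlargest(rerank_count, matched, key=cheap_score)
    # Second-phase ranking: an expensive score for the top candidates only.
    reranked = sorted(candidates, key=expensive_score, reverse=True)
    return reranked[:hits]

# Hypothetical documents and scoring functions.
docs = [
    {"id": 1, "text": "vespa search engine", "clicks": 10},
    {"id": 2, "text": "cooking recipes", "clicks": 99},
    {"id": 3, "text": "search at scale", "clicks": 50},
    {"id": 4, "text": "vector search", "clicks": 5},
]

result = two_phase_search(
    docs,
    matches=lambda d: "search" in d["text"].split(),
    cheap_score=lambda d: d["clicks"],
    expensive_score=lambda d: d["clicks"] * 2,
    rerank_count=2,
    hits=1,
)
```

Document 2 has the highest click count but never reaches ranking because it does not match, and document 4 is never passed to the expensive scorer because it falls outside the top two candidates.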
Matching Phase
The matching phase identifies which documents satisfy the query. This happens on each content node independently.
Query Types
Text Search
Full-text search using inverted indexes:
- Uses reverse indexes built during document indexing
- Supports stemming, phrase matching, and linguistic processing
- Implemented in searchlib
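To make the idea of a reverse (inverted) index concrete, here is a minimal sketch. It is not how searchlib stores its indexes; `build_inverted_index` and the sample documents are hypothetical, and tokenization is reduced to lowercasing and whitespace splitting, with no stemming.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased token to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical documents keyed by document id.
docs = {
    1: "Vespa is a search engine",
    2: "Search engines use inverted indexes",
    3: "Vespa supports vector search",
}
index = build_inverted_index(docs)
hits_for_search = index["search"]  # ids of documents containing "search"
hits_for_vespa = index["vespa"]    # ids of documents containing "vespa"
```

Because no stemming is applied in this sketch, "engine" and "engines" end up as separate terms. Linguistic processing at indexing time is what lets a real index treat them as the same word.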
Structured Queries
Filtering on attributes:
- Uses attribute (forward) indexes
- Fast numeric and boolean comparisons
- Supports range queries
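A forward index can be pictured as one array per attribute, indexed by local document id. The sketch below assumes that simplified layout; `range_filter` and the `price` and `in_stock` attributes are hypothetical names, and real attribute storage is more involved.

```python
def range_filter(attribute, low, high):
    """Return ids of documents whose attribute value lies in [low, high]."""
    return [doc_id for doc_id, value in enumerate(attribute) if low <= value <= high]

# Forward index: the position is the local document id, the value is the attribute value.
price = [120, 15, 60, 999, 45]
in_stock = [True, False, False, True, True]

# Numeric range comparison, followed by a boolean comparison on a second attribute.
in_range = range_filter(price, 40, 150)
available_in_range = [doc_id for doc_id in in_range if in_stock[doc_id]]
```

Each check is a direct array lookup followed by a comparison, with no tokenization or linguistic processing involved.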
Vector Search
Approximate nearest neighbor search:
- Uses HNSW index for efficient ANN
- Returns approximate top-k results
- Configurable accuracy vs speed tradeoff
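In YQL, approximate nearest neighbor search is expressed with the nearestNeighbor operator and a targetHits annotation. The helper below is a sketch that only builds the clause as a string; the function name, the field name `embedding`, and the query tensor name `q` are assumptions for illustration.

```python
def nearest_neighbor_clause(field, query_tensor, target_hits, approximate=True):
    """Build a YQL nearestNeighbor clause with its annotations."""
    annotations = [f"targetHits: {target_hits}"]
    if not approximate:
        # Request exact search instead of using the HNSW index.
        annotations.append("approximate: false")
    return "({" + ", ".join(annotations) + "}" + f"nearestNeighbor({field}, {query_tensor}))"

# Hypothetical field and query tensor names.
ann_clause = nearest_neighbor_clause("embedding", "q", 100)
exact_clause = nearest_neighbor_clause("embedding", "q", 100, approximate=False)
```

A larger targetHits value generally trades speed for a better chance of finding the true nearest neighbors.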
Hybrid Queries
Combine multiple query types:
- Combines text, structured, and vector search
- Efficient query execution with multiple indexes
Query Execution
Query execution is implemented in the searchlib module.
Key Components:
- Module: searchlib/src/vespa/searchlib/queryeval
- Search iterators for different query operators (AND, OR, NOT, etc.)
- Blueprint pattern for query optimization
- Lazy evaluation for efficiency
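The iterator and lazy-evaluation ideas can be sketched with Python generators. This is an illustration only, not the searchlib C++ API: `term`, `and_iter`, and `or_iter` are hypothetical names, posting lists are assumed to be sorted lists of document ids, and the blueprint optimization step is left out.

```python
def term(posting_list):
    """Leaf iterator: yields document ids in ascending order."""
    yield from posting_list

def and_iter(left, right):
    """Yield ids present in both children, advancing whichever side is behind."""
    l, r = next(left, None), next(right, None)
    while l is not None and r is not None:
        if l == r:
            yield l
            l, r = next(left, None), next(right, None)
        elif l < r:
            l = next(left, None)
        else:
            r = next(right, None)

def or_iter(left, right):
    """Yield ids present in either child, in order and without duplicates."""
    l, r = next(left, None), next(right, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l < r):
            yield l
            l = next(left, None)
        elif l is None or r < l:
            yield r
            r = next(right, None)
        else:  # both sides are at the same document id
            yield l
            l, r = next(left, None), next(right, None)

# Hypothetical posting lists for three terms.
vespa, engine, search = [1, 3, 5, 8], [2, 8], [3, 4, 8, 9]

# (vespa OR engine) AND search
tree = and_iter(or_iter(term(vespa), term(engine)), term(search))
first_hit = next(tree)  # pulls only as many ids as needed to find one hit
remaining = list(tree)
```

Nothing is evaluated until a hit is requested, which is what makes it cheap to stop early once enough results have been collected.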
Searchlib implements the core matching algorithms used by Proton (the content node server).
Query Language
Vespa supports two primary query languages:
YQL (Vespa Query Language)
SQL-like syntax for queries.
Simple Query Language
Simpler syntax for basic queries.
Matching Operators
Vespa provides various operators for text matching:
contains
Basic term matching.
phrase
Exact phrase matching.
near
Terms within a distance.
onear
Ordered terms within a distance.
equiv
Match any of several terms.
weakAnd
Efficient OR with many terms.
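The operators above might be written in YQL roughly as follows. The sketch keeps each where-clause as a Python string so it can be inspected or sent to the query API; the field name `title`, the search terms, and the `distance` and `targetHits` values are hypothetical.

```python
FIELD = "title"  # hypothetical field name

operator_examples = {
    "contains": f'{FIELD} contains "vespa"',
    "phrase": f'{FIELD} contains phrase("machine", "learning")',
    "near": f'{FIELD} contains ({{distance: 5}}near("machine", "learning"))',
    "onear": f'{FIELD} contains ({{distance: 5}}onear("machine", "learning"))',
    "equiv": f'{FIELD} contains equiv("tv", "television")',
    "weakAnd": f'({{targetHits: 100}}weakAnd({FIELD} contains "machine", {FIELD} contains "learning"))',
}

def to_yql(where_clause, source="sources *"):
    """Wrap a where-clause in a complete YQL select statement."""
    return f"select * from {source} where {where_clause}"

near_query = to_yql(operator_examples["near"])
```

The annotations in curly braces, such as `distance` and `targetHits`, tune how the operator that follows them behaves.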
Dispatching and Distribution
The container layer coordinates query execution across content nodes.
Scatter-Gather Pattern
Implementation: the container-search module handles query dispatch and result aggregation.
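The scatter-gather pattern can be sketched as follows. This is a toy model, not the container-search implementation: `search_node` and `scatter_gather` are hypothetical names, each content node is represented by a plain dictionary, and relevance scores are precomputed.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def search_node(node_docs, term, k):
    """Stand-in for one content node: match locally and return its own top k."""
    matched = [
        (score, doc_id)
        for doc_id, (text, score) in node_docs.items()
        if term in text.split()
    ]
    return heapq.nlargest(k, matched)

def scatter_gather(nodes, term, k):
    """Send the query to every node in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda docs: search_node(docs, term, k), nodes))
    merged = heapq.nlargest(k, (hit for partial in partials for hit in partial))
    return [doc_id for _score, doc_id in merged]

# Two hypothetical content nodes, each holding a partition of the documents.
nodes = [
    {"a": ("vespa search", 0.9), "b": ("cooking", 0.8)},
    {"c": ("search engine", 0.7), "d": ("vector search", 0.95)},
]
top = scatter_gather(nodes, "search", k=2)
```

Each node returns only its own top k hits, so the merge step handles at most k hits per node regardless of how many documents matched in total.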
Search Performance
Index Types and Performance
Reverse Index
Fast text search on indexed fields
Attribute Index
Fast filtering and ranking on attributes
HNSW Index
Fast approximate nearest neighbor search
B-tree Index
Fast-search on string attributes
Query Optimization
Query Rewriting
Vespa optimizes queries before execution:
- Combining similar terms
- Eliminating redundant clauses
- Choosing optimal execution order
Early Termination
Stop searching when enough results are found:
- Set the hits parameter for top-k queries
- Use targetHits for approximate search
- Combine with ranking thresholds
Parallel Execution
Leverage multiple content nodes:
- Data is automatically partitioned
- Queries execute in parallel
- Capacity scales close to linearly as content nodes are added
Filtering vs Searching
Understanding the difference is key to performance.
Filters
Filters use attributes for fast exact matching:
- Fast evaluation using forward indexes
- No text processing overhead
- Efficient for numeric and boolean comparisons
Search
Search uses reverse indexes for text matching:
- Linguistic processing (stemming, tokenization)
- Relevance scoring
- Phrase and proximity matching
Combined Queries
Best performance comes from combining both: filter on attributes to narrow the candidate set, and use text search for relevance.
Grouping and Aggregation
Vespa can group and aggregate results during search.
Search Implementation Architecture
Content Node (Proton)
The Proton server handles search on content nodes:
- Module: searchcore
- Manages document storage and indexes
- Executes matching and first-phase ranking
- Returns top results to container
Search Library
Core search algorithms:
- Module: searchlib
- Query evaluation and matching
- Index implementations (reverse, attribute, HNSW)
- Ranking framework (discussed in Ranking concepts)
Real-World Query Example
A complete hybrid search query does the following:
- Matches documents with “machine learning” text (BM25)
- Finds nearest neighbors in embedding space (ANN)
- Filters by publication date
- Ranks using a hybrid profile
- Returns top 20 results
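One way such a query might look is sketched below as a request body for the query API. The field names `embedding` and `published_date`, the query tensor name `q`, the rank profile name `hybrid`, the targetHits value, and the timestamp are all assumptions for illustration; BM25 scoring would be defined in the rank profile, not in the query itself.

```python
import json

def hybrid_query(text, embedding, published_after, hits=20, target_hits=100):
    """Build a query API request body combining text, vector, and filter clauses."""
    yql = (
        "select * from sources * where "
        f"(userQuery() or ({{targetHits: {target_hits}}}nearestNeighbor(embedding, q))) "
        f"and published_date > {published_after}"
    )
    return {
        "yql": yql,
        "query": text,                  # consumed by userQuery()
        "input.query(q)": embedding,    # query tensor for nearestNeighbor
        "ranking": "hybrid",            # hypothetical rank profile name
        "hits": hits,
    }

# Hypothetical embedding values and epoch timestamp.
body = hybrid_query("machine learning", [0.1, 0.2, 0.3], published_after=1672531200)
payload = json.dumps(body)
```

The text and vector clauses are joined with `or` so that a document can be retrieved by either signal, while the date filter applies to both.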
Best Practices
Use Attributes for Filters
Mark filter fields as attribute in the schema
Index Text Fields
Use index for fields you’ll search with text queries
Set targetHits
Control ANN search quality vs speed
Combine Query Types
Use hybrid queries for best relevance
Next Steps
Ranking
Learn how documents are scored
Schemas
Configure fields for search
Tensors
Use tensors for semantic search