Vespa search is a distributed, two-phase process that efficiently finds and ranks documents at scale. Understanding how search works helps you optimize performance and relevance.

Search Architecture

Search in Vespa involves coordination between the stateless container layer and the content nodes.

The Two-Phase Process

Vespa uses a two-phase search process to efficiently handle large-scale queries:

1. Matching Phase: find all documents matching the query criteria
2. Ranking Phase: score and sort the matched documents

Why Two Phases?

  • Efficiency: avoid expensive ranking calculations on documents that don't match
  • Scalability: distribute work across content nodes
  • Flexibility: use different ranking strategies per phase
  • Performance: rank only the most promising candidates
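The two-phase flow can be sketched as a toy model (illustrative Python only, not Vespa's implementation):

```python
import heapq

# Toy corpus: each document has a filterable attribute and a text field.
docs = [
    {"id": 1, "category": "phone", "text": "budget phone with great battery"},
    {"id": 2, "category": "laptop", "text": "gaming laptop"},
    {"id": 3, "category": "phone", "text": "flagship phone camera"},
]

def matches(doc, category):
    # Matching phase: cheap predicate evaluation, no scoring.
    return doc["category"] == category

def score(doc, terms):
    # Ranking phase: more expensive scoring, run only on matched docs.
    return sum(doc["text"].count(t) for t in terms)

def search(docs, category, terms, k=2):
    matched = [d for d in docs if matches(d, category)]                # phase 1
    return heapq.nlargest(k, matched, key=lambda d: score(d, terms))   # phase 2

top = search(docs, "phone", ["phone", "battery"])
print([d["id"] for d in top])  # → [1, 3]
```

Because scoring runs only over the matched subset, the expensive function is never evaluated for documents the filter already excluded.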

Matching Phase

The matching phase identifies which documents satisfy the query. This happens on each content node independently.

Query Types

Filtering on attributes:
select * from product where price < 100 and in_stock = true
  • Uses attribute (forward) indexes
  • Fast numeric and boolean comparisons
  • Supports range queries
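Attribute filtering can be pictured as a tight scan over in-memory value arrays, one per attribute, indexed by local document id (a toy model, not Vespa's actual data structures):

```python
# Toy forward index: one in-memory array per attribute.
price = [79.0, 129.0, 45.0, 99.0]
in_stock = [True, False, True, True]

# Evaluating `price < 100 and in_stock = true` is a scan over both arrays,
# with no text processing involved.
matching_ids = [
    doc_id for doc_id in range(len(price))
    if price[doc_id] < 100 and in_stock[doc_id]
]
print(matching_ids)  # → [0, 2, 3]
```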
Combine multiple query types:
select * from article 
where 
    userQuery() 
    and published_date > 1609459200
    and {targetHits:20}nearestNeighbor(embedding, query_embedding)
  • Combines text, structured, and vector search
  • Efficient query execution with multiple indexes

Query Execution

Query execution is implemented in the searchlib module, which provides the core matching algorithms used by Proton (the content node server).

Query Language

Vespa supports two primary query languages:

YQL (Vespa Query Language)

SQL-like syntax for queries:
select * from music where title contains "love"

Simple Query Language

Simpler syntax for basic queries:
query=laptop&filter=price:<1000

Matching Operators

Vespa provides various operators for text matching:
Basic term matching:
where title contains "vespa"
Exact phrase matching:
where title contains phrase("search", "engine")
Terms within a distance:
where title contains near("search", "engine")
Ordered terms within a distance:
where title contains onear("search", "engine")
Match any of several terms:
where title contains equiv("car", "automobile", "vehicle")
Efficient OR-like matching with many terms:
where weakAnd(title contains "machine", title contains "learning", title contains "AI", title contains "neural")
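The proximity operators near and onear can be pictured over tokenized field content. A toy sketch (assuming the default distance of 2 between term positions):

```python
def positions(tokens, term):
    # All positions where the term occurs in the tokenized field.
    return [i for i, t in enumerate(tokens) if t == term]

def near(tokens, a, b, distance=2):
    # Unordered: some occurrences of a and b within `distance` positions.
    return any(abs(i - j) <= distance
               for i in positions(tokens, a)
               for j in positions(tokens, b))

def onear(tokens, a, b, distance=2):
    # Ordered: a must appear before b, still within `distance`.
    return any(0 < j - i <= distance
               for i in positions(tokens, a)
               for j in positions(tokens, b))

title = ["open", "source", "search", "engine", "internals"]
print(near(title, "engine", "search"))   # True: adjacent, order ignored
print(onear(title, "engine", "search"))  # False: wrong order
```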

Dispatching and Distribution

The container layer coordinates query execution across content nodes:

Scatter-Gather Pattern

1. Query Dispatch: the container sends the query to all content nodes covering the data
2. Parallel Execution: each content node executes the query on its data partition
3. Partial Results: each node returns its top-k results
4. Result Merging: the container merges the partial results into the final ranking
Implementation: container-search module handles query dispatch and result aggregation.
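The gather step can be sketched as a k-way merge of the sorted partial result lists (a simplified model of what container-search does):

```python
import heapq

# Each content node returns its own top-k as (score, doc) pairs,
# already sorted by descending score.
node_results = [
    [(0.91, "doc-a"), (0.55, "doc-d")],   # node 1
    [(0.87, "doc-b"), (0.40, "doc-e")],   # node 2
    [(0.62, "doc-c")],                    # node 3
]

def merge_results(partials, hits):
    # Merge the sorted partial lists into one global ranking and
    # keep only the requested number of hits.
    merged = heapq.merge(*partials, key=lambda hit: -hit[0])
    return [doc for _, doc in list(merged)[:hits]]

print(merge_results(node_results, 3))  # → ['doc-a', 'doc-b', 'doc-c']
```

Because each node only ships its top-k, the container merges small sorted lists rather than full result sets.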

Search Performance

Index Types and Performance

Inverted Index

Fast text search on indexed fields

Attribute Index

Fast filtering and ranking on attributes

HNSW Index

Fast approximate nearest neighbor search

B-tree Index

Fast dictionary lookups on attributes with fast-search enabled

Query Optimization

Vespa optimizes queries before execution:
  • Combining similar terms
  • Eliminating redundant clauses
  • Choosing the optimal execution order
Early termination stops searching when enough results are found:
  • Set the hits parameter for top-k queries
  • Use targetHits for approximate search
  • Combine with ranking thresholds
Parallel execution leverages multiple content nodes:
  • Data is automatically partitioned
  • Queries execute in parallel
  • Near-linear scalability as nodes are added
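Early termination can be sketched with a lazy scan that stops as soon as enough matches are found (a toy model for unranked top-k retrieval; ranked queries additionally need score thresholds):

```python
from itertools import islice

def scan(docs, predicate):
    # Lazily yield matching documents in index order.
    for doc in docs:
        if predicate(doc):
            yield doc

docs = list(range(1_000_000))

# Stop scanning once `hits` matches are found, instead of
# evaluating the predicate over the whole corpus.
hits = 5
first_matches = list(islice(scan(docs, lambda d: d % 7 == 0), hits))
print(first_matches)  # → [0, 7, 14, 21, 28]
```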

Filtering vs Searching

Understanding the difference is key to performance:

Filters

Filters use attributes for fast exact matching:
where price < 100 and category = "electronics"
  • Fast evaluation using forward indexes
  • No text processing overhead
  • Efficient for numeric and boolean comparisons
Search

Search uses inverted indexes for text matching:
where title contains "laptop"
  • Linguistic processing (stemming, tokenization)
  • Relevance scoring
  • Phrase and proximity matching

Combined Queries

Best performance comes from combining both:
select * from product 
where 
    title contains "laptop" and  -- Search
    price < 1000 and             -- Filter
    in_stock = true              -- Filter

Grouping and Aggregation

Vespa can group and aggregate results during search:
select * from product 
where category contains "electronics"
| all(group(brand) each(output(count())))
This returns counts per brand without retrieving all documents.
Implementation: grouping happens on content nodes before results are sent to the container, minimizing data transfer.
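The per-node aggregation and container-side merge can be sketched with counters (a toy model of group(brand) with count(), not Vespa's grouping engine):

```python
from collections import Counter

# Matched documents on one content node; only the grouped field is needed.
matched = [
    {"brand": "acme"}, {"brand": "globex"}, {"brand": "acme"},
    {"brand": "initech"}, {"brand": "acme"},
]

# Per-node aggregation, analogous to all(group(brand) each(output(count()))).
local_counts = Counter(doc["brand"] for doc in matched)

def merge_counts(partials):
    # The container sums the partial counts from every node.
    total = Counter()
    for counts in partials:
        total += counts
    return total

# Merge with a second (hypothetical) node's partial counts.
global_counts = merge_counts([local_counts, Counter({"acme": 1, "globex": 2})])
print(dict(global_counts))  # → {'acme': 4, 'globex': 3, 'initech': 1}
```

Only the small count tables cross the network, never the matched documents themselves.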

Search Implementation Architecture

Content Node (Proton)

The Proton server handles search on content nodes:
  • Module: searchcore
  • Manages document storage and indexes
  • Executes matching and first-phase ranking
  • Returns top results to container

Search Library

Core search algorithms:
  • Module: searchlib
  • Query evaluation and matching
  • Index implementations (inverted, attribute, HNSW)
  • Ranking framework (discussed in Ranking concepts)

Real-World Query Example

Here’s a complete hybrid search query:
{
  "yql": "select * from article where userQuery() and published_date > 1609459200 and {targetHits:10}nearestNeighbor(embedding, query_embedding)",
  "query": "machine learning",
  "ranking": "semantic_bm25_hybrid",
  "input.query(query_embedding)": [0.12, -0.45, 0.78, ...],
  "hits": 20
}
This query:
  1. Matches documents with “machine learning” text (BM25)
  2. Finds nearest neighbors in embedding space (ANN)
  3. Filters by publication date
  4. Ranks using a hybrid profile
  5. Returns top 20 results
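A request body like this can be built programmatically and POSTed as JSON to the container's query endpoint. A sketch (the embedding values and rank profile name are placeholders carried over from the example):

```python
import json

def hybrid_query(text, embedding, hits=20):
    # Assemble a hybrid text + vector query request body.
    return {
        "yql": (
            "select * from article where userQuery() "
            "and published_date > 1609459200 "
            "and {targetHits:10}nearestNeighbor(embedding, query_embedding)"
        ),
        "query": text,
        "ranking": "semantic_bm25_hybrid",
        "input.query(query_embedding)": embedding,
        "hits": hits,
    }

body = hybrid_query("machine learning", [0.12, -0.45, 0.78])
# POST json.dumps(body) to the container's /search/ endpoint,
# e.g. with urllib.request or the requests library.
print(json.dumps(body)[:40])
```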

Best Practices

Use Attributes for Filters

Mark filter fields as attribute in schema

Index Text Fields

Use index for fields you’ll search with text queries

Set targetHits

Control ANN search quality vs speed

Combine Query Types

Use hybrid queries for best relevance

Next Steps

Ranking

Learn how documents are scored

Schemas

Configure fields for search

Tensors

Use tensors for semantic search
