Vespa’s architecture consists of three main subsystems that work together to provide a scalable, distributed search and serving platform.

System Overview

Vespa is built around three core subsystems:
1. Stateless Container Layer
   Handles incoming requests and routes them to content nodes

2. Content Nodes
   Store data and execute queries (matching, ranking, aggregation)

3. Configuration System
   Manages configuration and application deployment

The Stateless Container

The stateless container layer is the entry point for all requests to Vespa. It’s built on the jDisc framework and consists entirely of Java components.

Container Components

jDisc core provides the foundational request-response handling model:
  • Protocol-independent request processing
  • HTTP and other protocol implementations
  • Application lifecycle management
Module: jdisc_core

The container builds on jDisc core with component management:
  • OSGi integration for component bundles
  • Dependency injection framework
  • Metrics and monitoring
  • HTTP connector
Modules: container-disc, component

The search part of the container handles query and result processing:
  • Query-Result processing framework (Searchers)
  • Query execution logic and dispatch
  • Scatter-gather across content nodes
  • Grouping and aggregation coordination
Module: container-search
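The Searcher idea listed above can be pictured as a chain in which each link may rewrite the query on the way down and post-process the result on the way back up. The following is a minimal, self-contained sketch of that pattern only; `SketchSearcher`, `SketchChain`, and the fake backend hits are hypothetical names invented for illustration and are not classes from container-search.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical searcher interface for this sketch; not a Vespa class.
interface SketchSearcher {
    List<String> search(String query, SketchChain rest);
}

// An immutable view of "the remaining searchers". The end of the chain
// stands in for dispatch to the content nodes.
final class SketchChain {
    private final List<SketchSearcher> searchers;
    private final int position;

    SketchChain(List<SketchSearcher> searchers) {
        this(searchers, 0);
    }

    private SketchChain(List<SketchSearcher> searchers, int position) {
        this.searchers = searchers;
        this.position = position;
    }

    List<String> search(String query) {
        if (position == searchers.size()) {
            // Placeholder backend: returns two fake hits for the query.
            return List.of("hit-1 for " + query, "hit-2 for " + query);
        }
        return searchers.get(position).search(query, new SketchChain(searchers, position + 1));
    }
}

final class SearcherChainSketch {
    static List<String> run(String userInput) {
        // Rewrites the query before passing it down the chain.
        SketchSearcher normalize = (query, rest) ->
                rest.search(query.trim().toLowerCase(Locale.ROOT));
        // Post-processes the result on the way back up: keeps only the first hit.
        SketchSearcher keepFirst = (query, rest) ->
                rest.search(query).stream().limit(1).collect(Collectors.toList());
        return new SketchChain(List.of(normalize, keepFirst)).search(userInput);
    }

    public static void main(String[] args) {
        System.out.println(run("  Vespa ARCHITECTURE ")); // prints [hit-1 for vespa architecture]
    }
}
```

Passing the rest of the chain into each searcher lets a searcher decide whether, and with what query, the remaining links run at all.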

Document Operations

The container layer also handles document write operations:
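// Document model - available in both Java and C++
// (abridged excerpt: the constructor delegated to below is not shown)
public class Document extends StructuredFieldValue {
    private DocumentId docId;  // identifier of this document
    private Struct content;    // the document's field values

    // Convenience constructor: wraps the string id in a DocumentId
    public Document(DocumentType docType, String id) {
        this(docType, new DocumentId(id));
    }
}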
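To make a write operation concrete, here is a hedged sketch of how a client might build a put request against the container's `/document/v1` HTTP API. The endpoint `http://localhost:8080`, the namespace `mynamespace`, the document type `music`, and the `title` field are all assumptions chosen for illustration. The request is only constructed and printed, not sent, and the path segments are assumed to be URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class DocumentPutSketch {
    // Assumed container endpoint; adjust to the actual deployment.
    static final String ENDPOINT = "http://localhost:8080";

    // Builds the string form of a document id from its parts.
    static String documentId(String namespace, String docType, String userId) {
        return "id:" + namespace + ":" + docType + "::" + userId;
    }

    // Builds (but does not send) a POST request that writes one document.
    static HttpRequest putRequest(String namespace, String docType, String userId, String fieldsJson) {
        URI uri = URI.create(ENDPOINT + "/document/v1/" + namespace + "/" + docType + "/docid/" + userId);
        String body = "{\"fields\": " + fieldsJson + "}";
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = putRequest("mynamespace", "music", "1", "{\"title\": \"Example\"}");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("document id: " + documentId("mynamespace", "music", "1"));
        // To actually send it against a running instance:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```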

Content Nodes

Content nodes are where the data lives and where the heavy lifting of search happens. They are written in C++ for performance.

Core Responsibilities

Data Storage

Persistent storage with automatic recovery and replication

Indexing

Maintains forward and reverse (inverted) indexes in real time

Matching

Finds documents matching the query criteria

Ranking

Scores documents using configurable rank profiles
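The interplay of these four responsibilities can be illustrated with a toy in-memory model: a reverse index maps terms to document ids for matching, and a forward index maps document ids to values used when ranking. This is a deliberately simplified sketch with hypothetical names (`ContentNodeSketch`, `popularity`); it omits persistence and replication and is not how Proton is implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ContentNodeSketch {
    // Reverse (inverted) index: term -> ids of documents containing it.
    private final Map<String, Set<Integer>> reverseIndex = new HashMap<>();
    // Forward index: document id -> a numeric value used in ranking.
    private final Map<Integer, Double> popularity = new HashMap<>();

    // Indexing: both structures are updated as part of the write,
    // so the document is searchable immediately afterwards.
    void put(int docId, String text, double popularityValue) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            reverseIndex.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        popularity.put(docId, popularityValue);
    }

    // Matching: look the term up in the reverse index.
    // Ranking: order the matches by the forward-index value, highest first.
    List<Integer> search(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        List<Integer> matches = new ArrayList<>(reverseIndex.getOrDefault(key, Set.of()));
        matches.sort(Comparator.comparingDouble((Integer id) -> popularity.get(id)).reversed());
        return matches;
    }

    public static void main(String[] args) {
        ContentNodeSketch node = new ContentNodeSketch();
        node.put(1, "red guitar", 0.2);
        node.put(2, "blue guitar", 0.9);
        node.put(3, "red drum", 0.5);
        System.out.println(node.search("guitar")); // prints [2, 1]
    }
}
```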

Key Components

Proton - The content node server
  • Module: searchcore
  • Core functionality for indexes, matching, storage, and grouping
Search Library
  • Module: searchlib
  • Ranking framework (feature execution)
  • Index and btree implementations
  • Attributes (forward indexes)
  • Java libraries for ranking
Storage System
  • Module: storage
  • Elastic, auto-recovering data storage
  • Distribution and replication across clusters
Evaluation Engine
  • Module: eval
  • Efficient evaluation of ranking expressions
  • Tensor API and operations
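As a rough illustration of what evaluating a tensor-based ranking expression involves, the sketch below computes a sparse dot product, which is approximately what an expression of the form `sum(query(q) * attribute(d))` reduces to when both tensors share one mapped dimension. The tensor names `q` and `d`, the label-to-value map representation, and the class name are assumptions made for this example; the eval module's actual data structures are not shown here.

```java
import java.util.Map;

final class TensorRankSketch {
    // Sparse single-dimension tensors modeled as label -> cell value.
    // Multiplying joins cells with equal labels; summing reduces to one score.
    static double sparseDot(Map<String, Double> a, Map<String, Double> b) {
        double sum = 0.0;
        for (Map.Entry<String, Double> cell : a.entrySet()) {
            Double other = b.get(cell.getKey());
            if (other != null) {
                sum += cell.getValue() * other;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> queryInterests = Map.of("rock", 1.0, "jazz", 0.5);
        Map<String, Double> documentGenres = Map.of("jazz", 2.0, "pop", 3.0);
        // Only "jazz" appears in both tensors, so it alone contributes to the score.
        System.out.println(sparseDot(queryInterests, documentGenres));
    }
}
```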

Configuration and Administration

The configuration system manages the entire Vespa deployment. It is implemented in Java.

Configuration Flow

Key Components

The config server provides central configuration management:
  • Receives application deployments
  • Serves configuration to all nodes
  • Validates application packages
Module: configserver

The config model represents the running system:
  • Processes application package into configs
  • Returns config instances by type and ID
  • Validates configuration consistency
Module: config-model

Node-side configuration:
  • Subscribes to configs by type and ID
  • Available in both Java and C++
  • Automatic updates on config changes
Modules: config, config-proxy
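The subscribe-by-type-and-ID model with automatic updates can be sketched in a few lines. In this toy version a deployment pushes new values directly to registered listeners; the class name `ConfigSketch`, the string payloads, and the push-style delivery are simplifying assumptions for illustration and say nothing about how the config and config-proxy modules transport configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class ConfigSketch {
    private final Map<String, String> current = new HashMap<>();
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    private static String key(String type, String id) {
        return type + "@" + id;
    }

    // Node side: subscribe by config type and ID. If a value is already
    // deployed, it is delivered immediately.
    void subscribe(String type, String id, Consumer<String> listener) {
        String k = key(type, id);
        subscribers.computeIfAbsent(k, x -> new ArrayList<>()).add(listener);
        String value = current.get(k);
        if (value != null) {
            listener.accept(value);
        }
    }

    // Server side: a deployment produces a new value for (type, ID)
    // and every subscriber of that pair is updated.
    void deploy(String type, String id, String payload) {
        String k = key(type, id);
        current.put(k, payload);
        for (Consumer<String> listener : subscribers.getOrDefault(k, List.of())) {
            listener.accept(payload);
        }
    }

    public static void main(String[] args) {
        ConfigSketch server = new ConfigSketch();
        server.subscribe("logging", "container/0", value -> System.out.println("container/0 got " + value));
        server.deploy("logging", "container/0", "level=info");
        server.deploy("logging", "container/0", "level=debug");
    }
}
```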

Code Map Reference

Vespa consists of approximately 1.7 million lines of code, split about equally between Java and C++. The codebase is organized into about 150 modules in a flat structure.
For a complete reference of all modules and their relationships, see the Code Map in the Vespa repository.

General Utility Libraries

  • vespalib - C++ utility library
  • vespajlib - Java utility library (includes tensor implementation)

Request Flow

Here’s how a typical search request flows through Vespa:
1. Request Arrives
   Client sends HTTP request to container cluster

2. Query Processing
   Container processes query through searcher chain

3. Dispatch
   Container dispatches query to relevant content nodes

4. Matching
   Content nodes find matching documents

5. Ranking
   Content nodes score documents using rank profiles

6. Aggregation
   Container aggregates results from all content nodes

7. Response
   Formatted results returned to client
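The dispatch and aggregation steps follow a scatter-gather pattern: the query goes to every relevant content node in parallel, each node returns its locally ranked hits, and the container merges them into one global top-k list. The sketch below models content nodes as plain functions; `ScatterGatherSketch`, `Hit`, and the use of `CompletableFuture` are illustrative choices, not Vespa's dispatch implementation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

final class ScatterGatherSketch {
    // A scored hit as returned by one content node (hypothetical type).
    static final class Hit {
        final String docId;
        final double score;

        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }

        @Override
        public String toString() {
            return docId + "=" + score;
        }
    }

    // Scatter the query to every node in parallel, then gather the partial
    // results and keep the k highest-scoring hits overall.
    static List<Hit> search(String query, List<Function<String, List<Hit>>> nodes, int k) {
        // Start all requests before waiting on any of them.
        List<CompletableFuture<List<Hit>>> pending = nodes.stream()
                .map(node -> CompletableFuture.supplyAsync(() -> node.apply(query)))
                .collect(Collectors.toList());
        return pending.stream()
                .flatMap(future -> future.join().stream())
                .sorted(Comparator.comparingDouble((Hit hit) -> hit.score).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Function<String, List<Hit>> nodeA = q -> List.of(new Hit("a", 0.9), new Hit("b", 0.1));
        Function<String, List<Hit>> nodeB = q -> List.of(new Hit("c", 0.5));
        System.out.println(search("guitar", List.of(nodeA, nodeB), 2)); // prints [a=0.9, c=0.5]
    }
}
```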

Scalability and Distribution

Horizontal Scaling

Add more container or content nodes as needed

Data Distribution

Automatic sharding across content nodes

Replication

Configurable redundancy for high availability

Auto-Recovery

Automatic data redistribution on node failures
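To show how sharding, redundancy, and redistribution on failure can fit together, the sketch below assigns each data bucket to `redundancy` nodes using rendezvous (highest-random-weight) hashing. This is a well-known technique swapped in for illustration; the source does not describe Vespa's own distribution algorithm, and the names `DistributionSketch`, `replicasFor`, and `weight` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DistributionSketch {
    // Ranks every node by a hash of (bucket, node) and picks the top
    // `redundancy` nodes as the replicas for that bucket.
    static List<String> replicasFor(int bucket, List<String> nodes, int redundancy) {
        List<String> ranked = new ArrayList<>(nodes);
        ranked.sort(Comparator.comparingLong((String node) -> weight(bucket, node))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return List.copyOf(ranked.subList(0, Math.min(redundancy, ranked.size())));
    }

    // Mixes the bucket number and node name into a pseudo-random weight.
    static long weight(int bucket, String node) {
        long h = (bucket * 0x9E3779B97F4A7C15L) ^ node.hashCode();
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }

    public static void main(String[] args) {
        List<String> all = List.of("node0", "node1", "node2", "node3");
        List<String> afterFailure = List.of("node0", "node1", "node3"); // node2 is down
        int changed = 0;
        for (int bucket = 0; bucket < 1000; bucket++) {
            if (!replicasFor(bucket, all, 2).equals(replicasFor(bucket, afterFailure, 2))) {
                changed++;
            }
        }
        // Only buckets that had a replica on node2 need a new placement.
        System.out.println("buckets with changed placement: " + changed + " of 1000");
    }
}
```

With this scheme, removing a node changes placement only for buckets that had a replica on it, and adding a node takes over a share of buckets without reshuffling the rest.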

Next Steps

Documents

Learn about the document model

Schemas

Define your data structures

Search

Understand how search works
