Code Map

Vespa consists of approximately 1.7 million lines of code, split about equally between Java and C++. This guide maps the functional elements of Vespa to the most important of its roughly 150 modules.

The code is written by a team selected for their ability to do this work unusually well, with time to dedicate to it long-term. While the code is mostly easy to work with, the module structure was not designed with newcomers in mind: all modules sit side by side in a flat structure, with no grouping by subsystem.

Architecture Overview

Vespa’s architecture consists of three major subsystems:

Stateless Container

Request handling, query processing, and document operations (Java)

Content Nodes

Data storage, indexing, matching, and ranking (C++)

Configuration

Application deployment, configuration management, and administration (Java)

The Stateless Container

When a request enters Vespa, it first reaches a stateless container cluster. The nodes in this cluster run the jDisc container, which is implemented entirely in Java and consists of multiple layers:

jDisc Core

Provides the foundation for request-response handling:

jdisc_core

Core jDisc functionality providing:

Application model for running services
Protocol-independent request-response handling
Various protocol implementations
Network I/O abstraction

jDisc Container

Layered on jDisc core, providing component infrastructure:

container-disc

Core container functionality:

Metrics collection
OSGi integration for component bundles
Dependency injection
HTTP connector
Integration between container and core layers

component

The component model - Java components implement or subclass types from this module

Search Container

Layered on jDisc container for query processing:

container-search

Query processing framework including:

Query-Result processing (Searchers)
Generic processing framework
Query profiles
Global query execution logic
Dispatch (scatter-gather)
Grouping and aggregation
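
The Query-Result processing model can be pictured as a chain in which each Searcher may rewrite the query, hand it to the rest of the chain, and then post-process the result. The sketch below is a language-neutral illustration in Python, not the Java API of container-search: `Query`, `Result`, `Execution`, and the three searcher functions are simplified, hypothetical stand-ins, and the "backend" is a hard-coded list rather than a dispatch to content nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Query:
    text: str
    hits: int = 10


@dataclass
class Result:
    query: Query
    hits: List[str] = field(default_factory=list)


class Execution:
    """Walks a chain of searchers; a searcher calls search() to run the rest."""

    def __init__(self, chain: List[Callable], index: int = 0) -> None:
        self._chain = chain
        self._index = index

    def search(self, query: Query) -> Result:
        if self._index >= len(self._chain):
            return Result(query)  # end of chain: empty result
        searcher = self._chain[self._index]
        return searcher(query, Execution(self._chain, self._index + 1))


def lowercase_searcher(query: Query, execution: Execution) -> Result:
    # Rewrites the query on the way down the chain.
    query.text = query.text.lower()
    return execution.search(query)


def numbering_searcher(query: Query, execution: Execution) -> Result:
    # Post-processes the result on the way back up the chain.
    result = execution.search(query)
    result.hits = [f"{i + 1}. {hit}" for i, hit in enumerate(result.hits)]
    return result


def backend_searcher(query: Query, execution: Execution) -> Result:
    # Hypothetical stand-in for dispatching the query to content nodes.
    corpus = ["vespa code map", "jdisc core", "vespa ranking"]
    matches = [doc for doc in corpus if query.text in doc]
    return Result(query, matches[: query.hits])


chain = [lowercase_searcher, numbering_searcher, backend_searcher]
result = Execution(chain).search(Query("VESPA"))
```

Because every searcher decides when to call the rest of the chain, the same mechanism covers both query rewriting and result post-processing.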

Document Operation Modules

Handling document writes and updates:

document

Location: vespa/document

The document model, implemented in both Java and C++:

Documents, fields, and document types
Operations on documents (put, update, remove)
Document serialization
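
As a rough illustration of the three operation types, the sketch below applies put, update, and remove to an in-memory store. `DocumentStore` and its dict-based documents are hypothetical and are not the types of the document module. Ignoring an update to a missing document is a simplifying assumption made for this example, and the document id string is only sample data.

```python
from typing import Dict, Optional


class DocumentStore:
    """Hypothetical in-memory store; documents are plain dicts keyed by id."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def put(self, doc_id: str, fields: dict) -> None:
        # A put writes the whole document, replacing any previous version.
        self._docs[doc_id] = dict(fields)

    def update(self, doc_id: str, changes: dict) -> bool:
        # An update changes only the named fields of an existing document.
        doc = self._docs.get(doc_id)
        if doc is None:
            return False  # assumption: updates to missing documents are ignored
        doc.update(changes)
        return True

    def remove(self, doc_id: str) -> bool:
        # Returns True only if a document was actually removed.
        return self._docs.pop(doc_id, None) is not None

    def get(self, doc_id: str) -> Optional[dict]:
        return self._docs.get(doc_id)


store = DocumentStore()
store.put("id:mynamespace:music::1", {"title": "Hello", "year": 2015})
store.update("id:mynamespace:music::1", {"year": 2016})
```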

messagebus

Location: vespa/messagebus

Generic async, multi-hop message passing:

Implemented in both Java and C++
Reliable message routing
Load balancing across nodes
Throttling and flow control

documentapi

Location: vespa/documentapi

API for issuing document operations to Vespa over messagebus:

Document put, update, remove operations
Visiting (bulk read) API
Batch operations

docproc

Location: vespa/docproc

Chainable document processors:

Process documents before indexing
Transform, enrich, or validate documents
Custom document processing logic
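
To make "chainable" concrete, the sketch below runs a document through an ordered list of processors and stops at the first failure. It is a simplified model written in Python, not the docproc Java API: the `Progress` enum, the processor functions, and `run_chain` are hypothetical names chosen for this example.

```python
from enum import Enum
from typing import Callable, Dict, List


class Progress(Enum):
    DONE = "done"
    FAILED = "failed"


Processor = Callable[[Dict[str, object]], Progress]


def trim_title(doc: Dict[str, object]) -> Progress:
    # Transform: normalize whitespace around the title.
    doc["title"] = str(doc.get("title", "")).strip()
    return Progress.DONE


def require_title(doc: Dict[str, object]) -> Progress:
    # Validate: reject documents without a title.
    return Progress.DONE if doc.get("title") else Progress.FAILED


def add_title_length(doc: Dict[str, object]) -> Progress:
    # Enrich: add a field computed from another field.
    doc["title_length"] = len(str(doc["title"]))
    return Progress.DONE


def run_chain(chain: List[Processor], doc: Dict[str, object]) -> Progress:
    """Runs processors in order and stops at the first failure."""
    for processor in chain:
        if processor(doc) is Progress.FAILED:
            return Progress.FAILED
    return Progress.DONE


chain = [trim_title, require_title, add_title_length]
doc = {"title": "  Code Map "}
status = run_chain(chain, doc)
```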

indexinglanguage

Location: vespa/indexinglanguage

Implementation of the “indexing” language:

Expressions used in schema `indexing:` statements
Field transformations
Derived field computation

docprocs

Location: vespa/docprocs

Document processor components bundled with Vespa:

IndexingProcessor - Executes indexing language statements
Standard document transformations

vespaclient-container-plugin

Location: vespa/vespaclient-container-plugin

Implements HTTP APIs for document operations:

/document/v1/ REST API
Internal API used by Java HTTP client
Forwards to Document API
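
For orientation, this is how a client might build the request path for a single document under /document/v1/. The layout assumed here, `/document/v1/<namespace>/<document-type>/docid/<id>`, should be checked against the API reference before relying on it. `document_v1_path` is a hypothetical helper, and the namespace, document type, and id are sample values.

```python
from urllib.parse import quote


def document_v1_path(namespace: str, doctype: str, user_id: str) -> str:
    """Builds a /document/v1/ path, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (namespace, doctype, user_id)]
    return "/document/v1/{}/{}/docid/{}".format(*parts)


path = document_v1_path("mynamespace", "music", "a head full of dreams")
```

Encoding each segment separately keeps characters such as spaces or slashes in an id from changing the structure of the path.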

vespa-feed-client

Location: vespa/vespa-feed-client

High-performance client for writing documents:

Async, pipelined writes
Automatic retries and throttling
Uses internal API for optimal performance
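
The combination of pipelining, throttling, and retries can be sketched with asyncio. This is an illustration of the general technique only and does not reflect how vespa-feed-client is implemented: `feed_all`, `flaky_send`, and the fixed `max_inflight` limit are assumptions for the example (a real client may adjust its limit dynamically), and the transport is simulated.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def feed_all(
    operations: List[str],
    send: Callable[[str], Awaitable[str]],
    max_inflight: int = 4,
    max_attempts: int = 3,
) -> List[str]:
    """Sends operations concurrently, bounded by max_inflight, with retries."""
    semaphore = asyncio.Semaphore(max_inflight)  # simple static throttle

    async def feed_one(operation: str) -> str:
        async with semaphore:
            for attempt in range(1, max_attempts + 1):
                try:
                    return await send(operation)
                except ConnectionError:
                    if attempt == max_attempts:
                        raise  # give up after the last attempt
                    await asyncio.sleep(0)  # placeholder for a real backoff

    # gather() returns results in the same order as the input operations.
    return list(await asyncio.gather(*(feed_one(op) for op in operations)))


attempts: Dict[str, int] = {}


async def flaky_send(operation: str) -> str:
    # Simulated transport: the first attempt for every operation fails.
    attempts[operation] = attempts.get(operation, 0) + 1
    if attempts[operation] == 1:
        raise ConnectionError("simulated transient failure")
    return f"ok:{operation}"


results = asyncio.run(feed_all(["put-1", "put-2", "update-1"], flaky_send))
```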

Content Nodes

Content nodes store all data, maintain indexes, and perform distributed query execution. This subsystem is written mostly in C++ for performance; the cluster controller and some searchlib libraries are Java.

Core Content Components

searchcore

Location: vespa/searchcore

Core functionality for content nodes:

Proton - The content node server itself
Index maintenance (real-time indexing)
Matching (document selection)
Data storage and retrieval
Grouping and aggregation
Document-level operations

searchlib

Location: vespa/searchlib

Libraries invoked by searchcore:

Ranking:

Feature execution framework (fef)
Rank feature implementations
Ranking expression evaluation

Indexing:

Index implementations
B-tree structures
Attributes (forward indexes)

Java components:

Java ranking libraries
Query tree representations

storage

Location: vespa/storage

Elastic, auto-recovering data storage:

Distribution across cluster nodes
Bucket management
Replica maintenance
Consistency guarantees
Garbage collection
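
The idea of distributing buckets across nodes with a fixed number of replicas can be illustrated with rendezvous (highest-random-weight) hashing. This technique is swapped in purely for illustration and is not claimed to be the algorithm the storage module uses. `bucket_of`, `replica_nodes`, the MD5-based scoring, and the default of 8 bucket bits are all assumptions of this sketch.

```python
import hashlib
from typing import List, Sequence


def bucket_of(doc_id: str, bucket_bits: int = 8) -> int:
    """Maps a document id to one of 2**bucket_bits buckets."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << bucket_bits)


def replica_nodes(bucket: int, nodes: Sequence[str], redundancy: int) -> List[str]:
    """Picks the `redundancy` nodes with the highest hash score for a bucket."""

    def score(node: str) -> bytes:
        return hashlib.md5(f"{bucket}:{node}".encode("utf-8")).digest()

    return sorted(nodes, key=score, reverse=True)[:redundancy]


nodes = ["node0", "node1", "node2", "node3", "node4"]
bucket = bucket_of("id:mynamespace:music::1")
replicas = replica_nodes(bucket, nodes, redundancy=2)
```

A useful property of this scheme is that removing a node which holds no replica of a bucket leaves that bucket's placement unchanged, so only the buckets on a lost node need new replicas.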

eval

Location: vespa/eval

Efficient evaluation of ranking expressions:

Tensor API and implementation
Expression optimization
ONNX model integration
SIMD and GPU acceleration

storageapi

Location: vespa/storageapi

MessageBus messages for storage:

Document API protocol implementation
Storage operation messages
Internal storage communication

clustercontroller-core

Location: vespa/clustercontroller-core

Cluster controller for storage (Java):

Node-level decision-making
State management via ZooKeeper
Cluster health monitoring
Automatic failover

Configuration and Administration

The third major subsystem manages configuration, clusters, and application deployment. It is implemented mainly in Java; the client-side config library also exists in C++.

Configuration System

configserver

Location: vespa/configserver

The server where applications are deployed:

Application package deployment
Configuration generation
Serving config to nodes
Application lifecycle management

config-model

Location: vespa/config-model

Model of the running system:

Derives configuration from application package
Returns config instances by type and id
Validates application structure
Manages service topology

config

Location: vespa/config

Client-side configuration library (Java and C++):

Subscribing to configs by type and id
Reading config payloads
Automatic config updates
Config caching
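
Subscribing by type and id means that a config is addressed by a pair: what kind of config it is, and which part of the application it is for. The sketch below models that addressing and the push of updates to subscribers. `ConfigSource`, its methods, and the type and id strings are hypothetical and do not mirror the config library's classes.

```python
from typing import Callable, Dict, List, Tuple

ConfigKey = Tuple[str, str]  # (config type name, config id)
Listener = Callable[[dict], None]


class ConfigSource:
    """Hypothetical source that keeps the latest payload per (type, id)."""

    def __init__(self) -> None:
        self._payloads: Dict[ConfigKey, dict] = {}
        self._listeners: Dict[ConfigKey, List[Listener]] = {}

    def subscribe(self, config_type: str, config_id: str, listener: Listener) -> None:
        key = (config_type, config_id)
        self._listeners.setdefault(key, []).append(listener)
        if key in self._payloads:
            listener(self._payloads[key])  # deliver the cached config at once

    def publish(self, config_type: str, config_id: str, payload: dict) -> None:
        key = (config_type, config_id)
        self._payloads[key] = payload  # cache for later subscribers
        for listener in self._listeners.get(key, []):
            listener(payload)  # push the update to current subscribers


source = ConfigSource()
seen: List[dict] = []
source.subscribe("example-type", "cluster/node0", seen.append)
source.publish("example-type", "cluster/node0", {"port": 1})
source.publish("example-type", "cluster/node1", {"port": 9})  # different id
source.publish("example-type", "cluster/node0", {"port": 2})
```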

configgen

Location: vespa/configgen

Code generation for configs:

Generates C++ config classes
Generates Java config classes
Type-safe config reading and building

config-proxy

Location: vespa/config-proxy

Node-local config proxy:

Caches configs on each node
Reduces config server load
Provides config during restarts

configdefinitions

Location: vespa/configdefinitions

Shared config type definitions:

Config .def files
Referenced by multiple modules
System-wide configuration schemas

General Utility Libraries

Libraries used throughout the Vespa codebase:

vespalib

General utility library for C++:

Data structures (hash maps, arrays)
Threading and synchronization
Memory management
String utilities
Network utilities

vespajlib

General utility library for Java:

Collections and data structures
Java tensor implementation
Text processing
Utilities and helpers

Finding Your Way

When working with the Vespa codebase:

Identify the functional area

Determine which subsystem your change affects:

Query processing → Container modules
Indexing/ranking → Content node modules
Configuration/deployment → Config system

Find the relevant module

Use this code map to identify the specific module:

Module names usually indicate their purpose
Check module README files for details
Look at OWNERS files for experts

Explore the module structure

Within each module:

src/main/ or src/vespa/ - Production code
src/test/ or src/tests/ - Test code
README.md - Module documentation
CMakeLists.txt or pom.xml - Build configuration
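
A quick way to tell which language a module is built in is to check which build file it contains. The helper below is a small sketch: `module_languages` is a hypothetical name, and it assumes that `pom.xml` marks a Java (Maven) build and `CMakeLists.txt` a C++ (CMake) build, with both present for modules that contain both languages.

```python
from pathlib import Path
from typing import Set


def module_languages(module_dir: Path) -> Set[str]:
    """Guesses a module's languages from the build files in its directory."""
    languages = set()
    if (module_dir / "pom.xml").is_file():
        languages.add("java")  # assumption: Maven build implies Java code
    if (module_dir / "CMakeLists.txt").is_file():
        languages.add("cpp")  # assumption: CMake build implies C++ code
    return languages
```

For example, `module_languages(Path("document"))`, run from the root of a checkout, would report both languages if the module contains both build files.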

Module Categories

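Many modules follow loose naming patterns. Treat these as hints rather than rules: the patterns overlap, and `vespaclient-container-plugin`, for example, is a container component rather than a build plugin.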

| Pattern | Purpose | Examples |
| --- | --- | --- |
| `container-*` | Container components | `container-search`, `container-disc` |
| `config*` | Configuration system | `configserver`, `config-model` |
| `search*` | Search and ranking | `searchcore`, `searchlib` |
| `*-plugin` | Build plugins | `bundle-plugin`, `config-class-plugin` |
| `vespa*` | Core utilities | `vespalib`, `vespajlib` |
| `jdisc*` | jDisc framework | `jdisc_core`, `jdisc_http_service` |
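
The patterns in the table can be applied mechanically with glob matching, which also shows where they overlap. The sketch below copies the patterns and purposes from the table; `NAMING_PATTERNS` and `categories_for` are hypothetical names for this example.

```python
from fnmatch import fnmatchcase
from typing import List

# Patterns and purposes as listed in the table, in the same order.
NAMING_PATTERNS = [
    ("container-*", "Container components"),
    ("config*", "Configuration system"),
    ("search*", "Search and ranking"),
    ("*-plugin", "Build plugins"),
    ("vespa*", "Core utilities"),
    ("jdisc*", "jDisc framework"),
]


def categories_for(module: str) -> List[str]:
    """Returns every purpose whose pattern matches the module name."""
    return [
        purpose
        for pattern, purpose in NAMING_PATTERNS
        if fnmatchcase(module, pattern)
    ]
```

A name such as `config-class-plugin` matches two patterns, and a name such as `storage` matches none, so the result is best read as a starting point for where to look.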

Additional Resources

TODO List

Larger features nobody is working on yet

Module READMEs

Each module has detailed documentation in its README.md file

OWNERS Files

Find subject matter experts for code areas

Code Search

Search the codebase on GitHub

Not Covered Here

This map focuses on the modules you are most likely to encounter as a developer. The remaining modules fall into one of these groups:

Small and self-explanatory
Implementing specific technical requirements
Part of the Vespa Cloud service (not expected to be modified externally)

For a complete list, browse the Vespa repository.

Next Steps

Building Vespa

Build the modules you want to work on

Running Tests

Test your changes

Development Overview

Return to development overview

Contributing

Learn the contribution process
