What is a Document?
A document is an identifiable set of value bindings of a document type. Think of it as a structured record with:
- A unique document ID
- A document type defining its structure
- A set of fields containing values
Documents in Vespa are implemented in both Java and C++, allowing them to be used throughout the system from the container layer to content nodes.
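The three parts listed above (ID, type, field values) can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: `DocumentTypeSketch` and `DocumentSketch` are hypothetical names invented for illustration, not the classes in Vespa's document module, and the field check is deliberately simplistic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a document type: a name plus the field names it declares.
record DocumentTypeSketch(String name, Set<String> fieldNames) {}

// Hypothetical stand-in for a document: an ID, a type, and field-to-value bindings.
final class DocumentSketch {
    private final String id;
    private final DocumentTypeSketch type;
    private final Map<String, Object> fields = new LinkedHashMap<>();

    DocumentSketch(DocumentTypeSketch type, String id) {
        this.type = type;
        this.id = id;
    }

    // Binds a value to a field, rejecting fields the type does not declare.
    void setFieldValue(String field, Object value) {
        if (!type.fieldNames().contains(field)) {
            throw new IllegalArgumentException(
                "No field '" + field + "' in document type " + type.name());
        }
        fields.put(field, value);
    }

    // Returns the bound value, or null if the field has no value yet.
    Object getFieldValue(String field) {
        return fields.get(field);
    }

    String id() { return id; }

    DocumentTypeSketch type() { return type; }
}
```

In this model, a document is only valid relative to its type: setting a field the type does not declare fails immediately instead of storing an unknown binding.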
Document Structure
The Java implementation’s core document structure combines the three parts listed above: a document ID, a document type, and a set of field values.
Document ID
Every document must have a unique identifier. Document IDs follow this format: id:<namespace>:<document-type>:<key/value-pairs>:<user-specified>
Examples
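A Vespa document ID is a colon-separated string of the form `id:<namespace>:<document-type>:<key/value-pairs>:<user-specified>`, where the key/value part may be empty. The sketch below splits such a string into its parts. `DocIdParts` is a hypothetical helper written for this page, not Vespa's `DocumentId` class, and the namespace and type names in the comments are made up.

```java
// Hypothetical helper for illustration; not Vespa's DocumentId class.
record DocIdParts(String namespace, String documentType,
                  String keyValues, String userSpecified) {

    // Example IDs this sketch accepts:
    //   id:mynamespace:music::a-head-full-of-dreams   (no key/value part)
    //   id:mynamespace:music:n=123:song-1             (numeric key n)
    //   id:mynamespace:music:g=mygroup:song-2         (group key g)
    static DocIdParts parse(String id) {
        // A limit of 5 keeps any ':' inside the user-specified part intact.
        String[] parts = id.split(":", 5);
        if (parts.length != 5 || !parts[0].equals("id")) {
            throw new IllegalArgumentException("Not a document ID: " + id);
        }
        return new DocIdParts(parts[1], parts[2], parts[3], parts[4]);
    }
}
```

For example, `DocIdParts.parse("id:mynamespace:music::a-head-full-of-dreams")` yields the namespace `mynamespace`, the document type `music`, an empty key/value part, and the user-specified part `a-head-full-of-dreams`.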
Document Types
A document type defines the structure and fields of documents. It’s similar to a table schema in relational databases.
Document Type Features
Inheritance
Document types can inherit from other types
Field Sets
Group fields so they can be searched or retrieved together
Struct Types
Nested structured data within documents
Imported Fields
Reference fields from other document types
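To make the schema analogy and the inheritance feature concrete, here is a minimal plain-Java sketch of a type that declares fields and can inherit from parent types. `TypeSketch` and its methods are hypothetical names for illustration, data types are plain strings, and letting a type's own declarations override inherited ones is a choice made for this sketch, not a statement about how Vespa resolves clashes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a document type; not Vespa's DocumentType class.
final class TypeSketch {
    private final String name;
    private final List<TypeSketch> parents = new ArrayList<>();
    // Field name -> data type name, in declaration order.
    private final Map<String, String> ownFields = new LinkedHashMap<>();

    TypeSketch(String name) {
        this.name = name;
    }

    TypeSketch addField(String fieldName, String dataType) {
        ownFields.put(fieldName, dataType);
        return this;
    }

    TypeSketch inherit(TypeSketch parent) {
        parents.add(parent);
        return this;
    }

    // Inherited fields first, then this type's own fields.
    // In this sketch, own declarations win on name clashes.
    Map<String, String> allFields() {
        Map<String, String> all = new LinkedHashMap<>();
        for (TypeSketch parent : parents) {
            all.putAll(parent.allFields());
        }
        all.putAll(ownFields);
        return all;
    }

    String name() { return name; }
}
```

A type built as `new TypeSketch("book").inherit(item).addField("isbn", "string")`, where `item` declares `title`, then reports both `title` and `isbn` from `allFields()`.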
Fields
Fields are the individual data elements within a document. Each field has:
- A name (identifier)
- A data type (string, int, tensor, etc.)
- Optional attributes for indexing and storage behavior
Field Types
Primitive Types
Basic scalar values:
- string: Text data
- int, long: Integer numbers
- float, double: Floating-point numbers
- bool: Boolean values
- byte: Single byte values
Complex Types
Structured data:
- array<T>: Ordered collection of values
- weightedset<T>: Set with associated weights
- map<K,V>: Key-value pairs
- tensor<T>(dimensions): Multi-dimensional arrays
- struct: Custom structured types
Reference Types
Links to other documents:
- reference<document-type>: Link to another document
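As a rough intuition for the three categories, the sketch below pairs several field types with the closest everyday Java value. It is an analogy only: the class name, field names, and values are invented for illustration, and Vespa's own field value classes are not used here. Tensors and structs are left out because they have no simple one-line Java counterpart.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration of field types as plain Java values.
final class FieldTypeAnalogues {
    static Map<String, Object> sampleFields() {
        return Map.of(
            "title", "Kind of Blue",                            // string
            "year", 1959,                                       // int
            "in_stock", true,                                   // bool
            "tracks", List.of("So What", "Blue in Green"),      // array<string>
            "tags", Map.of("jazz", 10, "modal", 5),             // weightedset<string>
            "credits", Map.of("trumpet", "Miles Davis"),        // map<string,string>
            "artist_ref", "id:mynamespace:artist::miles-davis"  // reference<artist>: a document ID
        );
    }
}
```

The weighted set is modelled as a map from element to integer weight, and the reference as the ID of the document it points to.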
Working with Documents
Creating Documents
Reading Documents
Document Operations
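The basic operations on a document are put, update, remove, and get. The following in-memory sketch illustrates their semantics, in particular the difference between a put (write the whole document) and an update (change only some fields). `DocumentStoreSketch` is a hypothetical class for illustration; it is not the Vespa Document API and ignores types, persistence, and distribution.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store illustrating operation semantics only.
final class DocumentStoreSketch {
    // Document ID -> field name -> value.
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Put: write the full document, replacing any existing one with the same ID.
    void put(String id, Map<String, Object> fields) {
        docs.put(id, new HashMap<>(fields));
    }

    // Update: change only the listed fields; returns false if the document is missing.
    boolean update(String id, Map<String, Object> changedFields) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            return false;
        }
        existing.putAll(changedFields);
        return true;
    }

    // Get: a copy of the document's fields, or null if the document is missing.
    Map<String, Object> get(String id) {
        Map<String, Object> fields = docs.get(id);
        return fields == null ? null : new HashMap<>(fields);
    }

    // Remove: returns true if a document was actually deleted.
    boolean remove(String id) {
        return docs.remove(id) != null;
    }
}
```

After a put with `title` and `year`, an update carrying only `year` leaves `title` untouched, which is why partial updates avoid resending unchanged fields.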
Document Serialization
Vespa supports multiple serialization formats:
JSON Format
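In the JSON format, a put operation pairs the document ID with a `fields` object holding the field values. The sketch below renders that shape by hand for string fields only. `JsonPutSketch` is a hypothetical helper, its escaping covers only quotes and backslashes, and real code would use a JSON library or Vespa's own serializers instead.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that renders a put operation for string-valued fields.
final class JsonPutSketch {
    // Escapes only backslashes and double quotes; a JSON library handles far more.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Produces {"put":"<id>","fields":{"<name>":"<value>",...}}.
    static String toPutJson(String documentId, Map<String, String> fields) {
        StringJoiner body = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return "{\"put\":" + quote(documentId) + ",\"fields\":" + body + "}";
    }
}
```

Calling `toPutJson("id:mynamespace:music::1", Map.of("title", "Kind of Blue"))` returns `{"put":"id:mynamespace:music::1","fields":{"title":"Kind of Blue"}}`.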
Document Inheritance
Document types support inheritance, so fields shared by related types can be defined once in a parent type and reused by its children.
Best Practices
Choose IDs Carefully
Use meaningful, stable identifiers that won’t change
Plan Field Types
Select appropriate data types for your use case
Use Inheritance
Share common fields across related document types
Partial Updates
Update only changed fields for better performance
Document Module Reference
The document module contains the core document implementation:
- Module: document
- Language: Java and C++
- Key Classes:
  - Document: Main document class
  - DocumentType: Document type definition
  - Field: Field definition
  - DocumentId: Document identifier
  - FieldValue: Field value abstraction
Next Steps
Schemas
Define document structures in schema files
Document Operations
Use the Document API
Indexing
Learn about indexing fields