Tensors are Vespa’s native representation of multi-dimensional arrays. They’re essential for machine learning, semantic search, and advanced ranking operations.
What are Tensors?
A tensor is a multi-dimensional array that can be used in computations. In Vespa:
Tensors have named dimensions
Each dimension can be sparse (mapped) or dense (indexed)
Cells contain scalar values (float, double, int8, bfloat16)
Operations are optimized for machine learning workloads
Vespa’s tensor implementation is available in both Java (vespajlib) and C++ (eval) for use throughout the system.
Tensor Implementation
Here’s the core Tensor interface from the Java implementation:
package com.yahoo.tensor;

/**
 * A multidimensional array which can be used in computations.
 *
 * A tensor consists of a set of dimension names and a set of cells
 * containing scalar values. Each cell is identified by its address,
 * which consists of a set of dimension-label pairs.
 */
public interface Tensor {

    /** Returns the type of this tensor */
    TensorType type();

    /** Returns whether this has any cells */
    default boolean isEmpty() { return size() == 0; }

    /** Returns the number of cells in this tensor */
    default long size() { return sizeAsInt(); }

    /** Returns the value of a cell, or 0.0 if this cell does not exist */
    double get(TensorAddress address);

    /** Returns true if this cell exists */
    boolean has(TensorAddress address);

    /** Returns the cells of this in some undefined order */
    Iterator<Cell> cellIterator();
}
From vespajlib/src/main/java/com/yahoo/tensor/Tensor.java:54
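The contract above can be illustrated with a minimal sketch in Python. This is a hypothetical stand-in for the Java class, not Vespa code: cells are keyed by an address (sorted dimension-label pairs), and get() returns 0.0 for absent cells, mirroring the interface's documented behavior.

```python
# Minimal sketch of the Tensor contract: cells keyed by an address
# (a tuple of (dimension, label) pairs), get() returning 0.0 for
# absent cells. Illustration only, not Vespa's implementation.
class SparseTensor:
    def __init__(self, cells):
        # cells: dict mapping address tuples to scalar values
        self.cells = dict(cells)

    def size(self):
        return len(self.cells)

    def is_empty(self):
        return self.size() == 0

    def get(self, address):
        # 0.0 if the cell does not exist, as in the Java interface
        return self.cells.get(address, 0.0)

    def has(self, address):
        return address in self.cells

    def cell_iterator(self):
        # cells in some undefined order
        return iter(self.cells.items())

t = SparseTensor({(("item", "a"), ("user", "alice")): 1.5})
print(t.get((("item", "a"), ("user", "alice"))))  # 1.5
print(t.get((("item", "a"), ("user", "bob"))))    # 0.0
```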
Tensor Types
Dense Tensors (Indexed)
Dense tensors have dimensions with integer indices:

tensor<float>(x[768])

Dense tensors are stored contiguously in memory, making them efficient for embeddings and neural network layers.
Sparse Tensors (Mapped)
Sparse tensors have dimensions with string labels:
tensor<float>(user{}, item{})
Sparse tensors store only the cells that are explicitly present, making them efficient for very large, sparse data.
Mixed Tensors
Mixed tensors combine indexed and mapped dimensions:
tensor<float>(user{}, feature[100])
Useful for per-user embeddings or similar use cases.
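As a mental model, the three variants can be pictured in plain Python (this is an illustration of the data shapes, not Vespa's actual storage layout):

```python
# Illustrative Python picture of the three tensor kinds
# (not Vespa's actual storage layout).

# Dense tensor<float>(x[4]): values stored contiguously, index implicit
dense = [0.1, 0.2, 0.3, 0.4]

# Sparse tensor<float>(user{}, item{}): only existing cells are stored
sparse = {("alice", "item_1"): 2.0, ("bob", "item_9"): 1.0}

# Mixed tensor<float>(user{}, feature[4]): one dense block per mapped label
mixed = {
    "alice": [0.1, 0.2, 0.3, 0.4],
    "bob":   [0.5, 0.6, 0.7, 0.8],
}

print(len(dense), len(sparse), len(mixed["alice"]))
```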
Tensor Cell Types
Control memory usage and precision:
float: 32-bit floating point (default)
double: 64-bit floating point
bfloat16: 16-bit brain float (ML-optimized)
int8: 8-bit integer (quantized)
Example with different cell types:
tensor<bfloat16>(x[768])  // Half the memory of float
tensor<int8>(x[768])      // 1/4 the memory of float
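To see where the int8 savings come from, here is a sketch of quantizing a float vector to int8 with a simple symmetric scale. This is one common quantization scheme, shown for illustration; it is not necessarily how any particular Vespa embedder produces int8 tensors.

```python
def quantize_int8(values):
    """Symmetric linear quantization of floats into the int8 range [-127, 127]."""
    scale = (max(abs(v) for v in values) / 127.0) or 1.0  # avoid zero scale
    return [round(v / scale) for v in values], scale

vec = [0.12, -0.45, 0.78]
q, scale = quantize_int8(vec)
dequant = [x * scale for x in q]  # approximate reconstruction

# float32 stores 4 bytes per cell, int8 stores 1
print(len(vec) * 4, len(q) * 1)  # 12 3
```

The reconstruction error per cell is bounded by the scale, which is the precision/memory trade-off the cell types expose.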
Using Tensors in Schemas
Embedding Fields
Store vector embeddings as tensor fields:
schema article {
    document article {
        field title type string {
            indexing: index | summary
        }
        field title_embedding type tensor<float>(x[768]) {
            indexing: attribute
        }
        field body_embedding type tensor<float>(x[768]) {
            indexing: attribute | index
            attribute {
                distance-metric: angular
            }
            index {
                hnsw {
                    max-links-per-node: 16
                    neighbors-to-explore-at-insert: 200
                }
            }
        }
    }
}
Based on tensor fields in msmarco.sd:27-49
Distance Metrics
When indexing tensors for nearest neighbor search:
Cosine distance (angular separation):

attribute {
    distance-metric: angular
}

Best for normalized embeddings.

L2 distance:

attribute {
    distance-metric: euclidean
}

Standard Euclidean distance.

Negative dot product:

attribute {
    distance-metric: dotproduct
}

For maximum inner product search.

Hamming distance:

attribute {
    distance-metric: hamming
}

For binary vectors.
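The four metrics correspond to standard formulas. A plain-Python sketch of the math (Vespa computes these internally; this is only to make the definitions concrete):

```python
import math

def angular(a, b):
    """Angle between vectors; small for near-parallel (normalized) embeddings."""
    cos = sum(x * y for x, y in zip(a, b)) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    return math.acos(max(-1.0, min(1.0, cos)))

def euclidean(a, b):
    """Standard L2 distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_dotproduct(a, b):
    """Negative dot product: smaller distance means larger inner product."""
    return -sum(x * y for x, y in zip(a, b))

def hamming(a, b):
    """Number of differing bits, for binary vectors packed into integers."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

print(euclidean([0.0, 0.0], [3.0, 4.0]))  # 5.0
print(hamming([0b1010], [0b0110]))        # 2
```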
Tensor Operations
Vespa provides rich tensor operations available from the Java Tensor class:
import com.yahoo.tensor.functions.*;

public interface Tensor {

    // Arithmetic operations
    Tensor map(DoubleUnaryOperator mapper);
    Tensor join(Tensor other, DoubleBinaryOperator combiner);
    Tensor reduce(Reduce.Aggregator aggregator, String... dimensions);

    // Linear algebra
    Tensor matmul(Tensor other, String dimension);

    // Tensor manipulation
    Tensor rename(String fromDimension, String toDimension);
    Tensor concat(Tensor other, String dimension);
}
From Tensor.java:1-23
Common Operations
Apply operations to each cell:

tensor1 + tensor2    // Addition
tensor1 * tensor2    // Multiplication
tensor1 - tensor2    // Subtraction
tensor1 / tensor2    // Division
pow(tensor1, 2)      // Power
exp(tensor1)         // Exponential

Aggregate across dimensions:

sum(tensor, dim)     // Sum
max(tensor, dim)     // Maximum
min(tensor, dim)     // Minimum
avg(tensor, dim)     // Average
count(tensor, dim)   // Count

Linear algebra:

sum(tensor1 * tensor2)           // Dot product (element-wise multiply, then sum)
matmul(tensor1, tensor2, dim)    // Matrix multiplication

Advanced operations:

concat(tensor1, tensor2, dim)    // Concatenation
rename(tensor, from, to)         // Rename dimension
expand(tensor, dim)              // Add dimension
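These primitives compose: a dot product, for example, is a join (element-wise multiply) followed by a reduce (sum). A sketch of the map/join/reduce semantics over plain Python lists, assuming a single dense dimension (illustration only; Vespa evaluates these natively):

```python
# map / join / reduce semantics sketched over a single dense dimension.

def tensor_map(t, fn):
    """Apply fn to every cell."""
    return [fn(v) for v in t]

def tensor_join(t1, t2, fn):
    """Combine cells with matching addresses."""
    return [fn(a, b) for a, b in zip(t1, t2)]

def tensor_reduce(t, fn):
    """Fold all cells of a dimension into one value."""
    out = t[0]
    for v in t[1:]:
        out = fn(out, v)
    return out

t1, t2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
product = tensor_join(t1, t2, lambda a, b: a * b)   # element-wise *
dot = tensor_reduce(product, lambda a, b: a + b)    # sum -> dot product
print(dot)  # 32.0
```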
Tensors in Ranking
Semantic Search Example
Compute similarity between query and document embeddings:
rank-profile semantic {
    function similarity() {
        expression: sum(query(query_embedding) * attribute(doc_embedding))
    }
    first-phase {
        expression: similarity()
    }
}
From msmarco.sd:73-88
The * operator performs element-wise multiplication, and sum() reduces to a scalar score.
Multi-Field Embeddings
Combine embeddings from multiple fields:
rank-profile multi_field_semantic {
    function title_similarity() {
        expression: sum(query(query_embedding) * attribute(title_embedding))
    }
    function body_similarity() {
        expression: sum(query(query_embedding) * attribute(body_embedding))
    }
    first-phase {
        expression: 2.0 * title_similarity() + body_similarity()
    }
}
From msmarco.sd:73-88
Tensor Literal Syntax
Create tensors directly in expressions:
Dense vector:

tensor<float>(x[3]):[1.0, 2.0, 3.0]

Sparse:

tensor<float>(key{}):{a: 1.0, b: 2.0}

Matrix:

tensor<float>(x[2],y[2]):[[1.0, 2.0], [3.0, 4.0]]

Mixed:

tensor<float>(key{},x[2]):{a: [1.0, 2.0], b: [3.0, 4.0]}
Tensor Use Cases
1. Semantic Search
Store and query document embeddings:
field embedding type tensor<float>(x[768]) {
    indexing: attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}
Query:
select * from documents
where {targetHits: 10}nearestNeighbor(embedding, query_embedding)
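A sketch of assembling such a request for Vespa's HTTP search API, passing the query vector as an input parameter. The field, parameter, and profile names here are assumptions carried over from the examples above:

```python
import json

# Sketch of a request body for Vespa's /search/ endpoint; field and
# parameter names (embedding, query_embedding, semantic) are assumed
# from the schema examples above.
query_vector = [0.12, -0.45, 0.78]  # normally the full 768 values

body = {
    "yql": ("select * from documents "
            "where {targetHits: 10}nearestNeighbor(embedding, query_embedding)"),
    "input.query(query_embedding)": query_vector,
    "ranking": "semantic",
}
print(json.dumps(body)[:60])
```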
2. Neural Network Inference
Store model weights as tensors:
function neural_net() {
    expression {
        # relu(x*W + b); the shared dimension is assumed here to be named x
        relu(matmul(attribute(input_features), constant(weights_layer1), x) + constant(bias_layer1))
    }
}
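The arithmetic of one such layer, relu(x·W + b), spelled out in plain Python for illustration (the dimension sizes and values are made up):

```python
def neural_layer(x, W, b):
    """relu(x . W + b) for a 1-D input x and a weight matrix W with len(x) rows."""
    out = []
    for j in range(len(b)):
        # dot product of the input with column j of W, plus the bias
        s = sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        out.append(max(0.0, s))  # relu clamps negatives to zero
    return out

x = [1.0, 2.0]
W = [[1.0, -1.0],
     [0.5,  0.5]]
b = [0.0, -1.0]
print(neural_layer(x, W, b))  # [2.0, 0.0]
```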
3. Personalized Ranking
Per-user feature vectors:
field user_preferences type tensor<float>(category[50]) {
    indexing: attribute
}
field product_features type tensor<float>(category[50]) {
    indexing: attribute
}
rank-profile personalized {
    first-phase {
        expression: sum(query(user_prefs) * attribute(product_features))
    }
}
4. Collaborative Filtering
Sparse user-item matrices:
field interactions type tensor<float>(user{}, item{}) {
    indexing: attribute
}
Optimization Tips
Choose the right type: use dense tensors for embeddings, sparse for categorical data.
Use appropriate precision: consider bfloat16 or int8 for large tensors.
Index for ANN: add an HNSW index for nearest neighbor queries.
Minimize dimensions: smaller embeddings mean faster computations.
Memory Usage
Tensor memory depends on cell type and dimensions:
tensor<float>(x[768]) = 768 * 4 bytes = 3 KB
tensor<bfloat16>(x[768]) = 768 * 2 bytes = 1.5 KB
tensor<int8>(x[768]) = 768 * 1 byte = 768 bytes
For 1M documents:
float: 3 GB
bfloat16: 1.5 GB
int8: 768 MB
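A quick sanity check of those numbers (using 1 KB = 1024 bytes; the per-corpus figures in the text round 3.072 GB to 3 GB):

```python
def tensor_bytes(dimension_size, bytes_per_cell):
    """Memory for a single dense tensor with one indexed dimension."""
    return dimension_size * bytes_per_cell

CELL_BYTES = {"double": 8, "float": 4, "bfloat16": 2, "int8": 1}

for cell_type in ("float", "bfloat16", "int8"):
    per_doc = tensor_bytes(768, CELL_BYTES[cell_type])
    total = per_doc * 1_000_000  # 1M documents
    print(f"{cell_type}: {per_doc} bytes/doc, {total / 1e9:.3f} GB for 1M docs")
```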
Feeding Tensor Data
Send tensor values when feeding documents:
{
    "put": "id:article:article::123",
    "fields": {
        "title": "Understanding Vespa Tensors",
        "embedding": {
            "values": [0.12, -0.45, 0.78, ...]
        }
    }
}
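Assembling such a feed operation programmatically might look like this sketch (document id and field names taken from the example above; the embedding is truncated for illustration):

```python
import json

# Sketch of building a feed operation like the example above.
embedding = [0.12, -0.45, 0.78]  # normally the full 768 values

doc = {
    "put": "id:article:article::123",
    "fields": {
        "title": "Understanding Vespa Tensors",
        "embedding": {"values": embedding},
    },
}
print(json.dumps(doc, indent=2))
```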
Tensor Evaluation Engine
The evaluation engine optimizes tensor operations:
Module: eval
Compiles tensor expressions into efficient code
Supports multiple backends (CPU, GPU)
Automatic optimization based on tensor types
The eval module provides efficient evaluation of ranking expressions and tensor operations on content nodes.
Advanced Tensor Operations
Matrix Multiplication
function matrix_mult() {
    expression: sum(attribute(matrix1) * attribute(matrix2), common_dimension)
}
Normalization
function l2_normalize() {
    expression: attribute(vector) / sqrt(sum(pow(attribute(vector), 2)))
}
Softmax
function softmax() {
    expression {
        exp(attribute(logits)) / sum(exp(attribute(logits)))
    }
}
Cosine Similarity
function cosine_similarity() {
    expression {
        sum(query(q) * attribute(d)) /
        (sqrt(sum(pow(query(q), 2))) * sqrt(sum(pow(attribute(d), 2))))
    }
}
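The three expressions above reduce to familiar formulas; a plain-Python check of the math (not Vespa code):

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm: v / sqrt(sum(v_i^2))."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(logits):
    """exp(x_i) / sum_j exp(x_j): a probability distribution over logits."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_similarity(q, d):
    """Dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(q, d))
    return dot / (math.sqrt(sum(a * a for a in q)) *
                  math.sqrt(sum(b * b for b in d)))

v = l2_normalize([3.0, 4.0])                   # [0.6, 0.8]
p = softmax([0.0, 0.0])                        # [0.5, 0.5]
c = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # 1.0
print(v, p, c)
```

Note that for L2-normalized vectors, cosine similarity reduces to the plain dot product, which is why the angular distance metric pairs well with normalized embeddings.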
Best Practices
Start simple: begin with basic embeddings and add complexity as needed.
Profile memory: monitor tensor memory usage in production.
Test precision: validate that bfloat16/int8 maintains quality.
Index large tensors: use HNSW for tensors with >100 dimensions.
Next Steps
Ranking: use tensors in ranking expressions.
Schemas: define tensor fields.
Search: query tensor fields.