Tensors are Vespa’s native multi-dimensional array type. They’re essential for machine learning, semantic search, and advanced ranking operations.

What are Tensors?

A tensor is a multi-dimensional array that can be used in computations. In Vespa:
  • Tensors have named dimensions
  • Each dimension can be sparse (mapped) or dense (indexed)
  • Cells contain scalar values (float, double, int8, bfloat16)
  • Operations are optimized for machine learning workloads
Vespa’s tensor implementation is available in both Java (vespajlib) and C++ (eval) for use throughout the system.
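For example, a sparse tensor over user and item dimensions can be written as a literal, where each cell’s address is a set of dimension-label pairs (the labels and values here are illustrative):

```
tensor<float>(user{},item{}):{
    {user:alice, item:tv}:    3.0,
    {user:alice, item:radio}: 1.0
}
```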

Tensor Implementation

Here’s the core Tensor interface from the Java implementation:
package com.yahoo.tensor;

import java.util.Iterator;

/**
 * A multidimensional array which can be used in computations.
 * 
 * A tensor consists of a set of dimension names and a set of cells 
 * containing scalar values. Each cell is identified by its address, 
 * which consists of a set of dimension-label pairs.
 */
public interface Tensor {
    /** Returns the type of this tensor */
    TensorType type();
    
    /** Returns whether this has any cells */
    default boolean isEmpty() { return size() == 0; }
    
    /** Returns the number of cells in this tensor */
    long size();
    
    /** Returns the value of a cell, or 0.0 if this cell does not exist */
    double get(TensorAddress address);
    
    /** Returns true if this cell exists */
    boolean has(TensorAddress address);
    
    /** Returns the cells of this in some undefined order */
    Iterator<Cell> cellIterator();
}
From vespajlib/src/main/java/com/yahoo/tensor/Tensor.java:54

Tensor Types

Dense Tensors (Indexed)

Dense tensors have dimensions with integer indices:
tensor<float>(x[768])
Dense tensors are stored contiguously in memory, which makes them efficient for embeddings and neural network layers.

Sparse Tensors (Mapped)

Sparse tensors have dimensions with string labels:
tensor<float>(user{}, item{})
Sparse tensors store only non-zero cells, which makes them efficient for very large, sparse data.

Mixed Tensors

Mixed tensors combine indexed and mapped dimensions:
tensor<float>(user{}, feature[100])
Useful for per-user embeddings or similar use cases.

Tensor Cell Types

Control memory usage and precision:

float

32-bit floating point (default)

double

64-bit floating point

bfloat16

16-bit brain float (ML optimized)

int8

8-bit integer (quantized)
Example with different cell types:
tensor<bfloat16>(x[768])  // Half the memory of float
tensor<int8>(x[768])      // 1/4 the memory of float

Using Tensors in Schemas

Embedding Fields

Store vector embeddings as tensor fields:
schema article {
    document article {
        field title type string {
            indexing: index | summary
        }
        
        field title_embedding type tensor<float>(x[768]) {
            indexing: attribute
        }
        
        field body_embedding type tensor<float>(x[768]) {
            indexing: attribute | index
            attribute {
                distance-metric: angular
            }
            index {
                hnsw {
                    max-links-per-node: 16
                    neighbors-to-explore-at-insert: 200
                }
            }
        }
    }
}
Based on tensor fields in msmarco.sd:27-49

Distance Metrics

When indexing tensors for nearest neighbor search:
Cosine distance (angular separation):
attribute {
    distance-metric: angular
}
Best for normalized embeddings.
L2 distance:
attribute {
    distance-metric: euclidean
}
Standard Euclidean distance.
Negative dot product:
attribute {
    distance-metric: dotproduct
}
For maximum inner product search.
Hamming distance:
attribute {
    distance-metric: hamming
}
For binary vectors.

Tensor Operations

Vespa provides rich tensor operations on the Java Tensor interface:
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

import com.yahoo.tensor.functions.*;

public interface Tensor {
    // Arithmetic operations
    Tensor map(DoubleUnaryOperator mapper);
    Tensor join(Tensor other, DoubleBinaryOperator combiner);
    Tensor reduce(Reduce.Aggregator aggregator, String... dimensions);
    
    // Linear algebra
    Tensor matmul(Tensor other, String dimension);
    
    // Tensor manipulation
    Tensor rename(String fromDimension, String toDimension);
    Tensor concat(Tensor other, String dimension);
}
From Tensor.java:1-23

Common Operations

Apply operations to each cell:
tensor1 + tensor2           // Addition
tensor1 * tensor2           // Multiplication
tensor1 - tensor2           // Subtraction
tensor1 / tensor2           // Division
pow(tensor1, 2)            // Power
exp(tensor1)               // Exponential
Aggregate across dimensions:
sum(tensor, dim)           // Sum
max(tensor, dim)           // Maximum
min(tensor, dim)           // Minimum
avg(tensor, dim)           // Average
count(tensor, dim)         // Count
Linear algebra:
tensor1 * tensor2          // Element-wise product; wrap in sum() for a dot product
matmul(tensor1, tensor2, dim)  // Matrix multiplication
Advanced operations:
concat(tensor1, tensor2, dim)  // Concatenation
rename(tensor, from, to)       // Rename dimension
expand(tensor, dim)            // Add dimension

Tensors in Ranking

Semantic Search Example

Compute similarity between query and document embeddings:
rank-profile semantic {
    function similarity() {
        expression: sum(query(query_embedding) * attribute(doc_embedding))
    }
    
    first-phase {
        expression: similarity()
    }
}
From msmarco.sd:73-88. The * operator performs element-wise multiplication, and sum() reduces the result to a scalar score.
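In plain Python, the same multiply-then-sum is just a dot product:

```python
# Element-wise multiply two equal-length vectors, then sum the products:
# the same computation as sum(query(q) * attribute(d)) over a shared dimension.
def similarity(query_embedding, doc_embedding):
    return sum(q * d for q, d in zip(query_embedding, doc_embedding))

print(similarity([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))  # 3.0
```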

Multi-Field Embeddings

Combine embeddings from multiple fields:
rank-profile multi_field_semantic {
    function title_similarity() {
        expression: sum(query(query_embedding) * attribute(title_embedding))
    }
    
    function body_similarity() {
        expression: sum(query(query_embedding) * attribute(body_embedding))
    }
    
    first-phase {
        expression: 2.0 * title_similarity() + body_similarity()
    }
}
From msmarco.sd:73-88

Tensor Literal Syntax

Create tensors directly in expressions:
tensor<float>(x[3]):[1.0, 2.0, 3.0]
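Sparse and mixed tensors have literal forms as well; the labels below are illustrative:

```
tensor<float>(user{}):{alice:1.0, bob:2.0}
tensor<float>(user{},feature[2]):{alice:[0.1, 0.2], bob:[0.3, 0.4]}
```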

Tensor Use Cases

1. Vector Search

Store and query document embeddings:
field embedding type tensor<float>(x[768]) {
    indexing: attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}
Query:
select * from documents 
where {targetHits:10}nearestNeighbor(embedding, query_embedding)
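A complete query request might look like the sketch below, assuming the rank profile declares a query(query_embedding) input of matching type (the vector is shortened for readability):

```json
{
  "yql": "select * from documents where {targetHits:10}nearestNeighbor(embedding, query_embedding)",
  "input.query(query_embedding)": [0.12, -0.45, 0.78]
}
```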

2. Neural Network Inference

Store model weights as tensors:
function neural_net() {
    expression {
        # "input" names the dimension joined by matmul
        relu(matmul(attribute(input_features), constant(weights_layer1), input) +
             constant(bias_layer1))
    }
}
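A fully connected layer with a ReLU activation can be sketched in plain Python (tiny illustrative shapes):

```python
def dense_layer(x, weights, bias):
    """One fully connected layer: y_j = relu(sum_i x_i * W[i][j] + b_j)."""
    return [
        max(0.0, sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j])
        for j in range(len(bias))
    ]

print(dense_layer([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, -3.0]))  # [1.0, 0.0]
```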

3. Personalized Ranking

Per-user feature vectors:
field user_preferences type tensor<float>(category[50]) {
    indexing: attribute
}

field product_features type tensor<float>(category[50]) {
    indexing: attribute
}

rank-profile personalized {
    first-phase {
        expression: sum(query(user_prefs) * attribute(product_features))
    }
}

4. Collaborative Filtering

Sparse user-item matrices:
field interactions type tensor<float>(user{}, item{}) {
    indexing: attribute
}
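Mapped dimensions are fed as a list of cells, each with an explicit address; the document id and labels below are illustrative:

```json
{
  "put": "id:mynamespace:interactions::alice",
  "fields": {
    "interactions": {
      "cells": [
        { "address": { "user": "alice", "item": "tv" }, "value": 3.0 },
        { "address": { "user": "alice", "item": "radio" }, "value": 1.0 }
      ]
    }
  }
}
```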

Tensor Performance

Optimization Tips

Choose Right Type

Use dense for embeddings, sparse for categorical data

Use Appropriate Precision

Consider bfloat16 or int8 for large tensors

Index for ANN

Add HNSW index for nearest neighbor queries

Minimize Dimension

Smaller embeddings = faster computations

Memory Usage

Tensor memory depends on cell type and dimensions:
tensor<float>(x[768])    = 768 * 4 bytes  = 3 KB
tensor<bfloat16>(x[768]) = 768 * 2 bytes  = 1.5 KB
tensor<int8>(x[768])     = 768 * 1 byte   = 768 bytes
For 1M documents:
  • float: 3 GB
  • bfloat16: 1.5 GB
  • int8: 768 MB
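The per-document figures above follow from cells times bytes per cell; a minimal sketch (actual usage also includes per-attribute overhead):

```python
# Bytes per cell for each supported tensor cell type.
CELL_BYTES = {"double": 8, "float": 4, "bfloat16": 2, "int8": 1}

def tensor_bytes(cell_type, sizes):
    """Bytes needed for a dense tensor with the given dimension sizes."""
    cells = 1
    for size in sizes:
        cells *= size
    return cells * CELL_BYTES[cell_type]

for cell_type in ("float", "bfloat16", "int8"):
    print(cell_type, tensor_bytes(cell_type, [768]))
```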

Feeding Tensor Data

Send tensor values when feeding documents:
{
  "put": "id:article:article::123",
  "fields": {
    "title": "Understanding Vespa Tensors",
    "embedding": {
      "values": [0.12, -0.45, 0.78, ...]
    }
  }
}

Tensor Evaluation Engine

The evaluation engine optimizes tensor operations:
  • Module: eval
  • Compiles tensor expressions into efficient code
  • Supports multiple backends (CPU, GPU)
  • Automatic optimization based on tensor types
The eval module provides efficient evaluation of ranking expressions and tensor operations on content nodes.

Advanced Tensor Operations

Matrix Multiplication

function matrix_mult() {
    expression: sum(attribute(matrix1) * attribute(matrix2), common_dimension)
}

Normalization

function l2_normalize() {
    expression: attribute(vector) / sqrt(sum(pow(attribute(vector), 2)))
}
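The same normalization in plain Python, for reference:

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length: each element divided by the L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

print(l2_normalize([3.0, 4.0]))  # [0.6, 0.8]
```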

Softmax

function softmax() {
    expression {
        exp(attribute(logits)) / sum(exp(attribute(logits)))
    }
}
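A plain Python sketch of the same computation (production implementations usually subtract max(logits) first for numerical stability):

```python
import math

def softmax(logits):
    """Exponentiate each value and divide by the sum; results sum to 1."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(sum(probs))  # ~1.0
```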

Cosine Similarity

function cosine_similarity() {
    expression {
        sum(query(q) * attribute(d)) / 
        (sqrt(sum(pow(query(q), 2))) * sqrt(sum(pow(attribute(d), 2))))
    }
}
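The equivalent computation in plain Python: the dot product divided by the product of the two L2 norms.

```python
import math

def cosine_similarity(q, d):
    """Dot product of q and d, normalized by both vector lengths."""
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    return dot / (norm_q * norm_d)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```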

Best Practices

Start Simple

Begin with basic embeddings, add complexity as needed

Profile Memory

Monitor tensor memory usage in production

Test Precision

Validate that bfloat16/int8 maintains quality

Index Large Tensors

Use HNSW for tensors with >100 dimensions

Next Steps

Ranking

Use tensors in ranking expressions

Schemas

Define tensor fields

Search

Query tensor fields
