Overview
With multi-vector search, you can:
- Store a matrix of embeddings per document (e.g., one per token)
- Use MaxSim (maximum similarity) for late-interaction retrieval
- Support various matrix value types (f32, f16, f8, u8, i8)
- Optimize with quantization and sketch-based indexing
- Control candidate selection for better speed/accuracy tradeoff
Schema Setup
Define a matrix field with a multiVectorIndex():
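A sketch of the shape this takes, assuming the JavaScript-style SDK implied by the camelCase names in these docs (the f32Matrix helper and its options are illustrative assumptions here, not confirmed API):

```
// Illustrative pseudocode: declare a matrix field and attach a
// multi-vector index to it. "dimension" is the length of each row vector;
// the number of rows may vary per document.
const schema = {
  title: text(),
  token_embeddings: f32Matrix({ dimension: 128 }).index(multiVectorIndex()),
};
```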
Matrix Value Types
TopK supports multiple matrix value types:
- f32 - 32-bit floating point (standard precision)
- f16 - 16-bit floating point (half precision)
- f8 - 8-bit floating point
- u8 - 8-bit unsigned integer
- i8 - 8-bit signed integer
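The narrower integer types trade precision for memory. To make the tradeoff concrete, here is a minimal scalar-quantization sketch that maps f32 values onto u8 codes; this illustrates the general idea only, and TopK's internal quantization scheme may differ:

```typescript
// Map f32 values in [min, max] onto u8 codes (0..255), storing the
// min/scale needed to approximately reconstruct the originals.
function quantizeU8(values: number[]): { codes: Uint8Array; min: number; scale: number } {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const scale = (max - min) / 255 || 1; // avoid division by zero for constant vectors
  const codes = new Uint8Array(values.map((v) => Math.round((v - min) / scale)));
  return { codes, min, scale };
}

function dequantizeU8(q: { codes: Uint8Array; min: number; scale: number }): number[] {
  return Array.from(q.codes, (c) => q.min + c * q.scale);
}

const original = [0.12, -0.5, 0.98, 0.0];
const q = quantizeU8(original);
const restored = dequantizeU8(q);
// Each restored value is within one quantization step of the original,
// while the codes use 1 byte per value instead of 4.
```
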
Index Options
Customize the multi-vector index behavior:

MaxSim is currently the only supported metric for multi-vector search. It computes the maximum similarity between each query vector and all document vectors, then sums these maximum similarities.
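The MaxSim computation described above can be written out directly. This is a self-contained sketch using dot product as the similarity (an assumption for illustration; the SDK controls the underlying similarity internally):

```typescript
type Matrix = number[][];

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// MaxSim: for each query vector, take its maximum similarity against
// all document vectors, then sum those maxima.
function maxSim(query: Matrix, doc: Matrix): number {
  let score = 0;
  for (const q of query) {
    let best = -Infinity;
    for (const d of doc) best = Math.max(best, dot(q, d));
    score += best;
  }
  return score;
}

// Two query token vectors against three document token vectors:
const query: Matrix = [[1, 0], [0, 1]];
const doc: Matrix = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]];
// maxSim = max(0.9, 0.2, 0.5) + max(0.1, 0.8, 0.5) = 0.9 + 0.8 = 1.7
```
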
Inserting Documents with Matrices
Provide embeddings as a matrix (array of arrays):

Querying with Multi-Vector Search
Use fn.multiVectorDistance() to compute MaxSim scores:
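The end-to-end flow might look like the following pseudocode. fn.multiVectorDistance is the function named in these docs; the upsert and query-builder shapes around it are assumptions for illustration, not confirmed API:

```
// Illustrative pseudocode, not confirmed API.

// Insert: one matrix (array of arrays) per document; row counts may vary.
await collection.upsert([
  { _id: "doc1", token_embeddings: [[0.1, 0.2], [0.3, 0.4]] },
]);

// Query: score documents by MaxSim between the query's vectors and each
// document's matrix, and return the 10 best.
const results = await collection.query(
  select({ score: fn.multiVectorDistance("token_embeddings", queryVectors) })
    .topk(field("score"), 10)
);
```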
Controlling Candidates
Limit the number of candidate vectors considered for better performance:

Using Explicit Matrix Types

For non-f32 matrix types, use the explicit matrix constructor:

Combining with Filters

Apply filters before multi-vector search:

Use Cases
Multi-vector search is ideal for:
- Token-level embeddings: Store embeddings for each token in a document
- ColBERT-style models: Late interaction models that benefit from MaxSim
- Multi-representation documents: Documents with multiple semantic aspects
- Fine-grained matching: Match specific parts of documents rather than whole-document embeddings
The matrix dimension parameter specifies the length of each individual vector (number of columns), not the number of vectors. The number of vectors (rows) can vary per document.
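To make the distinction concrete, here is a small check (written for this doc, not part of the SDK) showing that matrices with different row counts are all valid for the same dimension, as long as every row has that length:

```typescript
// A matrix fits a dimension when every row vector has exactly that length.
function matchesDimension(matrix: number[][], dimension: number): boolean {
  return matrix.every((row) => row.length === dimension);
}

const docA = [[1, 2, 3], [4, 5, 6]];            // 2 vectors, dimension 3
const docB = [[7, 8, 9], [1, 1, 1], [2, 2, 2]]; // 3 vectors, same dimension
// Both are valid for a dimension-3 matrix field despite differing row counts.
```
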
Related Concepts
- Vector Search - Single vector per document
- True Hybrid Search - Combine with other search types
- Reranking - Improve multi-vector search results