Overview
The distance module provides optimized implementations of common distance and similarity metrics for vector operations. All functions are inlined for maximum performance.

Main Dispatch Function
compute
Main dispatch function called by HNSW for all distance computations.
- `a`: First vector.
- `b`: Second vector (must have same dimension as `a`).
- `metric`: Distance metric to use: `Cosine`, `Euclidean`, or `DotProduct`.
- Returns: The computed distance/similarity value.
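The dispatch shape described above can be sketched as follows. The enum name, its variants, and the function signatures are assumptions based on this doc, not the module's actual code:

```rust
// Sketch of the dispatch pattern; names and signatures are assumed.
#[derive(Clone, Copy, Debug)]
pub enum DistanceMetric {
    Cosine,
    Euclidean,
    DotProduct,
}

#[inline]
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Dispatch to the requested metric. `a` and `b` must have equal length.
#[inline(always)]
pub fn compute(a: &[f32], b: &[f32], metric: DistanceMetric) -> f32 {
    debug_assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    match metric {
        DistanceMetric::DotProduct => dot(a, b),
        DistanceMetric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y) * (x - y))
            .sum::<f32>()
            .sqrt(),
        DistanceMetric::Cosine => {
            let norms = dot(a, a).sqrt() * dot(b, b).sqrt();
            if norms == 0.0 { 0.0 } else { dot(a, b) / norms }
        }
    }
}
```

A `match` on a `Copy` enum like this compiles down to a cheap branch, which is why marking the dispatcher `#[inline(always)]` can pay off on a hot path.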
Similarity Metrics
cosine_similarity
Cosine similarity — returns a value in [-1, 1]; higher = more similar. This is the DEFAULT metric (same as the Pinecone default).
- `a`: First vector.
- `b`: Second vector.
- Returns: Cosine similarity in range [-1, 1]:
  - 1.0: Identical direction
  - 0.0: Orthogonal (perpendicular)
  - -1.0: Opposite direction
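The standard formula, dot(a, b) / (|a| · |b|), can be sketched like this. Returning 0.0 for a zero-magnitude input is an assumed convention, not confirmed by this doc:

```rust
/// Cosine similarity sketch: dot(a, b) / (|a| * |b|).
/// Returning 0.0 for a zero-magnitude vector is an assumed convention.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Note that dividing by both norms is what makes the metric ignore magnitude: scaling either input leaves the result unchanged.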
dot_product
Dot product similarity — returns a scalar; higher = more similar. Best for normalized vectors (OpenAI embeddings are already normalized).
- `a`: First vector.
- `b`: Second vector.
- Returns: Dot product value. For normalized vectors, equivalent to cosine similarity but faster.
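A minimal sketch of this metric; the equivalence claim holds because for unit-length inputs the cosine denominator is 1:

```rust
/// Dot product sketch: sum of elementwise products.
/// For unit-length vectors this equals cosine similarity,
/// since |a| * |b| = 1 and the division drops out.
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```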
Distance Metrics
euclidean_distance
Euclidean distance — returns a value in [0, ∞); lower = more similar.
- `a`: First vector.
- `b`: Second vector.
- Returns: Euclidean (L2) distance between the vectors. Zero means identical vectors.
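A sketch of this metric and the squared variant described in the next section. Because the square root is monotonic, comparing squared distances ranks neighbors identically while skipping the `sqrt`:

```rust
/// Euclidean (L2) distance sketch.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    euclidean_distance_squared(a, b).sqrt()
}

/// Squared L2 distance: skips the sqrt. Since sqrt preserves ordering,
/// this is sufficient when only comparing relative distances.
pub fn euclidean_distance_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```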
euclidean_distance_squared
Squared Euclidean distance — avoids the sqrt; used internally for comparisons.
- `a`: First vector.
- `b`: Second vector.
- Returns: Squared Euclidean distance. Faster than `euclidean_distance` when only comparing relative distances.

Utility Functions
normalize
Normalize a vector to unit length (for cosine optimization). Pre-normalizing vectors allows using `dot_product` instead of `cosine_similarity` (faster).
- Vector to normalize.
- Returns: Normalized vector with unit length (magnitude = 1.0).
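A sketch of the normalization step; returning a zero-magnitude input unchanged is an assumed convention, not confirmed by this doc:

```rust
/// Scale a vector to unit length by dividing by its magnitude.
/// Returning a zero-magnitude input unchanged is an assumed convention.
pub fn normalize(v: &[f32]) -> Vec<f32> {
    let mag: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if mag == 0.0 {
        v.to_vec()
    } else {
        v.iter().map(|x| x / mag).collect()
    }
}
```

Normalizing once at insert time means every subsequent query against that vector can use the cheaper dot product.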
to_similarity_score
Convert a distance to a similarity score for consistent API output. All metrics return higher = more similar after this conversion.
- Raw distance value from a metric function.
- The metric that produced the distance value.
- Returns: Similarity score where higher = more similar. For Euclidean, converts to `1.0 / (1.0 + distance)`.

Metric Comparison
Cosine
Best for text embeddings. Range: [-1, 1]. Ignores magnitude.
Euclidean
Best for spatial data. Range: [0, ∞). Considers magnitude.
Dot Product
Fastest for normalized vectors. Equivalent to cosine when vectors are unit length.
Complete Example
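A minimal end-to-end sketch based on the signatures described above; every name here is an assumption drawn from this doc, not the module's actual API:

```rust
// End-to-end sketch; all names assumed from the descriptions above.
#[derive(Clone, Copy)]
pub enum DistanceMetric { Cosine, Euclidean, DotProduct }

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn normalize(v: &[f32]) -> Vec<f32> {
    let mag = dot(v, v).sqrt();
    v.iter().map(|x| x / mag).collect()
}

fn compute(a: &[f32], b: &[f32], metric: DistanceMetric) -> f32 {
    match metric {
        DistanceMetric::DotProduct => dot(a, b),
        DistanceMetric::Cosine => dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt()),
        DistanceMetric::Euclidean => {
            a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
        }
    }
}

/// Map any metric's output to "higher = more similar".
fn to_similarity_score(value: f32, metric: DistanceMetric) -> f32 {
    match metric {
        // Cosine and dot product are already similarities.
        DistanceMetric::Cosine | DistanceMetric::DotProduct => value,
        // Distance -> similarity, as documented: 1.0 / (1.0 + distance).
        DistanceMetric::Euclidean => 1.0 / (1.0 + value),
    }
}

pub fn run_example() -> (f32, f32) {
    let a = normalize(&[3.0, 4.0]);
    let b = normalize(&[4.0, 3.0]);
    // Pre-normalized inputs: dot product equals cosine, but skips the norms.
    let sim = compute(&a, &b, DistanceMetric::DotProduct);
    let dist = compute(&a, &b, DistanceMetric::Euclidean);
    (
        to_similarity_score(sim, DistanceMetric::DotProduct),
        to_similarity_score(dist, DistanceMetric::Euclidean),
    )
}
```

Both returned scores are on a "higher = more similar" scale, which is the point of routing every metric through `to_similarity_score`.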
Performance Notes
- All functions are `#[inline]` for zero-cost abstraction
- `compute()` is `#[inline(always)]` for optimal dispatch
- Use `dot_product` instead of `cosine_similarity` for pre-normalized vectors
- Use `euclidean_distance_squared` for comparisons to avoid the sqrt
When to Use Each Metric
Cosine Similarity (Default):
- Text embeddings (e.g., OpenAI, Cohere)
- Semantic similarity
- When direction matters more than magnitude

Euclidean Distance:
- Image embeddings
- Spatial coordinates
- When absolute distance matters

Dot Product:
- Pre-normalized embeddings (OpenAI ada-002)
- Maximum performance on unit vectors
- When you’ve already normalized inputs
See Also
- HNSWIndex - Uses these metrics for similarity search
- Merkle Tree - Cryptographic verification
- Encryption - Vector encryption