
Overview

MetricType defines the distance or similarity function used to compare vectors during search operations. The choice of metric affects search results and performance.
import zvec

print(zvec.MetricType.COSINE)
# Output: MetricType.COSINE

Available Metrics

L2
MetricType
Euclidean distance (L2 norm). Measures the straight-line distance between two vectors in Euclidean space.

Formula: √(Σ(a[i] - b[i])²)
Range: [0, ∞) (0 = identical, larger = more different)
When to use: when magnitude matters, or when embeddings are not normalized to unit length.
field = Field(
    name="embedding",
    dtype=DataType.VECTOR_FP32,
    dim=768,
    metric=MetricType.L2
)
IP
MetricType
Inner Product (dot product). Computes the dot product of two vectors.

Formula: Σ(a[i] × b[i])
Range: (-∞, ∞) (larger = more similar)
When to use: when vectors are already normalized, or when the model outputs IP-optimized embeddings. Faster than cosine for unit-normalized vectors.
field = Field(
    name="embedding",
    dtype=DataType.VECTOR_FP32,
    dim=1536,
    metric=MetricType.IP
)
COSINE
MetricType
Cosine similarity/distance. Measures the cosine of the angle between two vectors, normalized by their magnitudes.

Formula: 1 - (a · b) / (||a|| × ||b||) (as distance)
Range: [0, 2] as distance (0 = identical direction, 2 = opposite)
When to use: when direction matters more than magnitude, e.g. text embeddings and semantic similarity.
field = Field(
    name="text_embedding",
    dtype=DataType.VECTOR_FP32,
    dim=384,
    metric=MetricType.COSINE
)

Metric Properties

All MetricType enum members have these properties:
name
str
The name of the metric as a string.
MetricType.COSINE.name  # "COSINE"
value
int
The internal integer value of the metric.
MetricType.COSINE.value  # 3
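
Because MetricType behaves like a standard Python enum, its members can be iterated and inspected like any other. The sketch below uses a plain `enum.Enum` stand-in so it runs without zvec installed; the integer codes for L2 and IP are assumptions (only COSINE = 3 is documented above), so check the real library for the actual values.

```python
from enum import Enum

# Illustrative stand-in for zvec.MetricType. The L2 and IP values
# are assumed; only COSINE = 3 is taken from the documentation above.
class MetricType(Enum):
    L2 = 1
    IP = 2
    COSINE = 3

print(MetricType.COSINE.name)   # "COSINE"
print(MetricType.COSINE.value)  # 3

# Enum members are iterable, which is handy for validation or display
for metric in MetricType:
    print(metric.name, metric.value)
```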

Usage Examples

Defining Vector Field with Metric

from zvec import Collection, Field, DataType, MetricType

schema = [
    Field(name="id", dtype=DataType.STRING, is_primary=True),
    Field(name="title", dtype=DataType.STRING),
    Field(
        name="text_embedding",
        dtype=DataType.VECTOR_FP32,
        dim=768,
        metric=MetricType.COSINE  # Use cosine similarity
    ),
    Field(
        name="image_embedding",
        dtype=DataType.VECTOR_FP16,
        dim=512,
        metric=MetricType.L2  # Use L2 distance
    )
]

collection = Collection.create(name="multimodal", schema=schema)

Querying with Different Metrics

from zvec import Collection, MetricType

collection = Collection("articles")

# The metric is defined in the schema, so the same query uses the
# appropriate distance function for each vector field

# Query vector field using COSINE metric
results = collection.query(
    vectors={"text_embedding": query_embedding},
    topn=10
)

for doc in results:
    print(f"ID: {doc.id}, Distance: {doc.score:.4f}")

Comparing Metrics

import numpy as np
from zvec import MetricType

# Two example vectors
vec_a = np.array([1.0, 2.0, 3.0])
vec_b = np.array([2.0, 3.0, 4.0])

# L2 distance
l2_dist = np.linalg.norm(vec_a - vec_b)
print(f"L2 distance: {l2_dist:.4f}")  # 1.7321

# Inner product
ip_score = np.dot(vec_a, vec_b)
print(f"Inner product: {ip_score:.4f}")  # 20.0000

# Cosine similarity
cos_sim = np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))
cos_dist = 1 - cos_sim
print(f"Cosine distance: {cos_dist:.4f}")  # 0.0074

Choosing the Right Metric

Decision Guide

COSINE

Best for:
  • Text embeddings from language models (BERT, GPT, etc.)
  • Semantic similarity tasks
  • When vector magnitude is not meaningful
  • Comparing documents of different lengths
Characteristics:
  • Normalized comparison (only direction matters)
  • Range-independent
  • Most common for text embeddings
Example use cases:
  • Document similarity
  • Semantic search
  • Recommendation systems
  • Question answering
IP

Best for:
  • Pre-normalized embeddings (unit vectors)
  • Maximum Inner Product Search (MIPS)
  • Models specifically trained for IP
  • Performance-critical applications with normalized vectors
Characteristics:
  • Fastest for unit-normalized vectors
  • Equivalent to cosine for normalized vectors
  • No magnitude normalization
Example use cases:
  • Retrieval with normalized embeddings
  • Recommendation systems with pre-normalized features
  • Real-time search with unit vectors
IP is symmetric (a · b = b · a), but it is not a true distance metric: scores are unbounded and scale with vector magnitude. For unit-normalized vectors, IP is equivalent to cosine similarity.
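
A quick NumPy check shows how IP scores scale with magnitude when vectors are not normalized, which is why normalization matters before choosing IP:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 3.0, 4.0])

# IP grows with magnitude: scaling a vector scales its score,
# so a longer vector looks "more similar" without changing direction
print(np.dot(a, b))      # 20.0
print(np.dot(2 * a, b))  # 40.0
```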
L2

Best for:
  • Embeddings where magnitude is meaningful
  • Image embeddings
  • Spatial data
  • When distance in Euclidean space matters
Characteristics:
  • Considers both direction and magnitude
  • Natural geometric interpretation
  • Can be slower than IP for high dimensions
Example use cases:
  • Image similarity
  • Spatial search
  • Anomaly detection
  • Clustering

Performance Comparison

Speed: IP ≈ COSINE > L2

For normalized vectors, IP and COSINE have similar performance. L2 can be slower due to the square root operation, though many implementations optimize this.
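
As a rough illustration of the cost difference, the micro-benchmark below compares a batched inner product against a batched L2 distance in plain NumPy. This measures NumPy kernels, not zvec's internal index code, so treat the numbers as a sketch of the relative operation counts rather than engine performance.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 128)).astype(np.float32)  # 1000 vectors
query = rng.standard_normal(128).astype(np.float32)

# IP: a single matrix-vector multiply
t_ip = timeit.timeit(lambda: database @ query, number=100)

# L2: subtract, square, sum, then square root
t_l2 = timeit.timeit(lambda: np.linalg.norm(database - query, axis=1), number=100)

print(f"IP: {t_ip:.4f}s  L2: {t_l2:.4f}s")
```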

Metric Equivalence for Normalized Vectors

For unit-normalized vectors (||v|| = 1):
# These are equivalent for normalized vectors:
ip_score = np.dot(vec_a, vec_b)  # Inner product
cos_sim = np.dot(vec_a, vec_b)   # Cosine similarity

# L2 distance relates to cosine:
l2_squared = 2 * (1 - cos_sim)
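
The identity above follows from expanding ||a - b||² = ||a||² + ||b||² - 2(a · b), which reduces to 2 - 2·cos_sim when both norms equal 1. It can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.standard_normal(128)
b = rng.standard_normal(128)

# Unit-normalize both vectors so ||a|| = ||b|| = 1
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

cos_sim = np.dot(a, b)
l2_squared = np.sum((a - b) ** 2)

# ||a - b||^2 = 2 - 2 * cos_sim for unit vectors
assert np.isclose(l2_squared, 2 * (1 - cos_sim))
```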

Working with Metrics in Reranking

MetricType is used in weighted reranking for score normalization:
from zvec import MetricType
from zvec.extension import WeightedReRanker

# Normalize scores based on the metric used
reranker = WeightedReRanker(
    topn=10,
    metric=MetricType.COSINE,  # Normalize assuming cosine distance
    weights={"title_vec": 2.0, "content_vec": 1.0}
)

results = collection.query(
    vectors={
        "title_vec": title_embedding,
        "content_vec": content_embedding
    },
    reranker=reranker
)
See WeightedReRanker for details on score normalization.

Common Pitfalls

IP vs COSINE: Using IP with non-normalized vectors can produce unexpected results. Always normalize vectors first, or use COSINE instead.
# ❌ Wrong: Using IP with non-normalized vectors
field = Field(
    name="embedding",
    dtype=DataType.VECTOR_FP32,
    dim=3,
    metric=MetricType.IP
)
collection.insert({"embedding": [1.0, 2.0, 3.0]})  # Not normalized!

# ✅ Correct: Normalize first, or use COSINE
import numpy as np
vec = np.array([1.0, 2.0, 3.0])
normalized_vec = vec / np.linalg.norm(vec)
collection.insert({"embedding": normalized_vec.tolist()})
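
A small helper makes the normalization step reusable before every insert. This is a sketch, not part of the zvec API; the zero-vector guard is worth keeping because the zero vector has no direction and cannot be normalized.

```python
import numpy as np

def normalize(vec):
    """Scale a vector to unit length so it is safe to use with the IP metric."""
    arr = np.asarray(vec, dtype=np.float32)
    norm = np.linalg.norm(arr)
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return (arr / norm).tolist()

unit = normalize([1.0, 2.0, 3.0])
print(np.linalg.norm(unit))  # ≈ 1.0
```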
Metric Mismatch: Ensure the metric matches your embedding model’s training objective. Some models are optimized for specific metrics.
