Hybrid search combines multiple vector types to leverage both semantic similarity (dense vectors) and keyword matching (sparse vectors), providing more robust and accurate search results than either approach alone.
Why Hybrid Search?
Different vector types excel at different tasks:
| Vector Type | Strengths | Weaknesses |
|---|---|---|
| Dense | Semantic understanding, synonyms, context | Misses exact keywords, domain terms |
| Sparse (BM25) | Exact matching, rare terms, acronyms | No semantic understanding |
| Hybrid | Best of both, robust across query types | Slightly more complex |
Example Scenario
Query: “Python programming tutorial”
- Dense only: may rank documents about snakes highly, since “python” the animal is common in embedding-model training data
- Sparse only: Misses “coding guide” (synonym)
- Hybrid: Correctly balances exact “Python” matching with semantic understanding of “programming”
Setting Up Multi-Vector Collections
Define Multi-Vector Schema
Create a collection with both dense and sparse vector fields:

```python
from zvec import CollectionSchema, VectorSchema, FieldSchema, DataType
from zvec import HnswIndexParam

schema = CollectionSchema(
    name="hybrid_docs",
    fields=[
        FieldSchema("id", DataType.INT64),
        FieldSchema("title", DataType.STRING),
        FieldSchema("content", DataType.STRING)
    ],
    vectors=[
        # Dense semantic embedding
        VectorSchema(
            name="dense_emb",
            data_type=DataType.VECTOR_FP32,
            dimension=768,
            index_param=HnswIndexParam(ef_construction=200, m=16)
        ),
        # Sparse keyword embedding
        VectorSchema(
            name="sparse_emb",
            data_type=DataType.SPARSE_VECTOR_FP32,
            index_param=HnswIndexParam(ef_construction=100, m=8)
        )
    ]
)
```
Initialize Collection
```python
import zvec

zvec.init()
collection = zvec.create_and_open(
    path="./hybrid_collection",
    schema=schema
)
```
Prepare Embedding Functions
Set up both dense and sparse embedding generators:

```python
from zvec.extension import DefaultLocalDenseEmbedding, BM25EmbeddingFunction

# Dense embeddings for semantic search
dense_fn = DefaultLocalDenseEmbedding()

# Sparse embeddings for keyword search
sparse_fn = BM25EmbeddingFunction(
    language="en",
    encoding_type="document"
)
```
Index Documents with Both Vectors
```python
from zvec import Doc

documents = [
    "Python is a high-level programming language",
    "Machine learning algorithms in Python",
    "JavaScript web development tutorial"
]

docs = []
for i, text in enumerate(documents):
    doc = Doc(
        id=f"doc_{i}",
        fields={
            "id": i,
            "title": f"Document {i}",
            "content": text
        },
        vectors={
            "dense_emb": dense_fn.embed(text),
            "sparse_emb": sparse_fn.embed(text)
        }
    )
    docs.append(doc)

collection.insert(docs)
```
Multi-Vector Query Strategies
Zvec provides two reranking strategies to combine results from multiple vector fields:
Reciprocal Rank Fusion (RRF)
RRF combines results based on their ranks without relying on scores. It’s robust to different score scales:
```python
from zvec import VectorQuery
from zvec.extension import RrfReRanker

# Generate query vectors
query_text = "Python programming guide"
query_sparse_fn = BM25EmbeddingFunction(language="en", encoding_type="query")
query_dense = dense_fn.embed(query_text)
query_sparse = query_sparse_fn.embed(query_text)

# Define multi-vector query
results = collection.query(
    vectors=[
        VectorQuery(field_name="dense_emb", vector=query_dense),
        VectorQuery(field_name="sparse_emb", vector=query_sparse)
    ],
    reranker=RrfReRanker(
        topn=10,
        rank_constant=60  # Smoothing parameter
    )
)

for doc in results:
    print(f"{doc.id}: {doc.field('content')}")
    print(f"RRF Score: {doc.score:.4f}\n")
```
How RRF Works
RRF score for a document at rank r (0-based): 1 / (k + r + 1)

```python
# Document appears at rank 2 in dense search and rank 5 in sparse search
# k = 60 (rank_constant)
score_dense = 1 / (60 + 2 + 1)            # ≈ 0.0159
score_sparse = 1 / (60 + 5 + 1)           # ≈ 0.0152
total_score = score_dense + score_sparse  # ≈ 0.0311
```
RRF Advantages:
- No score calibration needed
- Robust to different distance metrics
- Gives more weight to documents appearing in multiple result sets
- Default rank_constant=60 works well for most cases
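The fusion step is simple enough to sketch outside zvec. The following helper is purely illustrative (not part of the zvec API); it applies the formula above to best-first lists of document ids and sums scores across result sets:

```python
def rrf_fuse(rank_lists, k=60):
    """Fuse ranked id lists (best-first, rank 0 = best) with Reciprocal Rank Fusion."""
    scores = {}
    for ranked in rank_lists:
        for rank, doc_id in enumerate(ranked):
            # Same formula as above: 1 / (k + r + 1) per list, summed across lists
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense_ranked = ["doc_0", "doc_1", "doc_2"]
sparse_ranked = ["doc_1", "doc_2", "doc_0"]
print(rrf_fuse([dense_ranked, sparse_ranked]))  # ['doc_1', 'doc_0', 'doc_2']
```

Note how doc_1 wins: it is never first in either list, but ranks near the top of both, which is exactly the behavior RRF rewards.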
Weighted Score Fusion
Weight different vector fields based on their importance:
```python
from zvec.extension import WeightedReRanker
from zvec import MetricType

# Define weights for each field
results = collection.query(
    vectors=[
        VectorQuery(field_name="dense_emb", vector=query_dense),
        VectorQuery(field_name="sparse_emb", vector=query_sparse)
    ],
    reranker=WeightedReRanker(
        topn=10,
        weights={
            "dense_emb": 0.7,   # 70% weight to semantic
            "sparse_emb": 0.3   # 30% weight to keywords
        },
        metric=MetricType.IP  # Score normalization method
    )
)
```
Score Normalization
Weighted reranker normalizes scores to [0, 1] before combining:
```python
# Normalization formulas by metric type (pseudocode)
if metric == MetricType.L2:
    normalized = 1.0 - 2 * arctan(score) / π
elif metric == MetricType.IP:
    normalized = 0.5 + arctan(score) / π
elif metric == MetricType.COSINE:
    normalized = 1.0 - score / 2.0
```
Ensure the metric parameter matches the metric used in your vector index. Mismatched metrics will produce incorrect score normalization.
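As a sanity check, the formulas above translate directly to runnable Python. This standalone sketch uses plain strings for the metric names rather than zvec's MetricType:

```python
import math

def normalize_score(score, metric):
    """Map a raw distance/similarity to [0, 1], mirroring the formulas above.

    metric is a plain string here ("l2", "ip", "cosine"), not zvec's MetricType.
    """
    if metric == "l2":
        return 1.0 - 2.0 * math.atan(score) / math.pi  # L2 distance 0 -> 1.0
    if metric == "ip":
        return 0.5 + math.atan(score) / math.pi        # IP score 0 -> 0.5
    if metric == "cosine":
        return 1.0 - score / 2.0                       # cosine distance 0 -> 1.0
    raise ValueError(f"unknown metric: {metric}")

print(normalize_score(0.0, "l2"))  # 1.0 (identical vectors)
print(normalize_score(0.0, "ip"))  # 0.5 (orthogonal vectors)
```

The arctan-based mappings squash unbounded L2 distances and inner-product scores into [0, 1] so that weighted sums across fields are meaningful.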
Tuning Weight Parameters
Start with Balanced Weights
```python
weights = {
    "dense_emb": 0.5,
    "sparse_emb": 0.5
}
```
Adjust Based on Query Type
Different queries benefit from different balances:

```python
# For semantic queries ("concepts similar to X")
semantic_weights = {"dense_emb": 0.8, "sparse_emb": 0.2}

# For exact-match queries ("find document with code ABC123")
exact_weights = {"dense_emb": 0.2, "sparse_emb": 0.8}

# General purpose
balanced_weights = {"dense_emb": 0.6, "sparse_emb": 0.4}
```
Evaluate with Your Data
Test different weight combinations:

```python
from zvec.extension import WeightedReRanker

weight_configs = [
    {"dense_emb": 0.5, "sparse_emb": 0.5},
    {"dense_emb": 0.6, "sparse_emb": 0.4},
    {"dense_emb": 0.7, "sparse_emb": 0.3},
]

for weights in weight_configs:
    reranker = WeightedReRanker(topn=10, weights=weights)
    # queries: a pre-built list of VectorQuery objects for your evaluation set
    results = collection.query(vectors=queries, reranker=reranker)
    # Evaluate results against ground truth
```
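“Evaluate against ground truth” can be as simple as recall@k over a set of hand-labeled relevant documents. This hypothetical helper (not a zvec API) compares retrieved ids against a relevant set:

```python
def recall_at_k(retrieved_ids, relevant_ids, k=10):
    """Fraction of ground-truth relevant docs that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# 2 of the 3 relevant docs were retrieved in the top 3
print(recall_at_k(["a", "b", "c"], ["a", "c", "d"], k=3))  # ≈ 0.667
```

Compute the average over a held-out query set for each weight configuration, then keep the weights with the highest score.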
Advanced Patterns
Three-Vector Hybrid Search
Combine multiple dense models or add specialized vectors:
```python
schema = CollectionSchema(
    name="advanced_hybrid",
    fields=[FieldSchema("id", DataType.INT64)],
    vectors=[
        VectorSchema("text_dense", DataType.VECTOR_FP32, dimension=768),
        VectorSchema("code_dense", DataType.VECTOR_FP32, dimension=512),
        VectorSchema("keyword_sparse", DataType.SPARSE_VECTOR_FP32)
    ]
)

# Query all three fields (text_vec, code_vec, sparse_vec come from the matching models)
results = collection.query(
    vectors=[
        VectorQuery(field_name="text_dense", vector=text_vec),
        VectorQuery(field_name="code_dense", vector=code_vec),
        VectorQuery(field_name="keyword_sparse", vector=sparse_vec)
    ],
    reranker=WeightedReRanker(
        topn=10,
        weights={
            "text_dense": 0.4,
            "code_dense": 0.4,
            "keyword_sparse": 0.2
        }
    )
)
```
Conditional Reranking
Choose reranking strategy based on query characteristics:
```python
def hybrid_search(query_text, is_semantic_query=True):
    """Adaptive hybrid search"""
    query_dense = dense_fn.embed(query_text)
    # Use the query-side BM25 encoder (encoding_type="query"), not the document encoder
    query_sparse = query_sparse_fn.embed(query_text)
    vectors = [
        VectorQuery(field_name="dense_emb", vector=query_dense),
        VectorQuery(field_name="sparse_emb", vector=query_sparse)
    ]
    if is_semantic_query:
        # Semantic: favor dense vectors
        reranker = WeightedReRanker(
            topn=10,
            weights={"dense_emb": 0.8, "sparse_emb": 0.2}
        )
    else:
        # Keyword: use RRF for balanced fusion
        reranker = RrfReRanker(topn=10)
    return collection.query(vectors=vectors, reranker=reranker)
```
Filtering with Hybrid Search
Combine multi-vector search with metadata filters:
```python
results = collection.query(
    vectors=[
        VectorQuery(field_name="dense_emb", vector=query_dense),
        VectorQuery(field_name="sparse_emb", vector=query_sparse)
    ],
    filter="id > 100 and id < 500",  # Pre-filter candidates
    reranker=RrfReRanker(topn=10)
)
```
Index Both Fields Properly
Use appropriate index parameters for each vector type:

```python
vectors=[
    # Dense: higher-quality index
    VectorSchema(
        "dense",
        DataType.VECTOR_FP32,
        dimension=768,
        index_param=HnswIndexParam(ef_construction=200, m=16)
    ),
    # Sparse: lower parameters (fewer non-zero dims)
    VectorSchema(
        "sparse",
        DataType.SPARSE_VECTOR_FP32,
        index_param=HnswIndexParam(ef_construction=100, m=8)
    )
]
```
Adjust topn Wisely
Balance quality and speed:

```python
# Fast: retrieve few candidates from each field
reranker = RrfReRanker(topn=10)

# Thorough: retrieve more candidates for better fusion
reranker = RrfReRanker(topn=100)
```
Cache Embeddings
```python
# Cache query embeddings for repeated searches
query_cache = {}

def get_embeddings(text):
    if text not in query_cache:
        query_cache[text] = {
            "dense": dense_fn.embed(text),
            "sparse": sparse_fn.embed(text)
        }
    return query_cache[text]
```
Comparison: RRF vs Weighted
| Criterion | RRF | Weighted |
|---|---|---|
| Setup | Simple, no tuning | Requires weight tuning |
| Score Scale | Handles any metric | Needs correct metric |
| Control | Limited (rank_constant only) | Fine-grained (per-field weights) |
| Best For | Quick setup, robust defaults | Production systems, domain-specific |
Quick Decision Guide:
- Starting out? Use RRF with defaults
- Have evaluation data? Tune Weighted for optimal results
- Mixing dense+sparse? Both work well
- 3+ vector fields? RRF is simpler
Common Pitfalls
Don’t:
- Mix embedding models between index and query time
- Use the same encoding_type for documents and queries in BM25
- Set all weights to 0
- Forget to normalize dense vectors if using cosine similarity
Do:
- Keep embedding models consistent
- Use encoding_type="document" for indexing, "query" for search
- Validate weights sum to 1.0 (or close)
- Test hybrid search against single-vector baselines
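The weight check can be automated with a small guard run before each query. validate_weights is a hypothetical helper, not a zvec API:

```python
def validate_weights(weights, tol=1e-6):
    """Raise if reranker weights are negative or do not sum to ~1.0."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total:.4f}, expected 1.0")
    return weights

validate_weights({"dense_emb": 0.7, "sparse_emb": 0.3})  # passes
```

Calling this right before constructing the WeightedReRanker catches typos in weight configs early instead of silently skewing scores.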
Next Steps