The HNSW (Hierarchical Navigable Small World) index is the performance core of VecLabs. It’s what makes sub-5ms queries possible at 100K+ vector scale.

What is HNSW?

HNSW is a graph-based algorithm for approximate nearest neighbor (ANN) search. Instead of comparing your query against every vector in the collection (slow), it builds a multi-layer graph that lets you navigate to the nearest neighbors in logarithmic time. Think of it like a highway system:
  • Top layers: Fast highways that cover long distances between major cities
  • Bottom layer: Local streets that get you to the exact address
When you query, you start at the top layer and greedily descend until you reach the base layer, then do a more thorough search for the top-K results.
HNSW was introduced in the 2016 paper “Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs” by Malkov and Yashunin. It’s the same algorithm used by Pinecone, Qdrant, and Weaviate.

Core Parameters

From hnsw.rs:36-50, the VecLabs HNSW implementation exposes three key parameters:
pub struct HNSWIndex {
    m: usize,               // Max connections per node per layer
    ef_construction: usize, // Beam width during index build
    ef_search: usize,       // Beam width during query
    // ...
}

M

Max connections per node
Default: 16
Higher M = better recall, more memory usage. Standard value is 16. Layer 0 uses M * 2 = 32 connections.

ef_construction

Build beam width
Default: 200
Higher ef_construction = better quality graph, slower inserts. Determines how many candidates are explored when inserting a new vector.

ef_search

Query beam width
Default: 50
Higher ef_search = better recall, slower queries. You can tune this at query time to trade speed for accuracy.

Choosing Parameters

For most users: Use the defaults. Call HNSWIndex::default_cosine() (hnsw.rs:95-97):
pub fn default_cosine() -> Self {
    Self::new(16, 200, DistanceMetric::Cosine)
}
If you need higher recall: Increase ef_search at query time:
index.set_ef_search(100); // Double the beam width for better recall
let results = index.query(&query_vector, 10)?;
If you have lots of memory and want maximum quality: Increase M and ef_construction at index creation:
let mut index = HNSWIndex::new(32, 400, DistanceMetric::Cosine);
// Warning: 2x memory usage, slower inserts, marginal recall improvement

How Insertion Works

When you insert a vector into the HNSW index (hnsw.rs:122-169):
  1. Generate random level — Each vector is assigned a random level (0 to ~4). Higher levels are exponentially less likely.
// hnsw.rs:456-468
fn random_level(&self) -> usize {
    let mut rng = rand::thread_rng();
    let mut level = 0usize;
    loop {
        let r: f64 = rng.gen();
        if r > self.ml || level >= 16 {
            break;
        }
        level += 1;
    }
    level
}
  2. Find nearest neighbors at each layer — Starting from the entry point, greedily descend to the insert level, then search for the M nearest neighbors at each layer.
  3. Create bidirectional connections — Connect the new node to its neighbors, and connect those neighbors back to the new node.
  4. Prune if necessary — If any neighbor now has more than M connections, prune the weakest ones to maintain graph quality.
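Step 1's promotion loop is a geometric draw: a node continues to the next level with a fixed probability and stops otherwise. A standalone sketch of the resulting level distribution — assuming `ml` is that continue-probability, matching the break condition in `random_level`, and using the common choice `ml = 1/ln(M)` — shows why levels rarely exceed ~4:

```rust
// Probability that the promotion loop stops at exactly `level`:
// continue with probability p (`ml` in the struct), stop with 1 - p.
fn level_probability(p: f64, level: usize) -> f64 {
    p.powi(level as i32) * (1.0 - p)
}

fn main() {
    // Hypothetical ml = 1 / ln(16) ~= 0.36, a common choice for M = 16.
    let p = 1.0 / 16f64.ln();
    let mut cumulative = 0.0;
    for level in 0..=4 {
        let prob = level_probability(p, level);
        cumulative += prob;
        println!("level {level}: {prob:.4}");
    }
    // Nearly all nodes land on the first few levels.
    println!("P(level <= 4) = {cumulative:.4}");
}
```

With `p ≈ 0.36`, about 64% of nodes stop at level 0 and over 99% land at level 4 or below, which is why the hard cap at 16 in `random_level` is almost never reached.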
From hnsw.rs:350-438, the connect_new_node function handles this:
fn connect_new_node(&mut self, node_id: &str, insert_level: usize) {
    // Greedy descent from top to insert_level+1
    for layer_idx in (insert_level + 1..=self.entry_point_level).rev() {
        let candidates = self.search_layer(&node_values, &current_nearest, 1, layer_idx);
        // ...
    }
    
    // From insert_level down to 0: search and connect
    for layer_idx in (0..=insert_level).rev() {
        let candidates = self.search_layer(
            &node_values,
            &current_nearest,
            self.ef_construction,
            layer_idx,
        );
        
        // Bidirectional connections with pruning
        // ...
    }
}

How Query Works

Querying is a two-phase process (hnsw.rs:210-259):

Phase 1: Greedy Descent

Start at the entry point (the highest-level node) and greedily descend through layers 1+ with beam width 1. This gets you close to the target region fast.
// hnsw.rs:237-244
let mut current_nearest = entry;
for layer_idx in (1..=self.entry_point_level).rev() {
    let candidates = self.search_layer(query_vector, &current_nearest, 1, layer_idx);
    if let Some(best) = candidates.into_iter().next() {
        current_nearest = best.id;
    }
}

Phase 2: Beam Search at Layer 0

At the base layer (layer 0), use the full ef_search beam width to explore candidates and return the top-K.
// hnsw.rs:246-258
let ef = self.ef_search.max(top_k);
let candidates = self.search_layer(query_vector, &current_nearest, ef, 0);

let results: Vec<QueryResult> = candidates
    .into_iter()
    .take(top_k)
    .map(|c| {
        let vec = &self.vectors[&c.id];
        QueryResult::new(c.id, c.score, vec.metadata.clone())
    })
    .collect();
The search_layer function (hnsw.rs:263-348) implements beam search:
fn search_layer(
    &self,
    query: &[f32],
    entry_id: &str,
    ef: usize,
    layer_idx: usize,
) -> Vec<Candidate> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut candidates: BinaryHeap<Candidate> = BinaryHeap::new();
    let mut results: BinaryHeap<Candidate> = BinaryHeap::new();
    
    // Expand candidates, track visited, maintain top-ef results
    while let Some(current) = candidates.pop() {
        // Explore neighbors, update results if score > worst_result_score
        // ...
    }
    
    // Return results sorted by score descending
}

Distance Metrics

VecLabs supports three distance metrics (distance.rs:14-59):

Cosine

Default metric
Returns similarity in [-1, 1]
Higher = more similar
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    (dot / (norm_a * norm_b)).clamp(-1.0, 1.0)
}

Euclidean

L2 distance
Returns distance in [0, ∞)
Lower = more similar
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

Dot Product

Inner product
Returns scalar
Higher = more similar
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
Best for pre-normalized vectors (e.g., OpenAI embeddings).
Performance optimization: If your vectors are already normalized (like OpenAI embeddings), use DotProduct metric instead of Cosine. They’re mathematically equivalent for unit vectors, but dot product avoids the sqrt computation.

Performance Characteristics

Time Complexity

  • Insert: O(log N * M * ef_construction)
  • Query: O(log N * ef_search)
  • Delete: O(M * L) where L is number of layers
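One way to read these bounds is as a rough count of distance computations per query: about one per upper layer during the greedy descent, plus roughly ef_search * (M * 2) neighbor expansions at layer 0. A back-of-envelope sketch — a model for intuition, not a measurement:

```rust
// Rough distance-computation count per query under the bounds above:
// ~1 comparison per upper layer during descent, plus about
// ef_search * (M * 2) neighbor expansions at layer 0.
// This is an approximation, not an exact accounting.
fn estimated_comparisons(n: usize, m: usize, ef_search: usize) -> usize {
    let descent_steps = (n as f64).ln().ceil() as usize; // ~log N layers
    descent_steps + ef_search * m * 2 // layer 0 uses M*2 connections
}

fn main() {
    // With defaults (M = 16, ef_search = 50) at 100K vectors:
    let est = estimated_comparisons(100_000, 16, 50);
    println!("~{est} distance computations per query");
}
```

The takeaway: the descent contributes almost nothing; query cost is dominated by the `ef_search * M` term at layer 0, which is why `ef_search` is the lever to reach for when tuning latency.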

Space Complexity

Memory usage per vector:
Vector storage: dimension * 4 bytes (f32)
Graph edges:    M * 2 * 8 bytes (layer 0) + M * 8 bytes * num_layers
Metadata:       variable (JSON)
For 384-dimensional vectors with default M=16:
  • Vector: 1,536 bytes
  • Graph: ~256 bytes
  • Total: ~1.8 KB per vector
At 1 million vectors: ~1.8 GB RAM
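The arithmetic above can be packaged as a small estimator. This is an approximation: `avg_upper_layers` is a hypothetical knob for the upper-layer edges (most nodes live only on layer 0), not a VecLabs parameter:

```rust
// Per-vector memory estimate following the breakdown above:
// f32 components + M*2 layer-0 edges + M edges per upper layer.
fn bytes_per_vector(dimension: usize, m: usize, avg_upper_layers: f64) -> f64 {
    let vector = (dimension * 4) as f64;   // f32 components
    let layer0_edges = (m * 2 * 8) as f64; // M*2 neighbor ids, 8 bytes each
    let upper_edges = (m * 8) as f64 * avg_upper_layers;
    vector + layer0_edges + upper_edges
}

fn main() {
    // 384-dim vectors, M = 16, ignoring upper-layer edges and metadata:
    let per_vec = bytes_per_vector(384, 16, 0.0); // 1536 + 256 = 1792
    println!("~{per_vec} bytes per vector");
    println!("~{:.1} GB at 1M vectors", per_vec * 1e6 / 1e9);
}
```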

Benchmarks

Measured on Apple M2, 16GB RAM:
Vectors   Dimensions   p50     p95     p99
10K       384          0.8ms   1.2ms   1.8ms
100K      384          1.9ms   2.8ms   4.3ms
1M        384          3.2ms   5.1ms   7.8ms
From the test suite (hnsw.rs:649-658):
#[test]
fn test_large_index_query_returns_results() {
    let mut index = HNSWIndex::new(16, 200, DistanceMetric::Cosine);
    for i in 0..1000 {
        index
            .insert(make_vector(&format!("v{}", i), random_vector(384)))
            .unwrap();
    }
    let results = index.query(&random_vector(384), 10).unwrap();
    assert_eq!(results.len(), 10);
}

Serialization

The entire index can be serialized to JSON for persistence (hnsw.rs:470-478):
pub fn to_json(&self) -> Result<String, SolVecError> {
    serde_json::to_string(self)
        .map_err(|e| SolVecError::SerializationError(e.to_string()))
}

pub fn from_json(json: &str) -> Result<Self, SolVecError> {
    serde_json::from_str(json)
        .map_err(|e| SolVecError::SerializationError(e.to_string()))
}
This is used for Shadow Drive persistence (in progress) and local caching.

Index Statistics

You can inspect index health with stats() (hnsw.rs:481-491):
pub fn stats(&self) -> IndexStats {
    IndexStats {
        vector_count: self.vectors.len(),
        layer_count: self.layers.len(),
        entry_point_level: self.entry_point_level,
        dimension: self.dimension.unwrap_or(0),
        total_inserts: self.total_inserts,
        total_deletes: self.total_deletes,
        metric: self.metric,
    }
}

Why Rust Matters

The HNSW algorithm itself isn’t unique to VecLabs — Pinecone, Qdrant, and Weaviate all use HNSW. What’s different is the implementation language.

Python/Go (Competitors)

  • Garbage collector pauses query execution unpredictably
  • GC pressure increases with collection size and query load
  • p99 latency degrades as you scale
  • Can’t guarantee sub-10ms under production load

Rust (VecLabs)

  • No garbage collector = no pause-the-world events
  • Memory is managed explicitly with compile-time guarantees
  • p99 latency stays consistent regardless of load
  • Predictable performance is the unfair advantage
This is why VecLabs can promise sub-5ms p99 at 100K vectors and actually deliver it in production.

Code Reference

Full implementation: crates/solvec-core/src/hnsw.rs Key functions:
  • insert() — hnsw.rs:122
  • query() — hnsw.rs:210
  • delete() — hnsw.rs:172
  • search_layer() — hnsw.rs:263
  • connect_new_node() — hnsw.rs:350

Next: Merkle Proofs

Learn how VecLabs uses Merkle trees to create cryptographic proof of collection state.
