What is HNSW?
HNSW (Hierarchical Navigable Small World) is a graph-based algorithm for approximate nearest neighbor (ANN) search. Instead of comparing your query against every vector in the collection (slow), it builds a multi-layer graph that lets you navigate to the nearest neighbors in logarithmic time. Think of it like a highway system:
- Top layers: Fast highways that cover long distances between major cities
- Bottom layer: Local streets that get you to the exact address
HNSW was introduced in the 2016 paper “Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs” by Malkov and Yashunin. It’s the same algorithm used by Pinecone, Qdrant, and Weaviate.
Core Parameters
From hnsw.rs:36-50, the VecLabs HNSW implementation exposes three key parameters:
| Parameter | Meaning | Default | Notes |
|---|---|---|---|
| M | Max connections per node | 16 | Higher M = better recall, more memory usage. Standard value is 16. Layer 0 uses M * 2 = 32 connections. |
| ef_construction | Build beam width | 200 | Higher ef_construction = better quality graph, slower inserts. Determines how many candidates are explored when inserting a new vector. |
| ef_search | Query beam width | 50 | Higher ef_search = better recall, slower queries. You can tune this at query time to trade speed for accuracy. |

Choosing Parameters
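As a sketch of how these three knobs might be bundled together (the struct and method names here are illustrative, not the actual types from hnsw.rs:36-50):

```rust
/// Hypothetical parameter bundle mirroring the three knobs above.
/// (Names are illustrative; the real definitions live in hnsw.rs:36-50.)
#[derive(Debug, Clone)]
pub struct HnswParams {
    pub m: usize,               // max connections per node (layer 0 uses m * 2)
    pub ef_construction: usize, // build-time beam width
    pub ef_search: usize,       // query-time beam width
}

impl Default for HnswParams {
    fn default() -> Self {
        Self { m: 16, ef_construction: 200, ef_search: 50 }
    }
}

impl HnswParams {
    /// Connection cap for a given layer: layer 0 is denser than the rest.
    pub fn max_connections(&self, layer: usize) -> usize {
        if layer == 0 { self.m * 2 } else { self.m }
    }
}
```

The layer-0 doubling matters: the base layer holds every vector, so giving it twice the connectivity is what preserves recall after the greedy descent hands off to it.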
For most users: use the defaults. Call HNSWIndex::default_cosine() (hnsw.rs:95-97). If you need more recall, raise ef_search at query time rather than rebuilding the index.
How Insertion Works
When you insert a vector into the HNSW index (hnsw.rs:122-169):
- Generate random level — Each vector is assigned a random level (0 to ~4). Higher levels are exponentially less likely.
- Find nearest neighbors at each layer — Starting from the entry point, greedily descend to the insert level, then search for the M nearest neighbors at each layer.
- Create bidirectional connections — Connect the new node to its neighbors, and connect those neighbors back to the new node.
- Prune if necessary — If any neighbor now has more than M connections, prune the weakest ones to maintain graph quality.
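Step 1 can be sketched as follows. HNSW draws each vector's level from an exponential distribution with multiplier 1/ln(M), so every level is roughly M times less likely than the one below it. The LCG here is a toy generator so the sketch needs no external crates; it is not what the real implementation uses.

```rust
/// Draw an HNSW insertion level: floor(-ln(u) / ln(M)) for u ~ Uniform(0, 1].
/// Higher levels are exponentially rarer, which builds the layer hierarchy.
fn random_level(m: usize, uniform: f64) -> usize {
    let mult = 1.0 / (m as f64).ln();
    (-uniform.ln() * mult).floor() as usize
}

/// Toy linear congruential generator (Knuth's MMIX constants), used only to
/// keep this sketch self-contained.
struct Lcg(u64);

impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Map the top 53 bits into (0, 1]; never exactly 0, so ln() is finite.
        (((self.0 >> 11) as f64) + 1.0) / ((1u64 << 53) as f64)
    }
}
```

With M = 16, a vector lands above level 0 only when u ≤ 1/16, so about 93.75% of vectors live exclusively on the base layer — which is why the document describes levels as "0 to ~4" in practice.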
The connect_new_node function (hnsw.rs:350-438) handles the connection and pruning steps.
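The pruning in step 4 can be sketched with the simplest heuristic — keep the M closest neighbors. Note this is illustrative: the actual connect_new_node may use the diversity-aware heuristic from the Malkov & Yashunin paper, which also favors neighbors in different directions and behaves better on clustered data.

```rust
/// Keep only the `m` closest neighbors when a node exceeds its connection cap.
/// `neighbors` holds (node_id, distance_to_owner) pairs.
/// Naive "closest wins" pruning; the paper's heuristic additionally prefers
/// directionally diverse neighbors.
fn prune_connections(mut neighbors: Vec<(usize, f32)>, m: usize) -> Vec<(usize, f32)> {
    neighbors.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    neighbors.truncate(m);
    neighbors
}
```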
How Query Works
Querying is a two-phase process (hnsw.rs:210-259):
Phase 1: Greedy Descent
Start at the entry point (the highest-level node) and greedily descend through layers 1+ with beam width 1. This gets you close to the target region fast.
Phase 2: Beam Search at Layer 0
At the base layer (layer 0), use the full ef_search beam width to explore candidates and return the top-K.
The search_layer function (hnsw.rs:263-348) implements the beam search.
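Here is a self-contained sketch of beam search within one layer, over an adjacency list with squared-Euclidean distance. It is illustrative rather than the actual search_layer; note that with ef = 1 the same loop degenerates into the Phase 1 greedy descent.

```rust
use std::collections::HashSet;

/// Squared Euclidean distance (monotone in true distance, cheaper to compute).
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Beam search within one layer. `graph[i]` lists the neighbors of node i,
/// `vectors[i]` is node i's embedding. Returns up to `ef` (node, dist) pairs,
/// closest first. With ef = 1 this is the greedy descent used on upper layers.
fn search_layer(
    graph: &[Vec<usize>],
    vectors: &[Vec<f32>],
    query: &[f32],
    entry: usize,
    ef: usize,
) -> Vec<(usize, f32)> {
    let mut visited: HashSet<usize> = HashSet::from([entry]);
    // `best` is the current beam, kept sorted by distance (closest first).
    let mut best: Vec<(usize, f32)> = vec![(entry, dist2(query, &vectors[entry]))];
    let mut frontier: Vec<(usize, f32)> = best.clone();

    while let Some((node, d)) = frontier.pop() {
        // Stop once the closest unexplored candidate is worse than the worst
        // member of a full beam: expanding it cannot improve the result set.
        if best.len() == ef && d > best.last().unwrap().1 {
            break;
        }
        for &nb in &graph[node] {
            if visited.insert(nb) {
                let nd = dist2(query, &vectors[nb]);
                if best.len() < ef || nd < best.last().unwrap().1 {
                    best.push((nb, nd));
                    best.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                    best.truncate(ef);
                    frontier.push((nb, nd));
                    // Pop closest frontier candidates first (frontier is kept
                    // sorted descending, so pop() yields the smallest distance).
                    frontier.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
                }
            }
        }
    }
    best
}
```

A production version would use binary heaps for the beam and frontier instead of re-sorting vectors, but the termination condition — stop when the nearest frontier candidate is farther than the beam's worst entry — is the same.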
Distance Metrics
VecLabs supports three distance metrics (distance.rs:14-59):
| Metric | Description | Returns | Interpretation |
|---|---|---|---|
| Cosine | Default metric | Similarity in [-1, 1] | Higher = more similar |
| Euclidean | L2 distance | Distance in [0, ∞) | Lower = more similar |
| Dot Product | Inner product | Scalar | Higher = more similar. Best for pre-normalized vectors (e.g., OpenAI embeddings). |
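The three metrics, sketched directly from their standard definitions (not necessarily the exact code in distance.rs):

```rust
/// Inner product; higher = more similar.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity in [-1, 1]; higher = more similar.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

/// Euclidean (L2) distance in [0, ∞); lower = more similar.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}
```

For unit-length vectors the dot product equals cosine similarity, which is why it is the cheap choice for pre-normalized embeddings: it skips the two square roots and the division.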
Performance Characteristics
Time Complexity
- Insert: O(log N * M * ef_construction)
- Query: O(log N * ef_search)
- Delete: O(M * L) where L is number of layers
Space Complexity
Memory usage per vector (at 384 dimensions, f32):
- Vector: 1,536 bytes
- Graph: ~256 bytes
- Total: ~1.8 KB per vector
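A quick sanity check of those numbers, assuming f32 components and taking the ~256-byte graph overhead above as given:

```rust
/// Estimate bytes per vector: raw f32 components plus per-node graph overhead.
fn bytes_per_vector(dims: usize, graph_overhead: usize) -> usize {
    dims * std::mem::size_of::<f32>() + graph_overhead
}
```

384 dims × 4 bytes = 1,536 bytes for the vector itself, plus ~256 bytes of graph, giving 1,792 bytes ≈ 1.8 KB per vector.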
Benchmarks
Measured on Apple M2, 16GB RAM (query latency):
| Vectors | Dimensions | p50 | p95 | p99 |
|---|---|---|---|---|
| 10K | 384 | 0.8ms | 1.2ms | 1.8ms |
| 100K | 384 | 1.9ms | 2.8ms | 4.3ms |
| 1M | 384 | 3.2ms | 5.1ms | 7.8ms |
Benchmark code: hnsw.rs:649-658.
Serialization
The entire index can be serialized to JSON for persistence (hnsw.rs:470-478):
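As a std-only illustration of the idea (the real code presumably uses a serialization library, and the field names and layout here are hypothetical, not VecLabs' actual on-disk format):

```rust
/// Hypothetical metadata snapshot; the actual serialized form in hnsw.rs
/// covers the full graph, the vectors, and the entry point.
struct IndexMeta {
    m: usize,
    ef_construction: usize,
    node_count: usize,
}

impl IndexMeta {
    /// Hand-rolled JSON dump, just to show the shape of the persisted state.
    fn to_json(&self) -> String {
        format!(
            "{{\"m\":{},\"ef_construction\":{},\"node_count\":{}}}",
            self.m, self.ef_construction, self.node_count
        )
    }
}
```

Persisting the construction parameters alongside the graph matters: an index built with one M cannot be safely extended under another, so they must round-trip together.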
Index Statistics
You can inspect index health with stats() (hnsw.rs:481-491).
Why Rust Matters
The HNSW algorithm itself isn’t unique to VecLabs — Pinecone, Qdrant, and Weaviate all use HNSW. What’s different is the implementation language.
Python/Go (Competitors)
- Garbage collector pauses query execution unpredictably
- GC pressure increases with collection size and query load
- p99 latency degrades as you scale
- Can’t guarantee sub-10ms under production load
Rust (VecLabs)
- No garbage collector = no pause-the-world events
- Memory is managed explicitly with compile-time guarantees
- p99 latency stays consistent regardless of load
- Predictable performance is the unfair advantage
Code Reference
Full implementation:crates/solvec-core/src/hnsw.rs
Key functions:
- insert() — hnsw.rs:122
- query() — hnsw.rs:210
- delete() — hnsw.rs:172
- search_layer() — hnsw.rs:263
- connect_new_node() — hnsw.rs:350
Next: Merkle Proofs
Learn how VecLabs uses Merkle trees to create cryptographic proof of collection state.