# Performance Tuning Guide
VecLabs uses HNSW (Hierarchical Navigable Small World) graphs for fast approximate nearest neighbor search. This guide covers tuning parameters for optimal speed and accuracy.
## HNSW Overview

HNSW is the gold standard for vector search:

- **Fast queries**: sub-millisecond queries over millions of vectors, with O(log n) search complexity
- **High recall**: 95-99% recall with proper tuning, better than locality-sensitive hashing
- **Memory efficient**: compact graph structure, much cheaper than exhaustive search
- **Production-ready**: used by Spotify, Weaviate, and Qdrant; battle-tested at scale
## How HNSW Works

HNSW builds a multi-layer graph:

```
Layer 2: [Entry] ---------> [Node]
                              |
                              v
Layer 1:   [A] -----> [B] -----> [C]
            |        / | \        |
Layer 0: [1]--[2]--[3]--[4]--[5]--[6]--[7]
```
1. **Start at entry point**: search begins at the top layer (the sparsest)
2. **Greedy descent**: move to the nearest neighbor until a local minimum is reached
3. **Drop to the next layer**: descend to a denser layer and repeat the greedy search
4. **Base-layer search**: at layer 0 (the densest), run a beam search whose width is the `ef_search` parameter
5. **Return top-K**: return the K best candidates found
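The steps above can be sketched in plain Python. This is an illustration, not the SolVec internals: `layers` is a hypothetical list of adjacency dicts (densest layer last) and `dist` is any distance function.

```python
import heapq

def greedy_step(graph, dist, start, query):
    """Greedy descent within one layer: hop to the closest neighbor
    until no neighbor is closer than the current node (local minimum)."""
    current = start
    improved = True
    while improved:
        improved = False
        for n in graph.get(current, []):
            if dist(n, query) < dist(current, query):
                current, improved = n, True
    return current

def beam_search(graph, dist, entry, query, ef_search, k):
    """Beam search at the base layer: keep the ef_search closest
    candidates seen so far, then return the top k as (dist, node)."""
    visited = {entry}
    frontier = [(dist(entry, query), entry)]  # min-heap of nodes to expand
    best = [(-dist(entry, query), entry)]     # max-heap (negated) of kept results
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= ef_search and d > -best[0][0]:
            break  # frontier is already worse than the worst kept candidate
        for n in graph.get(node, []):
            if n in visited:
                continue
            visited.add(n)
            dn = dist(n, query)
            if len(best) < ef_search or dn < -best[0][0]:
                heapq.heappush(frontier, (dn, n))
                heapq.heappush(best, (-dn, n))
                if len(best) > ef_search:
                    heapq.heappop(best)  # drop the worst kept candidate
    return sorted((-d, n) for d, n in best)[:k]

def hnsw_search(layers, dist, entry, query, ef_search, k):
    """Greedy descent through the sparse upper layers, then beam
    search at layer 0 (the last, densest layer)."""
    node = entry
    for layer in layers[:-1]:
        node = greedy_step(layer, dist, node, query)
    return beam_search(layers[-1], dist, node, query, ef_search, k)
```

A wider `ef_search` lets the beam escape local neighborhoods at the cost of expanding more nodes, which is exactly the recall/speed trade-off described below.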
## Key Parameters

### 1. ef_search (Query-Time)

**What it does**: controls the beam width during search at the base layer.

**Effect**:

- Higher = better recall, slower queries
- Lower = faster queries, worse recall

**Default**: 50
**Typical range**: 10-500
**TypeScript**

```typescript
import { SolVec } from 'solvec';

const sv = new SolVec({ network: 'devnet' });
const col = sv.collection('vectors', { dimensions: 1536 });

// The SDK uses HNSW internally with ef_search=50 by default.
// For production use, tune based on your requirements.
```

**Rust Core (Advanced)**

```rust
use solvec_core::hnsw::HNSWIndex;
use solvec_core::types::DistanceMetric;

let mut index = HNSWIndex::new(16, 200, DistanceMetric::Cosine);

// Adjust ef_search for your use case
index.set_ef_search(100); // Higher = better recall
let results = index.query(&query_vector, 10)?;
```
### Tuning ef_search

| ef_search | Use Case | Recall | Speed |
|---|---|---|---|
| 10-20 | Ultra-fast search, recall not critical | ~85% | Fastest |
| 30-50 | Balanced (default) | ~95% | Fast |
| 100-200 | High accuracy required | ~98% | Medium |
| 300-500 | Maximum accuracy | ~99%+ | Slower |

For production, start with `ef_search=50`. Increase it if you need better recall; decrease it if speed is critical.
### 2. M (Max Connections)

**What it does**: sets the maximum number of edges per node in the graph.

**Effect**:

- Higher = better recall, more memory, slower indexing
- Lower = less memory, faster indexing, worse recall

**Default**: 16
**Typical range**: 8-64
```rust
// Rust core (the TypeScript SDK uses sensible defaults)
let index = HNSWIndex::new(
    16,  // M: max connections per node
    200, // ef_construction
    DistanceMetric::Cosine,
);
```
### Tuning M

| M | Memory Usage | Recall | Index Time |
|---|---|---|---|
| 8 | Low | ~92% | Fast |
| 16 | Medium (default) | ~95% | Medium |
| 32 | High | ~97% | Slow |
| 64 | Very high | ~99% | Very slow |

Changing M requires rebuilding the index, so choose carefully during initial setup.
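A rough back-of-envelope for the memory column (my own approximation, not a SolVec API): float32 vector data plus the graph links, where HNSW typically stores up to 2*M links per node at layer 0 and roughly M more across the upper layers.

```python
def estimate_hnsw_memory_mb(n_vectors, dims, m, id_bytes=4):
    """Rough HNSW footprint: float32 vectors plus neighbor-id links.
    Layer 0 holds up to 2*M links per node; upper layers add roughly
    M more per node on average (a common rule of thumb)."""
    vector_bytes = n_vectors * dims * 4             # 4 bytes per float32 component
    link_bytes = n_vectors * (2 * m + m) * id_bytes # graph edges as 4-byte ids
    return (vector_bytes + link_bytes) / (1024 * 1024)

# 100K 1536-dim vectors at M=16: the vectors dominate (~586 MB);
# the graph itself adds only ~18 MB, so doubling M is cheaper than it looks
mb = estimate_hnsw_memory_mb(100_000, 1536, 16)
```

The takeaway: at high dimensionality, raising M costs far less memory than the vector data itself, so the recall gain often comes cheap.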
### 3. ef_construction (Index-Time)

**What it does**: controls the beam width during index construction.

**Effect**:

- Higher = better graph quality, slower indexing
- Lower = faster indexing, worse recall

**Default**: 200
**Typical range**: 100-400
```rust
let index = HNSWIndex::new(
    16,
    200, // ef_construction: higher = better quality
    DistanceMetric::Cosine,
);
```
### Tuning ef_construction

| ef_construction | Index Time | Graph Quality | Recommended For |
|---|---|---|---|
| 100-150 | Fast | Good | Development, frequent updates |
| 200-300 | Medium (default) | Excellent | Production |
| 400-500 | Slow | Maximum | Static datasets |

`ef_construction` only affects indexing speed, not query speed. Higher is almost always better for production.
## Query Speed vs Collection Size

| Vectors | ef_search=50 | ef_search=100 | ef_search=200 |
|---|---|---|---|
| 1K | 0.5ms | 0.8ms | 1.2ms |
| 10K | 2ms | 3ms | 5ms |
| 100K | 8ms | 12ms | 20ms |
| 1M | 25ms | 40ms | 70ms |
| 10M | 80ms | 150ms | 280ms |

Benchmarks measured on an M1 MacBook Pro with 1536-dimensional vectors and cosine similarity.
## Recall vs ef_search

```
Recall %
100% |                 _______________
 99% |            ___/
 98% |        ___/
 97% |     ___/
 96% |   __/
 95% |  _/
 94% | _/
 93% |/
     +-------------------------------------> ef_search
       10   20   50    100   200  300  500
```

Sweet spot: `ef_search` = 50-100 (95-98% recall).
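Recall figures like these are always measured against exact ground truth. A minimal sketch of that measurement (helper names are my own; the ground truth is plain brute force over cosine distance):

```python
import math

def cosine_dist(a, b):
    """1 minus cosine similarity (0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)

def brute_force_top_k(vectors, query, k):
    """Exact top-K ids by cosine distance: the ground truth for recall.
    `vectors` is a list of (id, vector) pairs."""
    ranked = sorted(vectors, key=lambda item: cosine_dist(item[1], query))
    return [vid for vid, _ in ranked[:k]]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-K that the approximate search found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

To find your own sweet spot, sweep `ef_search` over a sample of queries and plot `recall_at_k` against measured latency.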
## Optimizing for Different Use Cases

### Use Case 1: Real-Time Search (Speed Priority)

**Goal**: sub-10ms queries, 90%+ recall acceptable

```rust
let mut index = HNSWIndex::new(
    12,  // Lower M for speed
    150, // Lower ef_construction (faster indexing)
    DistanceMetric::Cosine,
);
index.set_ef_search(30); // Aggressive speed optimization
```

**Expected**:

- Query time: 1-5ms for 100K vectors
- Recall: ~92%
- Memory: low
### Use Case 2: Balanced (Default)

**Goal**: fast queries with high accuracy

```rust
let mut index = HNSWIndex::default_cosine();
// M=16, ef_construction=200, ef_search=50
```

**Expected**:

- Query time: 5-15ms for 100K vectors
- Recall: ~95%
- Memory: medium
### Use Case 3: Maximum Accuracy

**Goal**: 99%+ recall, speed secondary

```rust
let mut index = HNSWIndex::new(
    32,  // Higher M for accuracy
    400, // High ef_construction
    DistanceMetric::Cosine,
);
index.set_ef_search(200);
```

**Expected**:

- Query time: 20-50ms for 100K vectors
- Recall: ~99%
- Memory: high
### Use Case 4: Massive Scale (10M+ Vectors)

**Goal**: scale to millions of vectors

```rust
let mut index = HNSWIndex::new(
    16,  // Standard M
    200, // Standard ef_construction
    DistanceMetric::Cosine,
);
// Start low, increase if needed
index.set_ef_search(50);
```

**Strategies**:

- Partition data into multiple collections
- Use coarse-to-fine search (search summary vectors first)
- Increase ef_search only for critical queries
- Consider dimensionality reduction (1536 → 768)
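With the partitioning strategy, each collection is queried for the same topK and the per-shard results are merged client-side. A minimal sketch, assuming each shard returns (distance, id) pairs with lower = closer:

```python
from itertools import chain

def merge_shard_results(shard_results, k):
    """Each shard contributes its own local top-K as (distance, id)
    pairs; the global top-K is simply the k smallest across all shards."""
    return sorted(chain.from_iterable(shard_results))[:k]

# Two shards, each already holding its local top-2
merged = merge_shard_results(
    [[(0.10, "a"), (0.50, "b")], [(0.20, "c"), (0.90, "d")]], k=2
)
# merged == [(0.10, "a"), (0.20, "c")]
```

Because each shard returns its own full topK, the merged result is exact with respect to the per-shard searches; only the underlying HNSW approximation remains.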
## Batch Operations

### Batch Upsert

Always batch upserts for better performance:

```typescript
// ❌ BAD: 1000 individual upserts
for (const vec of vectors) {
  await col.upsert([vec]); // Slow!
}

// ✅ GOOD: single batch upsert
await col.upsert(vectors); // Fast!
```

```python
# ❌ BAD: 1000 individual upserts
for vec in vectors:
    col.upsert([vec])  # Slow!

# ✅ GOOD: single batch upsert
col.upsert(vectors)  # Fast!
```
**Performance difference**:

- Individual: ~10ms per vector, so 1000 vectors take ~10 seconds
- Batch: ~5ms total for the same 1000 vectors

**Reason**: batching avoids repeated Merkle root recomputation and on-chain updates.
### Optimal Batch Size

```typescript
// Split large datasets into reasonable batches
const BATCH_SIZE = 1000;

for (let i = 0; i < allVectors.length; i += BATCH_SIZE) {
  const batch = allVectors.slice(i, i + BATCH_SIZE);
  await col.upsert(batch);
  console.log(`Upserted ${i + batch.length} / ${allVectors.length}`);
}
```

**Recommended batch sizes**:

- Small vectors (< 512 dim): 5000-10000
- Medium vectors (512-1536 dim): 1000-5000
- Large vectors (> 1536 dim): 500-1000
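In Python, the same chunking is a small generator. `recommended_batch_size` below simply encodes mid-range values from the table above as a heuristic; both names are my own, not SDK functions.

```python
def batches(items, batch_size):
    """Yield successive fixed-size slices; upsert each slice as one call."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def recommended_batch_size(dims):
    """Mid-range picks from the batch-size recommendations above."""
    if dims < 512:
        return 5000
    if dims <= 1536:
        return 2000
    return 500

# Usage (assuming all_vectors and col as in the examples above):
# for batch in batches(all_vectors, recommended_batch_size(1536)):
#     col.upsert(batch)
```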
## Memory Optimization

### Dimensionality Reduction

Reduce vector dimensions for faster queries and lower memory usage:

```python
from sklearn.decomposition import PCA
import numpy as np

# Original OpenAI embeddings: 1536 dimensions
embeddings = np.array([...])  # shape: (n, 1536)

# Reduce to 768 dimensions
pca = PCA(n_components=768)
reduced = pca.fit_transform(embeddings)  # shape: (n, 768)

# Upsert the reduced vectors
col = sv.collection("reduced", dimensions=768)
col.upsert([
    {"id": f"vec_{i}", "values": vec.tolist()}
    for i, vec in enumerate(reduced)
])
```
**Memory savings**:

- 1536D: ~6 KB per vector
- 768D: ~3 KB per vector (50% reduction)
- 384D: ~1.5 KB per vector (75% reduction)

**Accuracy impact**:

- 1536D → 768D: ~1-2% recall loss
- 1536D → 384D: ~3-5% recall loss

For most semantic search use cases, 768 dimensions is sufficient and provides a 2x memory saving.
### Minimize Metadata

```typescript
// ❌ BAD: large metadata
await col.upsert([{
  id: 'doc_1',
  values: [...],
  metadata: {
    fullText: longDocument,       // 10 KB
    rawData: JSON.stringify(obj), // 5 KB
    imageBase64: base64Image      // 100 KB
  }
}]);

// ✅ GOOD: minimal metadata
await col.upsert([{
  id: 'doc_1',
  values: [...],
  metadata: {
    title: 'Document Title',
    url: 'https://...',
    docId: 'doc_1'
  }
}]);
// Store full content elsewhere (S3, Shadow Drive, etc.)
```
**Best practices**:

- Store only what you need for ranking/filtering
- Keep metadata < 1 KB per vector
- Store large content in external storage
- Use IDs to reference external data
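A quick pre-flight check for the 1 KB guideline can measure the JSON-serialized size of each record's metadata. A sketch; the helper names are my own, and the record shape mirrors the upsert examples above.

```python
import json

def metadata_size_bytes(metadata):
    """Bytes of the metadata once JSON-serialized (UTF-8)."""
    return len(json.dumps(metadata).encode("utf-8"))

def oversized_metadata(records, limit=1024):
    """Return the ids of records whose metadata exceeds the limit."""
    return [
        r["id"] for r in records
        if metadata_size_bytes(r.get("metadata", {})) > limit
    ]
```

Running this over a batch before upserting catches oversized records while they are still cheap to fix.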
## Choosing a Distance Metric

| Metric | Speed | Use Case |
|---|---|---|
| Dot product | Fastest | Pre-normalized vectors |
| Cosine | Fast | General purpose (auto-normalizes) |
| Euclidean | Fast | Image/audio embeddings |

```typescript
// Dot product is fastest but requires normalization
function normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return vec.map(v => v / norm);
}

const col = sv.collection('fast-search', {
  dimensions: 1536,
  metric: 'dot' // Fastest metric
});

const normalized = normalize(embedding);
await col.upsert([{ id: 'vec_1', values: normalized }]);
```

Only use the `dot` metric if your vectors are normalized to unit length. Otherwise, use `cosine`.
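The reason this works: on unit-length vectors, dot product and cosine similarity are identical, so normalizing once at write time buys the cheaper metric at query time. A quick check in Python:

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_sim(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [1.0, 0.0]
# cosine on the raw vectors equals dot on the normalized ones
assert abs(cosine_sim(a, b) - dot(normalize(a), normalize(b))) < 1e-12
```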
## Query Timing

```typescript
const start = Date.now();
const results = await col.query({ vector: [...], topK: 10 });
const latency = Date.now() - start;

console.log(`Query took ${latency}ms`);

if (latency > 50) {
  console.warn('Query slower than expected. Consider tuning.');
}
```
## Collection Statistics

```typescript
const stats = await col.describeIndexStats();

console.log({
  vectorCount: stats.vectorCount,
  dimension: stats.dimension,
  metric: stats.metric,
  estimatedMemory: `${(stats.vectorCount * stats.dimension * 4 / 1024 / 1024).toFixed(2)} MB`
});
```
## Troubleshooting

### Problem: Slow Queries

1. **Check collection size**: if > 1M vectors, consider partitioning
2. **Reduce ef_search**: try lowering it from 50 to 30
3. **Reduce dimensions**: use PCA to reduce from 1536 to 768
4. **Optimize metadata**: remove large metadata fields
### Problem: Low Recall

1. **Increase ef_search**: try 100 or 200
2. **Check data quality**: ensure vectors are properly normalized
3. **Increase M**: rebuild the index with M=32 (requires reindexing)
4. **Verify metric**: ensure you're using the right distance metric
### Problem: High Memory Usage

1. **Reduce dimensions**: use PCA dimensionality reduction
2. **Minimize metadata**: store large fields externally
3. **Lower M**: rebuild the index with M=8 or M=12
4. **Partition data**: split into multiple collections
## Best Practices Summary

- **Batch everything**: always batch upserts (1000+ vectors at once); avoid individual operations
- **Start conservative**: use the default parameters first; tune only if needed
- **Monitor performance**: track query latency and recall; adjust based on metrics
- **Minimize metadata**: keep metadata under 1 KB per vector; store large content externally
## Next Steps

- **Verification**: learn about on-chain verification
- **Collections**: advanced collection management
- **TypeScript Guide**: complete TypeScript SDK reference
- **Python Guide**: complete Python SDK reference