Performance Tuning Guide

VecLabs uses HNSW (Hierarchical Navigable Small World) graphs for fast approximate nearest neighbor search. This guide covers tuning parameters for optimal speed and accuracy.

HNSW Overview

HNSW is the gold standard for vector search:

Fast Queries

Sub-millisecond queries for millions of vectors. O(log n) search complexity.

High Recall

95-99% recall with proper tuning. Better than locality-sensitive hashing.

Memory Efficient

Compact graph structure with only modest overhead beyond the raw vectors.

Production-Ready

Used by Spotify, Weaviate, and Qdrant. Battle-tested at scale.

How HNSW Works

HNSW builds a multi-layer graph:
Layer 2:  [Entry] ---------> [Node]
            |
            v
Layer 1:  [A] -----> [B] -----> [C]
          |  \       / |         |
          |   \     /  |         |
Layer 0:  [1]--[2]--[3]--[4]--[5]--[6]--[7]
1. Start at entry point: search begins at the top layer (sparsest).
2. Greedy descent: move to the nearest neighbor until a local minimum is reached.
3. Drop to next layer: descend to a denser layer and repeat the search.
4. Base layer search: at layer 0 (densest), use beam search with the ef_search parameter.
5. Return top-K: return the K best candidates found.
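The steps above can be sketched as a greedy search over a single layer. This is an illustrative toy (adjacency lists over 1-D values), not the library's implementation; the real index repeats this descent per layer, then runs beam search at layer 0.

```typescript
// Toy greedy descent on one HNSW layer: hop to whichever neighbor is
// closer to the query until no neighbor improves on the current node.
type Graph = Map<number, number[]>; // node id -> neighbor ids

function greedySearch(
  graph: Graph,
  values: Map<number, number>, // node id -> stored 1-D "vector"
  query: number,
  entry: number
): number {
  let current = entry;
  let currentDist = Math.abs(values.get(current)! - query);
  while (true) {
    let improved = false;
    for (const n of graph.get(current) ?? []) {
      const d = Math.abs(values.get(n)! - query);
      if (d < currentDist) {
        current = n;
        currentDist = d;
        improved = true;
      }
    }
    if (!improved) return current; // local minimum: stop and descend a layer
  }
}
```

In the real structure, the node returned here becomes the entry point for the same search one layer down.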

Key Parameters

1. ef_search (Query-Time)

What it does: Controls beam width during search at the base layer.

Effect:
  • Higher = better recall, slower queries
  • Lower = faster queries, worse recall

Default: 50. Typical range: 10-500.
import { SolVec } from 'solvec';

const sv = new SolVec({ network: 'devnet' });
const col = sv.collection('vectors', { dimensions: 1536 });

// The SDK uses HNSW internally with ef_search=50 by default
// For production use, tune based on your requirements
| ef_search | Use Case | Recall | Speed |
|---|---|---|---|
| 10-20 | Ultra-fast search, recall not critical | ~85% | Fastest |
| 30-50 | Balanced (default) | ~95% | Fast |
| 100-200 | High accuracy required | ~98% | Medium |
| 300-500 | Maximum accuracy | ~99%+ | Slower |
For production, start with ef_search=50. Increase if you need better recall, decrease if speed is critical.
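The table can be folded into a small helper that maps a recall target to an ef_search starting point. This is a hypothetical helper, not an SDK function; the thresholds simply follow the guidance above, and you should still validate against your own data.

```typescript
// Map a recall target to a starting ef_search value, per the tuning
// table above. Treat the output as a starting point, not a guarantee.
function pickEfSearch(targetRecall: number): number {
  if (targetRecall >= 0.99) return 400; // maximum accuracy
  if (targetRecall >= 0.98) return 150; // high accuracy
  if (targetRecall >= 0.95) return 50;  // balanced default
  return 20;                            // speed-first
}
```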

2. M (Max Connections)

What it does: Maximum edges per node in the graph.

Effect:
  • Higher = better recall, more memory, slower indexing
  • Lower = less memory, faster indexing, worse recall

Default: 16. Typical range: 8-64.
// Rust core (TypeScript SDK uses sensible defaults)
let index = HNSWIndex::new(
    16,  // M: max connections per node
    200, // ef_construction
    DistanceMetric::Cosine
);

Tuning M

| M | Memory Usage | Recall | Index Time |
|---|---|---|---|
| 8 | Low | ~92% | Fast |
| 16 | Medium (default) | ~95% | Medium |
| 32 | High | ~97% | Slow |
| 64 | Very High | ~99% | Very Slow |
Changing M requires rebuilding the index. Choose wisely during initial setup.
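Since M drives memory, a back-of-the-envelope estimator helps when sizing hardware before committing to an M value. This sketch assumes f32 vector components and roughly 2*M 4-byte neighbor IDs per node at layer 0; real per-implementation overhead varies, so treat it as a ballpark only.

```typescript
// Rough HNSW memory estimate: raw f32 vectors plus layer-0 edge lists.
// Assumes ~2*M neighbor ids per node and 4 bytes per id (illustrative).
function estimateIndexBytes(vectors: number, dims: number, m: number): number {
  const vectorBytes = vectors * dims * 4; // f32 components
  const edgeBytes = vectors * 2 * m * 4;  // layer-0 neighbor ids
  return vectorBytes + edgeBytes;
}

// e.g. 100K vectors at 1536 dims with M=16
const mb = estimateIndexBytes(100_000, 1536, 16) / 1024 / 1024;
```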

3. ef_construction (Index-Time)

What it does: Beam width during index construction.

Effect:
  • Higher = better graph quality, slower indexing
  • Lower = faster indexing, worse recall

Default: 200. Typical range: 100-400.
let index = HNSWIndex::new(
    16,
    200, // ef_construction: higher = better quality
    DistanceMetric::Cosine
);

Tuning ef_construction

| ef_construction | Index Time | Graph Quality | Recommended For |
|---|---|---|---|
| 100-150 | Fast | Good | Development, frequent updates |
| 200-300 | Medium (default) | Excellent | Production |
| 400-500 | Slow | Maximum | Static datasets |
ef_construction only affects indexing speed, not query speed. Higher is almost always better for production.

Performance Benchmarks

Query Speed vs Collection Size

| Vectors | ef_search=50 | ef_search=100 | ef_search=200 |
|---|---|---|---|
| 1K | 0.5ms | 0.8ms | 1.2ms |
| 10K | 2ms | 3ms | 5ms |
| 100K | 8ms | 12ms | 20ms |
| 1M | 25ms | 40ms | 70ms |
| 10M | 80ms | 150ms | 280ms |
Benchmarks measured on M1 MacBook Pro with 1536-dimensional vectors and cosine similarity.
Recall %
100% |                    _______________
 99% |                ___/
 98% |            ___/
 97% |        ___/
 96% |     __/
 95% |   _/
 94% | _/
 93% |/
     +-------------------------------------> ef_search
     10   20   50   100  200  300  500
Sweet spot: ef_search = 50-100 (95-98% recall)
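Curves like the one above are typically produced by comparing HNSW results against an exact brute-force scan over a sample of queries. A minimal sketch of recall@K, assuming both searches return lists of result IDs:

```typescript
// Recall@K: the fraction of the exact top-K ids that the approximate
// search also returned. Average this over many queries to plot a curve.
function recallAtK(approx: string[], exact: string[]): number {
  const found = new Set(approx);
  const hits = exact.filter(id => found.has(id)).length;
  return hits / exact.length;
}
```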

Optimizing for Different Use Cases

Use Case 1: Real-Time Search (Speed Priority)

Goal: Sub-10ms queries, 90%+ recall acceptable
let mut index = HNSWIndex::new(
    12,  // Lower M for speed
    150, // Lower ef_construction (faster indexing)
    DistanceMetric::Cosine
);

index.set_ef_search(30); // Aggressive speed optimization
Expected:
  • Query time: 1-5ms for 100K vectors
  • Recall: ~92%
  • Memory: Low

Use Case 2: Balanced (Default)

Goal: Fast queries with high accuracy
let mut index = HNSWIndex::default_cosine();
// M=16, ef_construction=200, ef_search=50
Expected:
  • Query time: 5-15ms for 100K vectors
  • Recall: ~95%
  • Memory: Medium

Use Case 3: Maximum Accuracy

Goal: 99%+ recall, speed secondary
let mut index = HNSWIndex::new(
    32,  // Higher M for accuracy
    400, // High ef_construction
    DistanceMetric::Cosine
);

index.set_ef_search(200);
Expected:
  • Query time: 20-50ms for 100K vectors
  • Recall: ~99%
  • Memory: High

Use Case 4: Massive Scale (10M+ Vectors)

Goal: Scale to millions of vectors
let mut index = HNSWIndex::new(
    16,  // Standard M
    200, // Standard ef_construction
    DistanceMetric::Cosine
);

// Start low, increase if needed
index.set_ef_search(50);
Strategies:
  • Partition data into multiple collections
  • Use coarse-to-fine search (search summary vectors first)
  • Increase ef_search only for critical queries
  • Consider dimensionality reduction (1536 → 768)
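The partitioning strategy can be as simple as hashing IDs across several collections and querying each shard, merging the top-K results. A sketch of the routing step (the hash function and `vectors-N` naming scheme are illustrative, not part of the SDK):

```typescript
// Deterministically assign a vector id to one of `shards` partitions
// using a simple 32-bit polynomial hash (illustrative choice).
function shardFor(id: string, shards: number): number {
  let h = 0;
  for (const ch of id) h = ((h * 31 + ch.charCodeAt(0)) >>> 0);
  return h % shards;
}

// Route upserts and queries to the matching shard collection,
// e.g. "vectors-0" .. "vectors-3".
const shardName = `vectors-${shardFor("doc_42", 4)}`;
```

At query time you search all shards in parallel and merge by score; since each shard's graph is smaller, per-shard latency drops even though total work is similar.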

Batch Operations

Batch Upsert

Always batch upserts for better performance:
// ❌ BAD: 1000 individual upserts
for (const vec of vectors) {
  await col.upsert([vec]); // Slow!
}

// ✅ GOOD: Single batch upsert
await col.upsert(vectors); // Fast!
Performance difference:
  • Individual: ~10ms per vector (1000 vectors = 10 seconds)
  • Batch: ~5ms total (1000 vectors = 5ms)
Reason: Avoids repeated Merkle root recomputation and on-chain updates.

Optimal Batch Size

// Split large datasets into reasonable batches
const BATCH_SIZE = 1000;

for (let i = 0; i < allVectors.length; i += BATCH_SIZE) {
  const batch = allVectors.slice(i, i + BATCH_SIZE);
  await col.upsert(batch);
  console.log(`Upserted ${i + batch.length} / ${allVectors.length}`);
}
Recommended batch sizes:
  • Small vectors (< 512 dim): 5000-10000
  • Medium vectors (512-1536 dim): 1000-5000
  • Large vectors (> 1536 dim): 500-1000
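The ranges above can be encoded as a small helper (hypothetical, not part of the SDK) so ingestion code derives a batch size from the collection's dimension count instead of hard-coding one:

```typescript
// Pick a batch size from the vector dimension, per the recommendations
// above: larger vectors mean more bytes per upsert, so smaller batches.
function recommendedBatchSize(dims: number): number {
  if (dims < 512) return 5000;   // small vectors
  if (dims <= 1536) return 2000; // medium vectors
  return 500;                    // large vectors
}
```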

Memory Optimization

Dimensionality Reduction

Reduce vector dimensions for faster queries and less memory:
from sklearn.decomposition import PCA
import numpy as np

# Original OpenAI embeddings: 1536 dimensions
embeddings = np.array([...])  # shape: (n, 1536)

# Reduce to 768 dimensions
pca = PCA(n_components=768)
reduced = pca.fit_transform(embeddings)  # shape: (n, 768)

# Upsert reduced vectors
col = sv.collection("reduced", dimensions=768)
col.upsert([{"id": f"vec_{i}", "values": vec.tolist()} for i, vec in enumerate(reduced)])
Memory savings:
  • 1536D: ~6 KB per vector
  • 768D: ~3 KB per vector (50% reduction)
  • 384D: ~1.5 KB per vector (75% reduction)
Accuracy impact:
  • 1536D → 768D: ~1-2% recall loss
  • 1536D → 384D: ~3-5% recall loss
For most semantic search use cases, 768 dimensions is sufficient and provides 2x memory savings.

Metadata Optimization

// ❌ BAD: Large metadata
await col.upsert([{
  id: 'doc_1',
  values: [...],
  metadata: {
    fullText: longDocument,     // 10 KB
    rawData: JSON.stringify(obj), // 5 KB
    imageBase64: base64Image    // 100 KB
  }
}]);

// ✅ GOOD: Minimal metadata
await col.upsert([{
  id: 'doc_1',
  values: [...],
  metadata: {
    title: 'Document Title',
    url: 'https://...',
    docId: 'doc_1'
  }
}]);

// Store full content elsewhere (S3, Shadow Drive, etc.)
Best practices:
  • Store only what you need for ranking/filtering
  • Keep metadata < 1 KB per vector
  • Store large content in external storage
  • Use IDs to reference external data
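A small guard can enforce the ~1 KB guideline at upsert time. This is a hypothetical helper, not an SDK feature; it measures metadata size as UTF-8 encoded JSON, which may differ slightly from on-disk representation:

```typescript
// Measure metadata as UTF-8 JSON bytes and reject oversized payloads
// before they reach the collection.
function metadataBytes(metadata: Record<string, unknown>): number {
  return new TextEncoder().encode(JSON.stringify(metadata)).length;
}

function assertSmallMetadata(
  metadata: Record<string, unknown>,
  limit = 1024
): void {
  const size = metadataBytes(metadata);
  if (size > limit) {
    throw new Error(`metadata is ${size} B; store large fields externally`);
  }
}
```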

Distance Metric Performance

| Metric | Speed | Use Case |
|---|---|---|
| Dot Product | Fastest | Pre-normalized vectors |
| Cosine | Fast | General purpose (auto-normalizes) |
| Euclidean | Fast | Image/audio embeddings |
// Dot product is fastest but requires normalization
function normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return vec.map(v => v / norm);
}

const col = sv.collection('fast-search', {
  dimensions: 1536,
  metric: 'dot'  // Fastest metric
});

const normalized = normalize(embedding);
await col.upsert([{ id: 'vec_1', values: normalized }]);
Only use the dot metric if your vectors are normalized to unit length. Otherwise, use cosine.
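Before switching to the dot metric, it's cheap to verify that vectors really are unit length; vectors passed through the normalize function above should satisfy this check. A sketch:

```typescript
// Verify a vector has unit L2 norm within a floating-point tolerance.
// Run this on a sample of your embeddings before choosing metric: 'dot'.
function isUnitNorm(vec: number[], tol = 1e-6): boolean {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return Math.abs(norm - 1) <= tol;
}
```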

Monitoring Performance

Query Timing

const start = Date.now();
const results = await col.query({ vector: [...], topK: 10 });
const latency = Date.now() - start;

console.log(`Query took ${latency}ms`);

if (latency > 50) {
  console.warn('Query slower than expected. Consider tuning.');
}

Collection Statistics

const stats = await col.describeIndexStats();

console.log({
  vectorCount: stats.vectorCount,
  dimension: stats.dimension,
  metric: stats.metric,
  estimatedMemory: `${(stats.vectorCount * stats.dimension * 4 / 1024 / 1024).toFixed(2)} MB`
});

Troubleshooting

Problem: Slow Queries

1. Check collection size: if > 1M vectors, consider partitioning.
2. Reduce ef_search: try lowering from 50 to 30.
3. Reduce dimensions: use PCA to reduce from 1536 to 768.
4. Optimize metadata: remove large metadata fields.

Problem: Low Recall

1. Increase ef_search: try 100 or 200.
2. Check data quality: ensure vectors are properly normalized.
3. Increase M: rebuild the index with M=32 (requires reindexing).
4. Verify metric: ensure you're using the right distance metric.

Problem: High Memory Usage

1. Reduce dimensions: use PCA dimensionality reduction.
2. Minimize metadata: store large fields externally.
3. Lower M: rebuild the index with M=8 or M=12.
4. Partition data: split into multiple collections.

Best Practices Summary

Batch Everything

Always batch upserts (1000+ vectors at once). Avoid individual operations.

Start Conservative

Use default parameters first. Tune only if needed.

Monitor Performance

Track query latency and recall. Adjust based on metrics.

Minimize Metadata

Keep metadata < 1 KB per vector. Store large content externally.

Next Steps

Verification

Learn about on-chain verification

Collections

Advanced collection management

TypeScript Guide

Complete TypeScript SDK reference

Python Guide

Complete Python SDK reference
