
Overview

VecLabs includes a comprehensive benchmark suite built with Criterion.rs, a statistical benchmarking framework for Rust. Benchmarks measure performance across HNSW operations, distance functions, and various vector dimensions.
The published benchmark results show p50 = 1.9ms, p95 = 2.8ms, p99 = 4.3ms for queries on 100K vectors at 384 dimensions.

Running Benchmarks

Benchmarks are located in the workspace and can be run with Cargo.
1. Run all benchmarks

cargo bench --workspace
This runs the complete benchmark suite including HNSW operations and distance functions.
2. Run a specific benchmark

To run only HNSW benchmarks:
cargo bench --bench hnsw_bench
To run only distance function benchmarks:
cargo bench --bench distance_bench
3. View benchmark reports

Criterion generates detailed HTML reports in target/criterion/:
open target/criterion/report/index.html

Benchmark Suites

HNSW Operations

The hnsw_bench.rs benchmark measures core HNSW operations:

Insert Performance

Measures insertion time for building indices of 1K and 10K vectors

Query Performance

Measures query latency with top-K values of 1, 10, and 100

Multi-dimensional Query

Tests query performance across 128, 384, 768, and 1536 dimensions

Index Size Scaling

Validates query performance with 10K vector indices

Distance Functions

The distance_bench.rs benchmark measures raw distance calculation performance:
  • Cosine similarity: Used for most embedding models
  • Euclidean distance: L2 distance for spatial data
  • Dot product: Fast inner product computation
Each function is benchmarked across 128, 384, 768, and 1536 dimensions.
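VecLabs' own distance kernels are not reproduced here, but a scalar reference implementation of cosine similarity (a sketch for illustration, not the library's code) shows what each benchmark iteration measures:

```rust
/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (|a| * |b|).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0f32, 0.0, 0.0];
    let b = [0.0f32, 1.0, 0.0];
    println!("cos(a, a) = {}", cosine_similarity(&a, &a)); // identical vectors → 1
    println!("cos(a, b) = {}", cosine_similarity(&a, &b)); // orthogonal vectors → 0
}
```

A production kernel would typically be vectorized (SIMD) rather than written as a scalar loop like this, which is exactly the kind of difference the per-dimension benchmarks make visible.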

Benchmark Configuration

Benchmarks are configured in Cargo.toml:
[[bench]]
name = "hnsw_bench"
harness = false

[[bench]]
name = "distance_bench"
harness = false
The harness = false setting tells Cargo to skip the default libtest harness so that Criterion can supply its own main function.

Understanding Results

Criterion provides statistical analysis of benchmark results:
hnsw_query/index_10000_topk/10
                        time:   [1.87 ms 1.92 ms 1.98 ms]
                        change: [-2.3% +0.5% +3.8%] (p = 0.65 > 0.05)
  • time: Median time with confidence interval
  • change: Performance change vs. previous run
  • p-value: Statistical significance

Performance Targets

VecLabs aims for the following performance characteristics:
Operation                      | Target | Measured (M2, 16GB)
Query p50 (100K vectors, 384d) | < 2ms  | 1.9ms
Query p95 (100K vectors, 384d) | < 3ms  | 2.8ms
Query p99 (100K vectors, 384d) | < 5ms  | 4.3ms
Insert 10K vectors             | < 10s  | ~8.2s
Cosine similarity (384d)       | < 10µs | ~6.3µs
All performance targets are currently met or exceeded on Apple M2 hardware.

Comparing with Other Vector Databases

VecLabs outperforms popular vector databases on both latency and cost:

vs. Pinecone s1

  • 4.2x faster at p50 (1.9ms vs ~8ms)
  • 5.4x faster at p95 (2.8ms vs ~15ms)
  • 8.8x cheaper ($8/mo vs $70/mo for 1M vectors)

vs. Qdrant

  • 2.1x faster at p50 (1.9ms vs ~4ms)
  • 3.2x faster at p95 (2.8ms vs ~9ms)
  • 3.1x cheaper ($8/mo vs $25+/mo)

vs. Weaviate

  • 6.3x faster at p50 (1.9ms vs ~12ms)
  • 8.9x faster at p95 (2.8ms vs ~25ms)
  • 3.1x cheaper ($8/mo vs $25+/mo)

Unique Features

  • Data ownership: Encrypted with your Solana wallet
  • Audit trail: On-chain Merkle roots
  • No GC latency: Pure Rust, no garbage collector
Full benchmark methodology is documented in the README. Comparison benchmarks are measured on equivalent hardware configurations.

Benchmark Parameters

The benchmark suite uses the following HNSW parameters:
let mut index = HNSWIndex::new(
    16,   // M: connections per layer
    200,  // ef_construction: candidates during insert
    DistanceMetric::Cosine
);
These parameters balance query speed, recall, and memory usage:
  • M = 16: Higher values improve recall but increase memory
  • ef_construction = 200: Higher values improve graph quality but slow inserts
  • Cosine distance: Most common metric for text embeddings
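The memory cost of M can be sketched with a rough back-of-envelope calculation. The layout assumed below (raw f32 components plus roughly 2·M neighbor links of 8 bytes each per node at the base layer) is an illustrative approximation, not VecLabs' actual memory model:

```rust
/// Rough (assumed) memory estimate for an HNSW index:
/// vector storage plus base-layer neighbor links.
fn approx_index_bytes(n: usize, dims: usize, m: usize) -> usize {
    let vector_bytes = n * dims * 4; // f32 components, 4 bytes each
    let link_bytes = n * m * 2 * 8;  // ~2*M neighbor ids per node, 8 bytes each
    vector_bytes + link_bytes
}

fn main() {
    // 100K vectors at 384 dimensions with M = 16
    let bytes = approx_index_bytes(100_000, 384, 16);
    println!("approx index size: {:.1} MiB", bytes as f64 / (1024.0 * 1024.0)); // ≈ 171 MiB
}
```

Under these assumptions, doubling M adds only the link term, which is small next to the vector data at 384 dimensions; at low dimensions the links become a larger share of total memory.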

Running Benchmarks on Different Hardware

To compare performance on your hardware:
1. Run baseline benchmarks

cargo bench --workspace
2. Save the baseline

cp -r target/criterion target/criterion-baseline
3. Make code changes

Edit the HNSW implementation or distance functions.
4. Compare performance

cargo bench --workspace
Criterion automatically compares against the previous run and shows regression/improvement.
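Criterion also supports named baselines through its standard CLI flags (passed to the benchmark binaries after `--`), which avoids copying `target/criterion` by hand:

```shell
# Save the current results under a named baseline
cargo bench --workspace -- --save-baseline before

# After making changes, compare against that baseline
cargo bench --workspace -- --baseline before
```

Named baselines are useful when iterating on several changes, since each run can be compared against the same fixed reference instead of the immediately preceding run.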

Interpreting Benchmark Plots

Criterion generates several types of plots:
  • Density plot: the distribution of benchmark times. A narrow peak indicates consistent performance; a wide or multi-modal distribution suggests variance.
  • Iteration times: mean execution time across iterations. This should be relatively flat; upward trends indicate warmup effects or memory pressure.
  • Raw measurements: a scatter plot of all samples. Outliers may indicate OS scheduling jitter (GC pauses do not apply to Rust).
  • Violin plot: side-by-side comparison of the current run against the previous baseline.

Benchmark Best Practices

For accurate benchmarks:
  • Close resource-intensive applications
  • Run on AC power (not battery)
  • Disable CPU frequency scaling if possible
  • Run multiple times and compare results

Custom Benchmarks

To add your own benchmarks:
  1. Create a new file in benchmarks/ or crates/solvec-core/benches/
  2. Follow the Criterion.rs API:
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn my_benchmark(c: &mut Criterion) {
    c.bench_function("my_operation", |b| {
        b.iter(|| {
            // Code to benchmark
            black_box(some_function());
        });
    });
}

criterion_group!(benches, my_benchmark);
criterion_main!(benches);
  3. Register in Cargo.toml:
[[bench]]
name = "my_benchmark"
harness = false
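The Criterion crate itself must also be available to the bench target; declaring it as a dev-dependency typically looks like the fragment below (the version and feature flag are illustrative):

```toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
```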

CI Benchmarks

Continuous benchmark tracking is planned for future releases. This will automatically detect performance regressions in pull requests.

Next Steps

Building from Source

Set up your development environment

Running Tests

Validate functionality with the test suite
