
Overview

VecLabs includes a comprehensive benchmark suite built with Criterion.rs, a statistical benchmarking framework for Rust. Benchmarks measure performance across HNSW operations, distance functions, and various vector dimensions.
The published benchmark results show p50 = 1.9ms, p95 = 2.8ms, p99 = 4.3ms for queries on 100K vectors at 384 dimensions.

Running Benchmarks

Benchmarks are located in the workspace and can be run with Cargo.
1. Run all benchmarks

cargo bench --workspace
This runs the complete benchmark suite including HNSW operations and distance functions.
2. Run a specific benchmark

To run only HNSW benchmarks:
cargo bench --bench hnsw_bench
To run only distance function benchmarks:
cargo bench --bench distance_bench
3. View benchmark reports

Criterion generates detailed HTML reports in target/criterion/:
open target/criterion/report/index.html

Benchmark Suites

HNSW Operations

The hnsw_bench.rs benchmark measures core HNSW operations:

Insert Performance

Measures insertion time for building indices of 1K and 10K vectors

Query Performance

Measures query latency with top-K values of 1, 10, and 100

Multi-dimensional Query

Tests query performance across 128, 384, 768, and 1536 dimensions

Index Size Scaling

Validates query performance with 10K vector indices

Distance Functions

The distance_bench.rs benchmark measures raw distance calculation performance:
  • Cosine similarity: Used for most embedding models
  • Euclidean distance: L2 distance for spatial data
  • Dot product: Fast inner product computation
Each function is benchmarked across 128, 384, 768, and 1536 dimensions.
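VecLabs' own distance kernels are not reproduced here, but a scalar reference implementation of cosine similarity (a sketch for illustration, not the library's code) shows what each benchmark iteration measures:

```rust
/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (|a| * |b|).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0f32, 0.0, 0.0];
    let b = [0.0f32, 1.0, 0.0];
    println!("cos(a, a) = {}", cosine_similarity(&a, &a)); // identical vectors → 1
    println!("cos(a, b) = {}", cosine_similarity(&a, &b)); // orthogonal vectors → 0
}
```

A production kernel would typically be vectorized (SIMD) rather than written as a scalar loop like this, which is exactly the kind of difference the per-dimension benchmarks make visible.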

Benchmark Configuration

Benchmarks are configured in Cargo.toml:
[[bench]]
name = "hnsw_bench"
harness = false

[[bench]]
name = "distance_bench"
harness = false
The harness = false setting tells Cargo to skip the default libtest harness so that Criterion can supply its own main function.

Understanding Results

Criterion provides statistical analysis of benchmark results:
hnsw_query/index_10000_topk/10
                        time:   [1.87 ms 1.92 ms 1.98 ms]
                        change: [-2.3% +0.5% +3.8%] (p = 0.65 > 0.05)
  • time: Median time with confidence interval
  • change: Performance change vs. previous run
  • p-value: Statistical significance

Performance Targets

VecLabs aims for the following performance characteristics:
Operation                      | Target | Measured (M2, 16GB)
Query p50 (100K vectors, 384d) | < 2ms  | 1.9ms
Query p95 (100K vectors, 384d) | < 3ms  | 2.8ms
Query p99 (100K vectors, 384d) | < 5ms  | 4.3ms
Insert 10K vectors             | < 10s  | ~8.2s
Cosine similarity (384d)       | < 10µs | ~6.3µs
All performance targets are currently met or exceeded on Apple M2 hardware.

Comparing with Other Vector Databases

VecLabs outperforms popular vector databases on both latency and cost:

vs. Pinecone s1

  • 4.2x faster at p50 (1.9ms vs ~8ms)
  • 5.4x faster at p95 (2.8ms vs ~15ms)
  • 8.8x cheaper ($8/mo vs $70/mo for 1M vectors)

vs. Qdrant

  • 2.1x faster at p50 (1.9ms vs ~4ms)
  • 3.2x faster at p95 (2.8ms vs ~9ms)
  • 3.1x cheaper ($8/mo vs $25+/mo)

vs. Weaviate

  • 6.3x faster at p50 (1.9ms vs ~12ms)
  • 8.9x faster at p95 (2.8ms vs ~25ms)
  • 3.1x cheaper ($8/mo vs $25+/mo)

Unique Features

  • Data ownership: Encrypted with your Solana wallet
  • Audit trail: On-chain Merkle roots
  • No GC latency: Pure Rust, no garbage collector
Full benchmark methodology is documented in the README. Comparison benchmarks are measured on equivalent hardware configurations.

Benchmark Parameters

The benchmark suite uses the following HNSW parameters:
let mut index = HNSWIndex::new(
    16,   // M: connections per layer
    200,  // ef_construction: candidates during insert
    DistanceMetric::Cosine
);
These parameters balance query speed, recall, and memory usage:
  • M = 16: Higher values improve recall but increase memory
  • ef_construction = 200: Higher values improve graph quality but slow inserts
  • Cosine distance: Most common metric for text embeddings
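The memory cost of M can be sketched with a rough back-of-envelope calculation. The layout assumed below (raw f32 components plus roughly 2·M neighbor links of 8 bytes each per node at the base layer) is an illustrative approximation, not VecLabs' actual memory model:

```rust
/// Rough (assumed) memory estimate for an HNSW index:
/// vector storage plus base-layer neighbor links.
fn approx_index_bytes(n: usize, dims: usize, m: usize) -> usize {
    let vector_bytes = n * dims * 4; // f32 components, 4 bytes each
    let link_bytes = n * m * 2 * 8;  // ~2*M neighbor ids per node, 8 bytes each
    vector_bytes + link_bytes
}

fn main() {
    // 100K vectors at 384 dimensions with M = 16
    let bytes = approx_index_bytes(100_000, 384, 16);
    println!("approx index size: {:.1} MiB", bytes as f64 / (1024.0 * 1024.0)); // ≈ 171 MiB
}
```

Under these assumptions, doubling M adds only the link term, which is small next to the vector data at 384 dimensions; at low dimensions the links become a larger share of total memory.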

Running Benchmarks on Different Hardware

To compare performance on your hardware:
1. Run baseline benchmarks

cargo bench --workspace
2. Save the baseline

cp -r target/criterion target/criterion-baseline
3. Make code changes

Edit the HNSW implementation or distance functions.
4. Compare performance

cargo bench --workspace
Criterion automatically compares against the previous run and shows regression/improvement.
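Criterion also supports named baselines through its standard CLI flags (passed to the benchmark binaries after `--`), which avoids copying `target/criterion` by hand:

```shell
# Save the current results under a named baseline
cargo bench --workspace -- --save-baseline before

# After making changes, compare against that baseline
cargo bench --workspace -- --baseline before
```

Named baselines are useful when iterating on several changes, since each run can be compared against the same fixed reference instead of the immediately preceding run.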

Interpreting Benchmark Plots

Criterion generates several types of plots:
  • Density plot: the distribution of benchmark times. A narrow peak indicates consistent performance; a wide or multi-modal distribution suggests variance.
  • Iteration times: mean execution time across iterations. This should be relatively flat; upward trends indicate warmup effects or memory pressure.
  • Raw measurements: a scatter plot of all samples. Outliers may indicate OS scheduling jitter (GC pauses do not apply to Rust).
  • Violin plot: side-by-side comparison of the current run against the previous baseline.

Benchmark Best Practices

For accurate benchmarks:
  • Close resource-intensive applications
  • Run on AC power (not battery)
  • Disable CPU frequency scaling if possible
  • Run multiple times and compare results

Custom Benchmarks

To add your own benchmarks:
  1. Create a new file in benchmarks/ or crates/solvec-core/benches/
  2. Follow the Criterion.rs API:
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn my_benchmark(c: &mut Criterion) {
    c.bench_function("my_operation", |b| {
        b.iter(|| {
            // Code to benchmark
            black_box(some_function());
        });
    });
}

criterion_group!(benches, my_benchmark);
criterion_main!(benches);
  3. Register in Cargo.toml:
[[bench]]
name = "my_benchmark"
harness = false
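The Criterion crate itself must also be available to the bench target; declaring it as a dev-dependency typically looks like the fragment below (the version and feature flag are illustrative):

```toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
```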

CI Benchmarks

Continuous benchmark tracking is planned for future releases. This will automatically detect performance regressions in pull requests.

Next Steps

Building from Source

Set up your development environment

Running Tests

Validate functionality with the test suite
