Performance at Scale

Zvec is built on Proxima, Alibaba’s battle-tested vector search engine, and provides low-latency similarity search suited to demanding production workloads.

Key Performance Characteristics

Speed

  • Searches billions of vectors in milliseconds
  • Optimized SIMD operations for faster vector computations
  • Memory-mapped file I/O for efficient data access
  • Multi-threaded query processing

Efficiency

  • Low memory footprint - In-process design eliminates network overhead
  • Minimal setup - No separate server processes or complex configurations
  • Direct memory access - Native C++ core with zero-copy operations

Benchmark Methodology

Our benchmarks measure three key metrics:

QPS

Queries per second - throughput under concurrent load

Latency

Response time including p50, p90, p95, and p99 percentiles

Recall

Accuracy of results compared to exact nearest neighbor search
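
As a rough illustration of the recall metric, the sketch below compares the IDs returned by an approximate search with those from an exact search for the same query. The function name `recall_at_k` and the sample IDs are illustrative assumptions and are not part of the Zvec API.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    found = exact_top.intersection(approx_ids[:k])
    return len(found) / len(exact_top)


# Hypothetical result IDs for a single query.
exact = ["doc_1", "doc_7", "doc_3", "doc_9"]
approx = ["doc_1", "doc_3", "doc_8", "doc_7"]
print(recall_at_k(approx, exact, k=4))  # 0.75
```

In a benchmark run, this value would typically be averaged over all queries.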

Test Configurations

Benchmarks are conducted using the following configurations:
import zvec

# Standard benchmark configuration
schema = zvec.CollectionSchema(
    name="benchmark",
    vectors=zvec.VectorSchema(
        "embedding", 
        zvec.DataType.VECTOR_FP32, 
        dimension=768
    ),
)

# HNSW index parameters for optimal performance
index_params = zvec.HnswIndexParams(
    metric_type=zvec.MetricType.IP,  # Inner Product
    m=16,  # Max connections per layer
    ef_construction=200  # Build-time quality
)
All benchmarks are run on consistent hardware: AWS c6i.4xlarge instances (16 vCPUs, 32 GB RAM) with NVMe SSD storage.

Benchmark Results

Query Performance

The benchmark tracks query processing metrics:
  • Total queries processed
  • Average latency - Mean response time per query
  • QPS - Queries per second throughput
  • Percentile latencies - p25, p50, p75, p90, p95, p99
  • Min/Max latency - Best and worst case performance
Example benchmark output from tools/core/bench_result.h:
Process query: 10000, total process time: 5420ms, duration: 5500ms, max: 12ms, min: 0ms
Avg latency: 0.5ms qps: 1818.2
25 Percentile:    0.3 ms
50 Percentile:    0.4 ms
75 Percentile:    0.6 ms
90 Percentile:    0.9 ms
95 Percentile:    1.2 ms
99 Percentile:    2.1 ms
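
The figures in the sample output relate to each other in a simple way: QPS is the query count divided by the wall-clock duration (10000 / 5.5 s ≈ 1818.2), and the average latency is the total process time divided by the query count (5420 ms / 10000 ≈ 0.5 ms). The sketch below shows one way to derive such a summary from raw per-query latencies. It uses the nearest-rank percentile definition, which is an assumption; this page does not show how bench_result.h computes percentiles, and the function names are illustrative.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (integer pct in 1..100)."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms, wall_clock_ms):
    """Aggregate per-query latencies into throughput and latency statistics."""
    ordered = sorted(latencies_ms)
    count = len(ordered)
    return {
        "queries": count,
        "avg_ms": sum(ordered) / count,
        "qps": count / (wall_clock_ms / 1000),
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        "percentiles_ms": {p: percentile(ordered, p) for p in (25, 50, 75, 90, 95, 99)},
    }


# Cross-check against the sample output: 10000 queries in a 5500 ms run.
print(round(10000 / (5500 / 1000), 1))  # 1818.2
print(5420 / 10000)                     # 0.542
```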

Dataset Sizes

Zvec has been tested with datasets in the following size ranges:
  • Small: 10K - 100K vectors
  • Medium: 100K - 1M vectors
  • Large: 1M - 10M vectors
  • X-Large: 10M+ vectors (billions supported)
For the most up-to-date benchmark results and detailed methodology, visit the Benchmarks documentation on our website.

Comparison with Other Vector Databases

As an in-process vector database, Zvec offers unique advantages:
| Feature     | Zvec                         | Client-Server DBs        |
|-------------|------------------------------|--------------------------|
| Deployment  | Embeds directly in your app  | Requires separate server |
| Latency     | Sub-millisecond (no network) | Network overhead added   |
| Setup       | Install and go               | Configuration required   |
| Scalability | Scales with your app         | Horizontal scaling       |
| Use Case    | Edge, notebooks, apps        | Centralized services     |
Benchmark results vary based on hardware, dataset characteristics, query patterns, and index configuration. Always benchmark with your own data and workload.

Running Your Own Benchmarks

To benchmark Zvec with your own data:

1. Build the C++ benchmarking tool

git clone --recursive https://github.com/alibaba/zvec.git
cd zvec
pip install -e ".[dev]"

2. Prepare your dataset

Use standard benchmark datasets or your own vectors.
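
If no real dataset is at hand, a synthetic one can be enough to validate the setup. The following is a minimal sketch, assuming random unit-length vectors and inner-product scoring; it also computes exact top-k neighbors by brute force, which can serve as ground truth when measuring recall. The helper names, the small dimension (32), and the dataset sizes are placeholders chosen to keep the example fast, not recommended settings.

```python
import random


def random_unit_vectors(count, dimension, seed=0):
    """Generate `count` random vectors scaled to unit length."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(count):
        raw = [rng.gauss(0.0, 1.0) for _ in range(dimension)]
        norm = sum(x * x for x in raw) ** 0.5
        vectors.append([x / norm for x in raw])
    return vectors


def exact_top_k(query, vectors, k):
    """Brute-force top-k by inner product; returns indices, best match first."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]


base = random_unit_vectors(count=1000, dimension=32)
queries = random_unit_vectors(count=5, dimension=32, seed=1)
ground_truth = [exact_top_k(q, base, k=10) for q in queries]
print(len(ground_truth), len(ground_truth[0]))  # 5 10
```

Brute-force search is quadratic in practice (every query scans every vector), so it is only practical for the smaller dataset sizes.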

3. Configure test parameters

Adjust:
  • Vector dimension and data type
  • Index type (HNSW, IVF, Flat)
  • Query concurrency
  • Recall vs. speed tradeoff (ef_search)
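
One way to keep these parameters organized is to expand them into an explicit grid of runs. The sketch below is a hypothetical structure, not part of Zvec: `BenchmarkRun` and `build_grid` are illustrative names, and the `ef_search` and concurrency values are arbitrary examples rather than recommendations. Note that `ef_search` applies to HNSW indexes, so other index types would need their own tuning field.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BenchmarkRun:
    """One combination of settings to benchmark (hypothetical structure)."""
    index_type: str
    dimension: int
    ef_search: int
    concurrency: int


def build_grid(index_types, dimensions, ef_search_values, concurrency_levels):
    """Expand the option lists into every combination to run."""
    return [
        BenchmarkRun(index_type, dimension, ef_search, concurrency)
        for index_type, dimension, ef_search, concurrency in product(
            index_types, dimensions, ef_search_values, concurrency_levels
        )
    ]


grid = build_grid(["HNSW"], [768], [32, 64, 128], [1, 8, 16])
for run in grid:
    print(run)
```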

4. Run benchmarks

# Example: Run C++ core benchmarks
python -m pytest python/tests/ -v --benchmark
Start with small datasets to validate your setup before scaling to production-size data.
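
For a quick measurement from Python, a small timing harness can complement the built-in tooling. This is a sketch under stated assumptions: `fake_search` is a placeholder that stands in for a real search call, and the harness itself is not part of Zvec. Thread-based concurrency only increases throughput if the underlying search call releases Python's global interpreter lock, which this page does not state.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(search_fn, query):
    """Run one query and return its latency in milliseconds."""
    start = time.perf_counter()
    search_fn(query)
    return (time.perf_counter() - start) * 1000


def run_benchmark(search_fn, queries, concurrency=4):
    """Issue all queries from a thread pool; return (latencies_ms, qps)."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda q: timed_call(search_fn, q), queries))
    wall_seconds = time.perf_counter() - wall_start
    return latencies, len(queries) / wall_seconds


def fake_search(query):
    """Placeholder that stands in for a real search call."""
    time.sleep(0.001)
    return []


latencies, qps = run_benchmark(fake_search, queries=list(range(200)), concurrency=8)
print(f"avg latency: {sum(latencies) / len(latencies):.2f} ms, qps: {qps:.1f}")
```

To use it with real data, replace `fake_search` with a function that runs one query against your collection.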

Performance Tuning Tips

For optimal benchmark results:
  1. Index Selection - HNSW for speed, IVF for memory efficiency
  2. Parameter Tuning - Balance ef_construction and ef_search
  3. Hardware - Use SSD storage and sufficient RAM
  4. Query Patterns - Batch queries when possible
  5. Vector Preprocessing - Normalize vectors for IP distance
See the Performance Tuning guide for detailed optimization strategies.
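
The normalization tip follows from a standard identity: for unit-length vectors, the inner product equals the cosine similarity, so IP scores are no longer influenced by vector magnitude. The sketch below demonstrates this in plain Python; the helper names are illustrative and not part of the Zvec API.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are rejected."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a, b = [3.0, 4.0], [4.0, 3.0]
print(inner_product(a, b))                              # 24.0 (magnitude-dependent)
print(inner_product(l2_normalize(a), l2_normalize(b)))  # approximately 0.96
print(cosine_similarity(a, b))                          # approximately 0.96
```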

Community Benchmarks

We welcome community-contributed benchmarks! For benchmark-related issues or questions, please open an issue using our benchmark issue template.
