Overview
Zvec provides four primary index types for vector search:

| Index Type | Use Case | Search Speed | Memory | Accuracy | Training Required |
|---|---|---|---|---|---|
| HNSW | High-performance ANN | Very Fast | High | High (>95%) | No |
| IVF | Large-scale datasets | Fast | Medium | Medium-High | Yes |
| Flat | Small datasets, exact search | Medium | Low | 100% (exact) | No |
| Inverted | Text/keyword search | Very Fast | Low | 100% (exact) | No |
HNSW (Hierarchical Navigable Small World)
Overview
HNSW is a graph-based approximate nearest neighbor (ANN) index that provides excellent recall with fast search times. It constructs a multi-layer graph structure where each node represents a vector.
Configuration
Parameters
metric_type (MetricType)
Distance metric for similarity computation:
- MetricType.IP - Inner product (dot product)
- MetricType.L2 - Euclidean distance
- MetricType.COSINE - Cosine similarity
Default: MetricType.IP
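To make the three metrics concrete, here is a minimal pure-Python sketch of what each one computes. The function names are illustrative and are not part of the Zvec API; the index evaluates these internally.

```python
import math

def inner_product(a, b):
    # Dot product: larger means more similar.
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    # Euclidean distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Dot product of the two vectors after normalizing each to unit length.
    # Assumes neither vector is all zeros.
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

query = [1.0, 0.0]
doc = [3.0, 4.0]
print(inner_product(query, doc))      # 3.0
print(l2_distance(query, doc))        # about 4.472
print(cosine_similarity(query, doc))  # 0.6
```

Note that inner product and cosine similarity rank identically only when all vectors are normalized to unit length.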
m (int)
Number of bi-directional links created for each element during construction.
- Range: Typically 8-64
- Higher values:
  - Better recall (accuracy)
  - Increased memory usage (~4 * m * 4 bytes per vector)
  - Slower construction
- Lower values:
  - Faster construction
  - Lower memory footprint
  - May reduce recall
Default: 50
Recommended: 16-32 for most applications
ef_construction (int)
Size of the dynamic candidate list during index construction.
- Range: Typically 100-2000
- Higher values:
  - Better graph quality
  - Higher recall
  - Slower build time
- Lower values:
  - Faster construction
  - May impact recall
Default: 500
Recommended: At least 2 * m, often 200-500
quantize_type (QuantizeType)
Vector compression method. See Quantization for details.
Default: QuantizeType.UNDEFINED (no quantization)
Query-Time Parameters
Control search behavior with HnswQueryParam:
ef (int)
Size of the dynamic candidate list during search.
- Range: Typically topk to 1000+
- Higher values: Better recall, slower search
- Lower values: Faster search, lower recall
Default: 300
Recommendation: Set ef >= topk for good results
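The role of ef can be illustrated with a simplified best-first search over a single graph layer. This is a generic sketch of the technique, not Zvec's implementation; `search_layer`, the toy chain graph, and the squared-L2 distance are all assumptions made for illustration.

```python
import heapq

def search_layer(graph, vectors, query, entry, ef, topk):
    """Best-first search over one graph layer, keeping at most `ef` results."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))

    visited = {entry}
    candidates = [(dist(entry), entry)]  # min-heap: closest unexplored node first
    results = [(-dist(entry), entry)]    # max-heap (negated): worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break  # the closest candidate is already worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # evict the current worst result
    ranked = sorted((-neg_d, i) for neg_d, i in results)
    return [i for _, i in ranked[:topk]]

# Toy data: six 1-D points linked in a chain 0-1-2-3-4-5.
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

print(search_layer(graph, vectors, [4.2], entry=0, ef=3, topk=2))  # [4, 5]
print(search_layer(graph, vectors, [4.2], entry=0, ef=1, topk=2))  # [4]
```

With ef=1 the search can hold only one result, so it cannot return two neighbors even though topk=2. That is the reason for keeping ef at or above topk.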
Performance Tuning
Best Practices
- Balance m and ef_construction: Higher m requires higher ef_construction for optimal graph quality
- Query ef tuning: Start with ef = 2 * topk, adjust based on recall requirements
- Memory consideration: HNSW uses ~(4 + 4 * m) bytes per vector for graph structure
- No training required: Can insert vectors incrementally
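The rules of thumb above reduce to simple arithmetic. The helpers below are a hypothetical sketch, not Zvec functions, and use only the estimates quoted in this section: ~(4 + 4 * m) bytes per vector for the graph, ef starting at 2 * topk, and ef_construction of at least 2 * m.

```python
def hnsw_graph_bytes(num_vectors, m):
    # Graph overhead only, using the ~(4 + 4 * m) bytes-per-vector estimate.
    # The raw vector data is stored in addition to this.
    return num_vectors * (4 + 4 * m)

def starting_ef(topk):
    # Suggested starting point for query-time ef; raise it if recall is too low.
    return 2 * topk

def min_ef_construction(m):
    # Lower bound suggested for ef_construction relative to m.
    return 2 * m

print(hnsw_graph_bytes(1_000_000, 16))  # 68000000, i.e. roughly 68 MB
print(starting_ef(10))                  # 20
print(min_ef_construction(16))          # 32
```

These are starting values to measure against, not guarantees of recall or memory use.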
IVF (Inverted File Index)
Overview
IVF partitions the vector space into clusters using k-means. At query time, only the nearest clusters are searched, providing a speed-accuracy trade-off.
Configuration
Parameters
metric_type (MetricType)
Same as HNSW. See above.
n_list (int)
Number of clusters (inverted lists) to partition the dataset into.
- Auto mode (n_list=0): System selects based on data size
  - Typical formula: sqrt(N) where N is dataset size
- Manual mode: Set explicitly
  - Range: 10 to 100,000+
  - More clusters: Better accuracy, slower search
  - Fewer clusters: Faster search, lower accuracy
Default: 0 (auto)
Recommended: sqrt(N) for balanced performance
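The sqrt(N) rule of thumb can be computed directly. This helper is a hypothetical sketch, and the exact selection Zvec makes in auto mode may differ; the floor of 10 is taken from the manual range listed above.

```python
import math

def suggest_n_list(num_vectors, minimum=10):
    # sqrt(N) heuristic, clamped to the lower end of the manual range.
    return max(minimum, round(math.sqrt(num_vectors)))

print(suggest_n_list(10_000))     # 100
print(suggest_n_list(1_000_000))  # 1000
print(suggest_n_list(50))         # 10
```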
n_iters (int)
Number of iterations for k-means clustering during training.
- Range: 5-50
- Higher values: More stable centroids, longer training
- Lower values: Faster training, less stable clusters
Default: 10
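To show what n_iters controls, here is a minimal sketch of k-means training (Lloyd's algorithm) in pure Python. It illustrates the general technique only; `train_centroids`, the random initialization, and the empty-cluster handling are assumptions, not Zvec's training code.

```python
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_centroids(vectors, n_list, n_iters=10, seed=0):
    rng = random.Random(seed)
    # Start from n_list distinct training vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, n_list)]
    for _ in range(n_iters):
        # Assignment step: attach each vector to its nearest centroid.
        clusters = [[] for _ in range(n_list)]
        for v in vectors:
            nearest = min(range(n_list), key=lambda c: squared_l2(v, centroids[c]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[c] = [sum(v[d] for v in members) / len(members)
                                for d in range(dim)]
    return centroids

training = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
print(sorted(train_centroids(training, n_list=2, n_iters=10)))
```

Each extra iteration repeats the assign-and-update cycle, which is why higher n_iters gives more stable centroids at the cost of training time.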
use_soar (bool)
Enable SOAR (Scalable Optimized Adaptive Routing) for improved IVF search.
Default: False
Recommendation: Enable for large-scale datasets (>1M vectors)
Query-Time Parameters
nprobe (int)
Number of nearest clusters to probe during search.
- Range: 1 to n_list
- Higher values: Better recall, slower search
- Lower values: Faster search, lower recall
Default: 10
Recommendation: 5-20% of n_list for good recall
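The effect of nprobe can be seen in a small sketch of IVF search: rank the centroids, scan only the nprobe nearest lists, then rank the candidates found there. The functions and toy data are illustrative assumptions, not Zvec internals.

```python
import heapq

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_lists(vectors, centroids):
    # Assign every vector id to the inverted list of its nearest centroid.
    lists = [[] for _ in centroids]
    for idx, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(v, centroids[c]))
        lists[nearest].append(idx)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe, topk):
    # Pick the nprobe clusters whose centroids are closest to the query.
    probe = heapq.nsmallest(nprobe, range(len(centroids)),
                            key=lambda c: squared_l2(query, centroids[c]))
    # Scan only the vectors stored in those clusters.
    candidates = [i for c in probe for i in lists[c]]
    return heapq.nsmallest(topk, candidates, key=lambda i: squared_l2(query, vectors[i]))

vectors = [[1.0], [4.9], [5.2], [9.0]]
centroids = [[0.0], [10.0]]
lists = build_lists(vectors, centroids)  # [[0, 1], [2, 3]]

print(ivf_search([4.95], vectors, centroids, lists, nprobe=1, topk=2))  # [1, 0]
print(ivf_search([4.95], vectors, centroids, lists, nprobe=2, topk=2))  # [1, 2]
```

With nprobe=1 the true second neighbor (id 2) sits just across the cluster boundary and is missed; probing both clusters recovers it at the cost of scanning more vectors.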
Training Requirement
IVF requires training on representative data before use.
Performance Tuning
| Dataset Size | n_list | nprobe | Expected Recall |
|---|---|---|---|
| 10K | 100 | 10 | ~90% |
| 100K | 300 | 20 | ~92% |
| 1M | 1000 | 30 | ~93% |
| 10M | 4000 | 50 | ~94% |
Best Practices
- Training data: Use at least 1000-10000 representative vectors for training
- nprobe tuning: Start with nprobe = sqrt(n_list), adjust based on recall needs
- SOAR optimization: Enable for datasets >1M vectors
- Memory efficient: Lower memory footprint than HNSW for large datasets
Flat (Brute-Force)
Overview
Flat index performs exact nearest neighbor search by comparing the query vector against all vectors. It’s simple, accurate, and ideal for small datasets or baseline comparisons.
Configuration
Parameters
metric_type (MetricType)
Same as HNSW and IVF.
quantize_type (QuantizeType)
Optional quantization for memory reduction. See Quantization.
When to Use Flat
- Small datasets (under 10,000 vectors)
- Exact search required (100% recall)
- Baseline for benchmarking other index types
- Development and testing
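Exact search of the kind a Flat index performs amounts to scoring every stored vector against the query. The sketch below shows the idea in pure Python with squared L2 distance; `flat_search` is a hypothetical name, not a Zvec call.

```python
import heapq

def flat_search(query, vectors, topk):
    # Compare the query with every vector and keep the topk closest ids.
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))
    return heapq.nsmallest(topk, range(len(vectors)), key=dist)

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
print(flat_search([1.0, 1.0], vectors, topk=2))  # [1, 3]
```

The cost grows linearly with the number of vectors, which is why this approach suits small datasets.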
Performance Characteristics
Best Practices
- Use for under 10K vectors: Beyond this, consider HNSW or IVF
- Quantization: Apply FP16 or INT8 to reduce memory without indexing overhead
- Baseline testing: Always compare ANN indexes against Flat for recall validation
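Recall validation against a Flat baseline is a set comparison between the ids an ANN index returned and the exact ids. A minimal sketch, with `recall_at_k` as an illustrative helper name:

```python
def recall_at_k(approx_ids, exact_ids):
    # Fraction of the exact top-k results that the ANN index also returned.
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)

print(recall_at_k([1, 0], [1, 2]))        # 0.5
print(recall_at_k([7, 3, 9], [3, 7, 9]))  # 1.0
```

Averaging this value over a representative set of queries gives the recall figure to compare across index settings.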
Inverted (Text/Keyword Index)
Overview
Inverted index is designed for keyword and text search, not vector similarity. It supports efficient filtering and text queries.
Configuration
Parameters
enable_range_optimization (bool)
Enable optimization for range queries (e.g., age > 25 AND age < 50).
Default: False
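A common way to speed up range predicates such as age > 25 AND age < 50 is to keep values sorted and locate both bounds by binary search. The sketch below illustrates that general idea only; how Zvec implements enable_range_optimization is not described here, and `range_query` is a hypothetical helper.

```python
import bisect

def range_query(sorted_values, sorted_ids, low, high):
    """Return ids whose value satisfies low < value < high (both bounds exclusive)."""
    start = bisect.bisect_right(sorted_values, low)  # first position with value > low
    end = bisect.bisect_left(sorted_values, high)    # first position with value >= high
    return sorted_ids[start:end]

ages = {"a": 22, "b": 25, "c": 31, "d": 49, "e": 50}
pairs = sorted((age, doc_id) for doc_id, age in ages.items())
values = [age for age, _ in pairs]
ids = [doc_id for _, doc_id in pairs]

print(range_query(values, ids, 25, 50))  # ['c', 'd']
```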
enable_extended_wildcard (bool)
Enable extended wildcard search including suffix and infix patterns.
- False: Only prefix search (e.g., "app*")
- True: Suffix and infix search (e.g., "*ple", "*pp*")
Default: False
Note: Extended wildcards increase index size
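The three pattern shapes can be illustrated with Python's standard fnmatch module. This shows only what prefix, suffix, and infix patterns match; it says nothing about how Zvec evaluates them, and the term list is made up for the example.

```python
from fnmatch import fnmatchcase

terms = ["apple", "application", "maple", "grape"]

def match_terms(pattern, terms):
    # Case-sensitive wildcard match where * stands for any run of characters.
    return [t for t in terms if fnmatchcase(t, pattern)]

print(match_terms("app*", terms))  # prefix: ['apple', 'application']
print(match_terms("*ple", terms))  # suffix: ['apple', 'maple']
print(match_terms("*pp*", terms))  # infix:  ['apple', 'application']
```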
Use Cases
- Hybrid search: Combine with vector indexes for filtered retrieval
- Metadata filtering: Efficient WHERE clause execution
- Text search: Keyword and phrase matching
Example
Index Selection Guide
Comparison Table
| Feature | HNSW | IVF | Flat | Inverted |
|---|---|---|---|---|
| Search Type | ANN | ANN | Exact | Exact (text) |
| Recall | 95-99% | 90-95% | 100% | 100% |
| QPS (1M vectors) | 10K-50K | 5K-20K | 100-500 | 50K+ |
| Memory | High | Medium | Low | Low |
| Build Time | Fast | Medium | Instant | Fast |
| Training | No | Yes | No | No |
| Incremental Insert | Yes | Limited | Yes | Yes |
| Best For | Production | Large-scale | Small data | Filtering |
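The table can be read as a rough decision procedure. The function below is a hypothetical sketch: the 10,000-vector threshold comes from the Flat guidance earlier on this page, while the choice between IVF and HNSW based on memory pressure is a simplifying assumption rather than a rule from Zvec.

```python
def suggest_index(num_vectors, exact_required=False,
                  memory_constrained=False, keyword_field=False):
    if keyword_field:
        return "Inverted"  # text and metadata filtering, not vector similarity
    if exact_required or num_vectors < 10_000:
        return "Flat"      # exact results; cost grows linearly with data size
    if memory_constrained:
        return "IVF"       # lower memory than HNSW, but needs training
    return "HNSW"          # high recall and fast search at higher memory cost

print(suggest_index(5_000))                              # Flat
print(suggest_index(2_000_000))                          # HNSW
print(suggest_index(50_000_000, memory_constrained=True))  # IVF
```

Treat the result as a starting point and confirm it by measuring recall and latency on real data.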
See Also
- Quantization - Reduce memory usage
- Performance Tuning - Optimization strategies
- API Reference - Complete parameter documentation