Overview
IndexType defines the indexing algorithm used for efficient vector search. Different index types offer various trade-offs between search speed, accuracy, memory usage, and build time.
Available Index Types
No index specified. Uses default indexing behavior.
When to use: Let Zvec choose the appropriate index automatically.
Flat index (brute-force search). Exhaustive search with 100% recall.
Characteristics:
- Perfect recall (100%)
- Linear search time O(n)
- No index build time
- Minimal memory overhead
When to use:
- Small datasets (< 10K vectors)
- When perfect recall is required
- Baseline benchmarking
- Development and testing
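To make the "exhaustive search" idea concrete, here is a minimal standard-library sketch of what a flat index does conceptually. It is not Zvec's implementation; the function name `flat_search` and the choice of Euclidean distance are assumptions made for illustration.

```python
import math

def flat_search(vectors, query, k=3):
    """Exhaustive k-nearest-neighbor search by Euclidean distance (illustrative only)."""
    # Score every stored vector: this full scan is the O(n) cost of a flat index.
    scored = [(math.dist(vec, query), idx) for idx, vec in enumerate(vectors)]
    # Sort by distance; the index acts as a deterministic tie-breaker.
    scored.sort()
    return [idx for _, idx in scored[:k]]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = flat_search(vectors, [1.0, 1.0], k=2)
print(nearest)
```

Because every vector is compared with the query, no true neighbor can be missed, which is why a flat index serves as the recall baseline for approximate indexes.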
Hierarchical Navigable Small World. Graph-based approximate nearest neighbor index.
Characteristics:
- Very fast search (logarithmic)
- High recall (> 95% typical)
- Higher memory usage
- Fast index build
- Good for general-purpose use
When to use:
- Medium to large datasets (10K - 100M+ vectors)
- When fast queries are critical
- General-purpose vector search
- Most production deployments
Inverted File Index. Clustering-based approximate search.
Characteristics:
- Fast search with tunable accuracy
- Lower memory than HNSW
- Longer index build time
- Good for very large datasets
When to use:
- Very large datasets (100M+ vectors)
- Memory-constrained environments
- When recall can be traded for speed
- Batch processing scenarios
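The recall-for-speed trade-off of a clustering-based index can be sketched in a few lines. The example below is a toy illustration, not Zvec's implementation: the centroids are fixed by hand (a real index would learn them, for example with k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` as a function argument are assumptions for this sketch.

```python
import math

def build_ivf(vectors, centroids):
    """Assign each vector index to its nearest centroid (one inverted list per centroid)."""
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))
        lists[nearest].append(idx)
    return lists

def ivf_search(vectors, centroids, lists, query, k=1, nprobe=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    candidates = [idx for c in order[:nprobe] for idx in lists[c]]
    candidates.sort(key=lambda idx: (math.dist(vectors[idx], query), idx))
    return candidates[:k]

# Hand-picked centroids, purely for illustration.
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.0, 1.0], [1.0, 0.0], [4.9, 4.9], [9.0, 9.0]]
lists = build_ivf(vectors, centroids)

query = [6.0, 6.0]
narrow = ivf_search(vectors, centroids, lists, query, k=1, nprobe=1)
wide = ivf_search(vectors, centroids, lists, query, k=1, nprobe=2)
print(narrow, wide)
```

With `nprobe=1` the search visits only the cluster nearest the query and misses the true nearest neighbor, which sits in the other cluster; probing both clusters finds it at the cost of scanning more vectors.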
Inverted index. Optimized for sparse vectors (e.g., BM25).
Characteristics:
- Designed for sparse vectors
- Fast keyword-style search
- Memory-efficient for sparse data
- Supports hybrid search
When to use:
- Sparse vector fields
- BM25 or TF-IDF embeddings
- Keyword search
- Hybrid dense-sparse search
Index Properties
All IndexType enum members have these properties:
The name of the index type as a string.
The internal integer value of the index type.
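As a rough illustration of these two properties, the sketch below uses a stand-in enum built on Python's standard `enum` module. It does not import Zvec: the class name `IndexTypeSketch`, the property names `name` and `value`, and the integer values are all assumptions, and the real enum may differ.

```python
from enum import IntEnum

class IndexTypeSketch(IntEnum):
    """Stand-in for an index type enum; integer values are hypothetical."""
    FLAT = 1
    HNSW = 2
    IVF = 3
    INVERT = 4

member = IndexTypeSketch.HNSW
print(member.name)   # the member's name as a string
print(member.value)  # the member's integer value
```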
Usage Examples
Basic Index Definition
Multi-Field with Different Index Types
Index Comparison
| Index Type | Build Time | Query Speed | Memory | Recall |
|---|---|---|---|---|
| FLAT | Instant | Slow (linear) | Minimal | 100% |
| HNSW | Fast | Very fast | High | > 95% |
| IVF | Slow | Fast | Medium | 90-95% |
| INVERT | Fast | Fast | Low (sparse) | 100% |
Performance varies based on dataset size, dimensionality, and configuration.
Choosing the Right Index
Determine Your Dataset Size
| Vectors | Recommended Index |
|---|---|
| < 10K | FLAT |
| 10K - 10M | HNSW |
| 10M - 100M | HNSW or IVF |
| > 100M | IVF |
| Sparse | INVERT |
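The table above can be encoded as a small lookup helper. This is a sketch only: `recommend_index` is a hypothetical function rather than part of Zvec, it returns plain strings instead of enum members, and the handling of the exact boundary values (10K, 10M, 100M) is an assumption because the table's ranges share endpoints.

```python
def recommend_index(num_vectors, sparse=False):
    """Return the index type name(s) suggested by the dataset-size table."""
    if sparse:
        return ["INVERT"]          # sparse vectors, regardless of count
    if num_vectors < 10_000:
        return ["FLAT"]            # < 10K
    if num_vectors <= 10_000_000:
        return ["HNSW"]            # 10K - 10M
    if num_vectors <= 100_000_000:
        return ["HNSW", "IVF"]     # 10M - 100M: either is reasonable
    return ["IVF"]                 # > 100M

for n in (5_000, 2_000_000, 50_000_000, 500_000_000):
    print(n, recommend_index(n))
```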
Consider Your Requirements
Choose based on priorities (best option first):
- Speed priority: HNSW > IVF > FLAT
- Memory priority: FLAT > IVF > HNSW
- Accuracy priority: FLAT > HNSW > IVF
- Build time priority: FLAT > HNSW > IVF
Index Configuration
Each index type can be tuned with additional parameters:
HNSW Parameters
IVF Parameters
Hybrid Search with Multiple Indexes
Best Practices
Performance Tips
Optimizing HNSW
- Increase `m` for better recall (use 16-64)
- Increase `ef_construction` for better index quality (use 100-500)
- Increase `ef_search` for better query recall (use 32-128)
- Monitor memory usage and build time as you raise these values; a larger `m` in particular increases index size
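One way to keep these settings within the suggested ranges is a small sanity check run before building an index. The sketch below is hypothetical: `SUGGESTED_RANGES` and `out_of_range` are not Zvec APIs, and the ranges are simply copied from the tips above.

```python
# Ranges taken from the tuning tips above; not enforced by any library.
SUGGESTED_RANGES = {
    "m": (16, 64),
    "ef_construction": (100, 500),
    "ef_search": (32, 128),
}

def out_of_range(params):
    """Return the names of parameters that fall outside the suggested ranges."""
    flagged = []
    for name, value in params.items():
        low, high = SUGGESTED_RANGES[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged

flagged = out_of_range({"m": 8, "ef_construction": 200, "ef_search": 256})
print(flagged)
```

Values outside these ranges are not necessarily wrong; the check only flags settings that deserve a second look.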
Optimizing IVF
- Set `nlist = sqrt(num_vectors)` as a starting point
- Increase `nprobe` for better recall (1-50)
- Use quantization (INT8, INT4) to reduce memory
- Pre-train clusters on representative data
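The square-root rule of thumb for `nlist` is easy to compute. The helper below is a sketch; `starting_nlist` is a hypothetical name, and rounding to the nearest integer is an assumption since the tip does not say how to round.

```python
import math

def starting_nlist(num_vectors):
    """Rule-of-thumb starting cluster count: nlist = sqrt(num_vectors)."""
    if num_vectors < 1:
        raise ValueError("num_vectors must be positive")
    return max(1, round(math.sqrt(num_vectors)))

for n in (1_000_000, 100_000_000):
    print(n, starting_nlist(n))
```

Treat the result as a starting point for experiments and adjust it together with `nprobe` while measuring recall against a FLAT baseline.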
Optimizing FLAT
- Use only for small datasets (< 10K)
- Consider GPU acceleration for larger FLAT searches
- Use as baseline to measure ANN index quality
See Also
- Field Definition - Schema field configuration
- DataType - Vector data types
- MetricType - Distance metrics
- QuantizeType - Vector quantization
- Performance Tuning - Optimization strategies