Overview
The InvertIndexParam class configures an inverted index (also called an invert index) for sparse vector indexing and text search. Unlike dense vector indexes (HNSW, IVF, Flat), inverted indexes are optimized for the sparse vectors commonly used in keyword search, BM25, and hybrid retrieval.
Constructor
Parameters
enable_range_optimization
Whether to enable range query optimization for the inverted index. When enabled, range queries (e.g., finding documents with values in a specific range) are served by specialized data structures, which can significantly improve performance for numeric range queries.
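The page does not say which data structures back range optimization. As a rough illustration only, the sketch below shows one common approach: keep values sorted and answer a range query with two binary searches instead of a full scan. The class name `SortedRangeIndex` and the sample data are hypothetical and are not part of the library.

```python
from bisect import bisect_left, bisect_right


class SortedRangeIndex:
    """Hypothetical helper: numeric values kept sorted for fast range lookups."""

    def __init__(self, values_by_doc):
        # Sort once at build time; queries then avoid scanning every document.
        pairs = sorted((value, doc_id) for doc_id, value in values_by_doc.items())
        self._values = [value for value, _ in pairs]
        self._doc_ids = [doc_id for _, doc_id in pairs]

    def query(self, low, high):
        """Return doc ids whose value lies in the closed interval [low, high]."""
        start = bisect_left(self._values, low)
        end = bisect_right(self._values, high)
        return self._doc_ids[start:end]


# Illustrative data only.
prices = {"doc1": 15.0, "doc2": 42.5, "doc3": 7.25, "doc4": 30.0}
range_index = SortedRangeIndex(prices)
in_range = range_index.query(10, 35)
print(in_range)  # ['doc1', 'doc4']
```

The extra sorted copy of the values is the kind of structure that trades index size for query speed.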
enable_extended_wildcard
Whether to enable extended wildcard search, including suffix and infix patterns.
Wildcard patterns:
- Prefix search (e.g., "test*") is always enabled regardless of this setting
- Suffix search (e.g., "*test") requires this setting to be enabled
- Infix search (e.g., "*test*") requires this setting to be enabled
Notes:
- Enabling this increases index size and build time
- Provides more flexible text search capabilities
- Recommended for text search and fuzzy matching use cases
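To make the three pattern kinds concrete, here is a small sketch that classifies a pattern string and reports whether it would need enable_extended_wildcard according to the rules listed above. The helper names are hypothetical, and only the three forms named on this page are handled; anything else is reported as "other".

```python
def classify_wildcard(pattern: str) -> str:
    """Classify a pattern as 'prefix', 'suffix', 'infix', or 'other'."""
    starts = pattern.startswith("*")
    ends = pattern.endswith("*")
    if starts and ends and len(pattern) > 2:
        return "infix"   # "*test*"
    if starts and not ends:
        return "suffix"  # "*test"
    if ends and not starts:
        return "prefix"  # "test*"
    return "other"       # exact terms, or forms not covered on this page


def requires_extended_wildcard(pattern: str) -> bool:
    """Suffix and infix patterns need the extended setting; prefix does not."""
    return classify_wildcard(pattern) in ("suffix", "infix")


for p in ("test*", "*test", "*test*"):
    print(p, classify_wildcard(p), requires_extended_wildcard(p))
```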
Properties
enable_range_optimization
Whether range optimization is enabled for this inverted index. Type: bool
enable_extended_wildcard
Whether extended wildcard (suffix and infix) search is enabled. Type: bool
Methods
to_dict()
Convert the index parameters to a dictionary representation. Returns: dict - Dictionary with all index parameter fields.
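The import path for InvertIndexParam is not shown on this page, so the sketch below uses a stand-in dataclass, `InvertIndexParamSketch`, that mirrors only the two documented fields and the to_dict() method. It is an assumption about the general shape, not the library's implementation; the real to_dict() may return additional keys.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InvertIndexParamSketch:
    """Stand-in for illustration only; not the library class."""

    enable_range_optimization: bool
    enable_extended_wildcard: bool

    def to_dict(self) -> dict:
        # Assumed behaviour: one dictionary entry per parameter field.
        return asdict(self)


param = InvertIndexParamSketch(
    enable_range_optimization=True,
    enable_extended_wildcard=False,
)
print(param.to_dict())
# {'enable_range_optimization': True, 'enable_extended_wildcard': False}
```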
Examples
Basic inverted index
Enable range optimization
Enable extended wildcard search
Full-featured text search index
Using with a collection
Use Cases
1. Sparse Vector Search
Inverted indexes are optimized for sparse vectors used in information retrieval.
2. Keyword Search
Enable wildcard patterns for flexible text matching.
3. Hybrid Retrieval
Combine dense and sparse vector search.
4. Document Filtering
Use range optimization for numeric filters.
Performance Characteristics
Time Complexity
- Search time: O(k * log(n))
  - k = number of non-zero dimensions in query
  - n = number of documents
- Index build time: O(m * d)
  - m = number of documents
  - d = average number of non-zero dimensions
- Insert time: O(d * log(n))
  - d = number of non-zero dimensions
Space Complexity
- Memory: Proportional to total number of non-zero entries across all documents
- Much more efficient than dense indexes for sparse data
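To illustrate why memory tracks the number of non-zero entries, here is a minimal, generic inverted index over sparse vectors: one posting list per dimension, and a search that only touches the dimensions present in the query. It is a teaching sketch with hypothetical names, not the library's implementation, and it does not attempt to reproduce the complexity bounds listed above.

```python
from collections import defaultdict


class SparseInvertedIndex:
    """Toy inverted index: dimension -> list of (doc_id, value) postings."""

    def __init__(self):
        self.postings = defaultdict(list)
        self.num_entries = 0  # total stored postings, i.e. non-zero values

    def add(self, doc_id, vector):
        # Only non-zero dimensions are stored, so memory tracks sparsity.
        for dim, value in vector.items():
            if value != 0.0:
                self.postings[dim].append((doc_id, value))
                self.num_entries += 1

    def search(self, query, top_k=3):
        # Accumulate dot products by walking only the query's posting lists.
        scores = defaultdict(float)
        for dim, q_value in query.items():
            for doc_id, value in self.postings.get(dim, ()):
                scores[doc_id] += q_value * value
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        return ranked[:top_k]


index = SparseInvertedIndex()
index.add("a", {3: 1.0, 17: 0.5})
index.add("b", {17: 2.0, 42: 1.0})
index.add("c", {99: 3.0})

results = index.search({17: 1.0, 42: 0.5})
print(results)            # [('b', 2.5), ('a', 0.5)]
print(index.num_entries)  # 5
```

Document "c" never appears in the results because it shares no dimension with the query, so its posting list is never read.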
When to Use Inverted Index
An inverted index is ideal for:
- Sparse vector search (BM25, TF-IDF)
- Keyword and text search
- Hybrid retrieval (combining dense and sparse)
- Document filtering with numeric ranges
- High-dimensional sparse data (e.g., 10,000+ dimensions with <1% non-zero)
Comparison: Dense vs Sparse Indexes
| Feature | Inverted (Sparse) | HNSW/IVF (Dense) |
|---|---|---|
| Best for | Sparse vectors | Dense vectors |
| Memory | Only non-zero values | All dimensions |
| Typical use | Text, keywords | Semantic search |
| Dimensions | 1000s-100,000s | 100-2000 |
| Sparsity | >95% zeros | <10% zeros |
Wildcard Search Examples
Prefix Search (Always Enabled)
Suffix and Infix Search (Requires enable_extended_wildcard)
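This page does not describe how suffix and infix matching are implemented. The sketch below shows one general technique for a term dictionary, offered only as an assumption about why extended wildcards tend to cost extra space: prefix queries can use binary search over sorted terms, suffix queries can reuse the same routine over a second, reversed copy of the terms, and infix queries fall back to a scan here. All names and the term list are hypothetical.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Binary-search to the first candidate, then walk while the prefix holds."""
    matches = []
    i = bisect_left(sorted_terms, prefix)
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def suffix_matches(sorted_reversed_terms, suffix):
    """A suffix query is a prefix query against the reversed terms."""
    hits = prefix_matches(sorted_reversed_terms, suffix[::-1])
    return sorted(term[::-1] for term in hits)


def infix_matches(terms, infix):
    """Fallback linear scan; real systems often use n-gram style structures."""
    return sorted(term for term in terms if infix in term)


terms = ["contest", "index", "latest", "test", "tested", "testing", "untested"]
sorted_terms = sorted(terms)
# The second copy below is the extra storage paid for suffix support.
sorted_reversed_terms = sorted(term[::-1] for term in terms)

print(prefix_matches(sorted_terms, "test"))           # "test*"
print(suffix_matches(sorted_reversed_terms, "test"))  # "*test"
print(infix_matches(terms, "test"))                   # "*test*"
```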
Optimization Trade-offs
Range optimization:
- Adds ~10-20% to index size
- Improves range query performance by 5-10x
- Recommended if you have numeric filters
Extended wildcard:
- Adds ~30-50% to index size
- Increases index build time by ~2x
- Enables flexible text matching patterns
- Recommended for text search applications
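As a quick worked example of the percentages above, the sketch below applies each overhead range to a hypothetical 1 GB base index. The base size and helper name are assumptions for illustration. Each option is computed on its own, since this page does not say how the two overheads combine.

```python
def size_with_overhead(base_bytes, low_pct, high_pct):
    """Return the (low, high) estimated index size after a percentage overhead."""
    low = base_bytes * (100 + low_pct) // 100
    high = base_bytes * (100 + high_pct) // 100
    return low, high


base_bytes = 1_000_000_000  # hypothetical 1 GB index with both options off

range_opt_size = size_with_overhead(base_bytes, 10, 20)
wildcard_size = size_with_overhead(base_bytes, 30, 50)

print(range_opt_size)  # (1100000000, 1200000000)
print(wildcard_size)   # (1300000000, 1500000000)
```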
Sparse Vector Format
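A short sketch of the two common sparse representations, a dictionary keyed by dimension index and a list of (index, value) pairs, along with conversions between them. The helper names are hypothetical, and which format the library's API expects is not confirmed here.

```python
def dense_to_sparse_dict(dense):
    """Keep only non-zero entries, keyed by their dimension index."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def dict_to_pairs(sparse):
    """Convert to a list of (index, value) pairs, sorted by index."""
    return sorted(sparse.items())


def pairs_to_dict(pairs):
    """Convert (index, value) pairs back to a dictionary."""
    return dict(pairs)


dense = [0.0, 0.0, 1.5, 0.0, 0.25, 0.0]
sparse = dense_to_sparse_dict(dense)
pairs = dict_to_pairs(sparse)

print(sparse)  # {2: 1.5, 4: 0.25}
print(pairs)   # [(2, 1.5), (4, 0.25)]
```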
Sparse vectors are typically represented as dictionaries or lists of (index, value) pairs.
Hybrid Search Pattern
Combine an inverted index (sparse) with a dense vector index for best results.
See Also
- HnswIndexParam - For dense vector search
- IVFIndexParam - For large-scale dense vectors
- VectorQuery - Querying with inverted indexes