create_index()
Create an index on a vector or scalar field to accelerate queries.
Signature
Parameters
Name of the field to index. Must exist in the collection schema.
Index configuration:
Vector indices (for vector fields):
- HnswIndexParam: HNSW graph-based index (recommended for most use cases)
- IVFIndexParam: Inverted file index (good for large datasets)
- FlatIndexParam: Brute-force search (exact results, no indexing overhead)
Scalar indices (for scalar fields):
- InvertIndexParam: Inverted index for fast filtering on scalar fields
Additional index creation options (e.g., build parallelism).
Returns
This method does not return a value. It raises an exception if index creation fails.
Vector Index Example
Scalar Index Example
Vector indices can only be applied to vector fields, and inverted indices only to scalar fields. Attempting to create a vector index on a scalar field (or vice versa) will raise a ValueError.
drop_index()
Remove an index from a field. This does not delete the field itself, only its index.
Signature
Parameters
Name of the indexed field.
Example
optimize()
Optimize the collection by merging segments, rebuilding indices, and reclaiming space.
Signature
Parameters
Optimization options controlling the optimization process.
Returns
This method does not return a value.
Example
When to Optimize
Run optimize() after:
Large insertions
After inserting many documents (e.g., 10K+ documents), optimize to merge segments and improve query performance.
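The "optimize after large insertions" guideline can be automated with a small counter. The sketch below is library-independent: OptimizeScheduler and its threshold parameter are hypothetical names introduced for illustration, and the 10,000-document default only mirrors the example figure above. It assumes you pass in a zero-argument callable that wraps your own optimize() call.

```python
from typing import Callable


class OptimizeScheduler:
    """Hypothetical helper: counts inserted documents and runs a callback
    once the running total reaches a threshold."""

    def __init__(self, optimize: Callable[[], None], threshold: int = 10_000) -> None:
        if threshold <= 0:
            raise ValueError("threshold must be positive")
        self._optimize = optimize    # zero-argument callable wrapping optimize()
        self._threshold = threshold  # documents to accumulate before optimizing
        self._pending = 0            # documents inserted since the last optimize

    def record_insert(self, count: int) -> bool:
        """Record `count` newly inserted documents.

        Returns True if the optimize callback ran as a result of this call.
        """
        self._pending += count
        if self._pending >= self._threshold:
            self._optimize()
            self._pending = 0
            return True
        return False
```

Call record_insert() after each batch insert. The callback fires at most once per call, so several small batches are merged into a single optimization pass.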
add_column()
Add a new column to the collection schema. Optionally populate it using an expression.
Signature
Parameters
Schema definition for the new column (name, type, nullability).
SQL-like expression to compute initial values for existing documents. If empty, the new field is NULL for existing documents when the column is nullable; when it is not nullable, the call raises an error.
Options for the column addition operation.
Returns
This method does not return a value.
Example
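As a library-independent illustration of the backfill rule described in the parameters above, the following sketch models existing documents as plain dictionaries. backfill_column is a hypothetical function, not part of the collection API. The expression is modeled as a Python callable rather than a SQL-like string, and ValueError is an assumed stand-in for whatever error the library actually raises.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def backfill_column(
    rows: List[Row],
    name: str,
    nullable: bool,
    expression: Optional[Callable[[Row], Any]] = None,
) -> None:
    """Hypothetical model of the documented rule for populating a new column.

    - With an expression: every existing row gets the computed value.
    - Without an expression and nullable: every existing row gets None (NULL).
    - Without an expression and not nullable: an error is raised
      (ValueError here is an assumption) before any row is modified.
    """
    if expression is None and not nullable:
        raise ValueError(f"column {name!r} is not nullable and has no expression")
    for row in rows:
        row[name] = expression(row) if expression is not None else None
```

Because the check runs before the loop, a rejected call leaves the rows untouched.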
drop_column()
Remove a column from the collection schema.
Signature
Parameters
Name of the column to drop.
Example
alter_column()
Rename a column or modify its schema. This operation only supports scalar numeric columns.
Signature
Parameters
Current name of the column to alter.
New name for the column. If None or empty, no renaming occurs.
New schema definition. If None, only renaming is performed.
Options controlling the alteration behavior.
Supported Data Types
Example: Rename Column
Example: Modify Schema
Schema modification may trigger data migration or index rebuilds, which can be time-consuming for large collections.
get_stats()
Retrieve runtime statistics about the collection.
Signature
Returns
A CollectionStats object containing:
- doc_count: Number of documents in the collection
- disk_size: Total size on disk (in bytes)
- Other internal metrics
Example
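One use of the two documented fields is to estimate the average on-disk footprint per document. The sketch below relies on a hypothetical StatsSnapshot stand-in that mirrors only doc_count and disk_size. It is not the library's CollectionStats class, and avg_bytes_per_doc is an illustrative helper rather than an API method.

```python
from dataclasses import dataclass


@dataclass
class StatsSnapshot:
    """Hypothetical stand-in exposing the two documented statistics fields."""
    doc_count: int  # number of documents in the collection
    disk_size: int  # total size on disk, in bytes


def avg_bytes_per_doc(stats: StatsSnapshot) -> float:
    """Return the average on-disk bytes per document, or 0.0 for an empty collection."""
    if stats.doc_count == 0:
        return 0.0
    return stats.disk_size / stats.doc_count
```

The function reads only the doc_count and disk_size attributes, so any object exposing those two fields could be passed in the same way.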
flush()
Force all pending writes to disk to ensure durability.
Signature
Example
destroy()
Permanently delete the collection from disk.
Signature
Example
Best Practices
Index Strategy
Start with HNSW
Use HnswIndexParam for most vector fields. It provides excellent performance for datasets up to tens of millions of vectors.
Use IVF for very large datasets
Switch to IVFIndexParam if you have 100M+ vectors and memory is constrained.
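The two rules of thumb above can be written as a small decision helper. This is a sketch only: suggest_index_param is a hypothetical function, it returns the name of the parameter class as a string rather than constructing one, and the 100,000,000 cutoff is taken directly from the "100M+" guidance rather than from any benchmark.

```python
def suggest_index_param(num_vectors: int, memory_constrained: bool) -> str:
    """Hypothetical helper: return the name of the suggested index parameter class.

    IVF is suggested only when both conditions from the guidance hold:
    at least 100M vectors and constrained memory. Otherwise HNSW is the default.
    """
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors >= 100_000_000 and memory_constrained:
        return "IVFIndexParam"
    return "HnswIndexParam"
```

Treat the result as a starting point. Recall targets, latency budgets, and available memory may justify a different choice for a given workload.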