Overview

The Collection class is the primary interface for interacting with your data in Zvec. It provides methods for:
  • Data Manipulation (DML): Insert, update, upsert, and delete documents
  • Data Querying (DQL): Vector similarity search, filtering, and document retrieval
  • Schema Management (DDL): Create indices, add/drop columns, optimize collections
Obtain a Collection instance through zvec.create_and_open() or zvec.open(). Never instantiate the class directly.

Creating a Collection

import zvec

# Create and open a new collection
collection = zvec.create_and_open(
    path="./my_collection",
    schema=my_schema,  # a CollectionSchema you have defined beforehand
)

# Or open an existing collection
collection = zvec.open("./my_collection")
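A common pattern is to open a collection if it already exists and create it otherwise. The helper below is a minimal sketch of that pattern. It assumes that a missing path on disk means the collection has not been created yet, which is an assumption rather than documented Zvec behavior. The name open_or_create is hypothetical; the only Zvec calls used are zvec.open() and zvec.create_and_open() as shown above.

```python
import os


def open_or_create(path, schema):
    """Open the collection at `path`, or create it if the path is absent.

    Hypothetical helper: assumes a missing path means "no collection yet".
    """
    # Imported lazily so the helper can be defined without Zvec installed.
    import zvec

    if os.path.exists(path):
        return zvec.open(path)
    return zvec.create_and_open(path=path, schema=schema)
```

Calling open_or_create("./my_collection", my_schema) would then return a Collection in either case, under the assumption stated above.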

Available Methods

Data Manipulation

  • insert(): Add new documents to the collection
  • update(): Modify existing documents by ID
  • upsert(): Insert or update documents
  • delete(): Remove documents by ID or filter
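The write operations above differ mainly in how they treat IDs that already exist. The toy model below uses a plain dict in place of a collection to illustrate the usual semantics of insert, update, upsert, and delete. It is a sketch of the general pattern, not Zvec's implementation: whether Zvec rejects duplicate IDs, merges or replaces fields on upsert, or reports outcomes through returned statuses is not covered here, and all function names are illustrative.

```python
def insert(store, doc_id, fields):
    """Add a new document; refuse to overwrite an existing ID."""
    if doc_id in store:
        return False
    store[doc_id] = dict(fields)
    return True


def update(store, doc_id, fields):
    """Modify an existing document; refuse if the ID is unknown."""
    if doc_id not in store:
        return False
    store[doc_id].update(fields)
    return True


def upsert(store, doc_id, fields):
    """Write the document whether or not the ID already exists."""
    store[doc_id] = dict(fields)  # assumption: upsert replaces the whole document
    return True


def delete(store, doc_id):
    """Remove a document by ID; report whether anything was removed."""
    return store.pop(doc_id, None) is not None


store = {}
insert(store, "1", {"title": "Hello"})        # succeeds: ID "1" is new
insert(store, "1", {"title": "Duplicate"})    # refused: ID "1" already exists
update(store, "2", {"title": "World"})        # refused: ID "2" does not exist
upsert(store, "2", {"title": "World"})        # succeeds: inserts ID "2"
upsert(store, "1", {"title": "Hello again"})  # succeeds: overwrites ID "1"
delete(store, "2")                            # succeeds: removes ID "2"
print(store)  # {'1': {'title': 'Hello again'}}
```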

Data Querying

  • query(): Perform vector similarity search
  • fetch(): Retrieve documents by ID
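Conceptually, a filtered vector similarity search ranks the documents that pass the filter by their similarity to the query vector and returns the top k. The sketch below does this by brute force with cosine similarity in pure Python. It only illustrates the idea: Zvec's actual distance metric, index structures, filter language, and result format are not described here, and brute_force_query and its parameters are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_query(docs, vector, topk, keep=lambda fields: True):
    """Return up to `topk` (score, id) pairs, best match first.

    `keep` is a predicate on a document's fields, standing in for a filter.
    """
    candidates = [d for d in docs if keep(d["fields"])]
    scored = [(cosine(d["vector"], vector), d["id"]) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:topk]


# Sample data, made up for illustration.
docs = [
    {"id": "1", "vector": [0.1, 0.2, 0.3], "fields": {"title": "Hello"}},
    {"id": "2", "vector": [0.4, 0.5, 0.6], "fields": {"title": "World"}},
    {"id": "3", "vector": [0.9, 0.1, 0.0], "fields": {"title": "spam"}},
]
hits = brute_force_query(
    docs,
    vector=[0.15, 0.25, 0.35],
    topk=2,
    keep=lambda fields: fields["title"] != "spam",
)
print([doc_id for _, doc_id in hits])  # ['1', '2']
```

A brute-force scan costs time proportional to the number of documents per query, which is the cost that a vector index is meant to avoid.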

Schema Management

  • create_index(): Build indices on vector or scalar fields
  • add_column(): Add a new field to the schema
  • alter_column(): Rename or modify existing columns
  • optimize(): Merge segments and rebuild indices

Properties

  • path (str, required): The filesystem path where the collection is stored.
  • schema (CollectionSchema, required): The schema defining the structure of the collection.
  • option (CollectionOption, required): The options used to open the collection.
  • stats (CollectionStats, required): Runtime statistics about the collection (e.g., document count, size).

Lifecycle Management

Flushing Data

Force all pending writes to disk to ensure durability:
collection.flush()
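To avoid forgetting the flush after a batch of writes, the call can be wrapped in a context manager. The following is a minimal sketch; flushed is a hypothetical helper name, and the only Collection method it relies on is flush() as shown above.

```python
from contextlib import contextmanager


@contextmanager
def flushed(collection):
    """Yield the collection and call flush() when the block exits."""
    try:
        yield collection
    finally:
        # Runs on normal exit and on exceptions, so writes made before
        # an error are still flushed.
        collection.flush()
```

With this helper, writes can be grouped as `with flushed(zvec.open("./my_collection")) as collection:` followed by the insert or upsert calls. Flushing in a finally clause is a design choice; if partial batches should not be persisted after an error, flush only on the success path instead.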

Destroying a Collection

Warning: destroy() is irreversible. It permanently deletes all data in the collection.
collection.destroy()
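Because destruction cannot be undone, it can help to require an explicit confirmation before calling it. The guard below is a sketch: destroy_collection is a hypothetical helper, and it assumes that the path property returns the same string that the caller passes in (Zvec may normalize paths differently). It uses only the path property and the destroy() method described above.

```python
def destroy_collection(collection, confirm_path):
    """Destroy `collection` only if `confirm_path` matches its path property."""
    if confirm_path != collection.path:
        raise ValueError(
            f"refusing to destroy {collection.path!r}: "
            f"confirmation was {confirm_path!r}"
        )
    collection.destroy()
```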

Example Usage

import zvec
from zvec import Doc, VectorQuery

# Open collection
collection = zvec.open("./documents")

# Insert documents
docs = [
    Doc(id="1", vectors={"embedding": [0.1, 0.2, 0.3]}, fields={"title": "Hello"}),
    Doc(id="2", vectors={"embedding": [0.4, 0.5, 0.6]}, fields={"title": "World"})
]
statuses = collection.insert(docs)

# Query similar documents
results = collection.query(
    vectors=VectorQuery("embedding", vector=[0.15, 0.25, 0.35]),
    topk=5,
    filter="title != 'spam'"
)

# Flush pending writes to disk
collection.flush()
