Overview

CollectionSchema defines the complete structure of a collection in Zvec. It specifies the collection name and its fields, including both scalar fields (e.g., integers, strings) and vector fields for similarity search.
Field names must be unique across both scalar and vector fields within a collection.
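The uniqueness rule applies to one shared namespace: a scalar field and a vector field cannot share a name either. As a rough illustration, the check can be sketched with plain name lists. `find_duplicate_names` is a hypothetical helper written for this page, not part of Zvec, and it says nothing about how the library implements its own validation.

```python
from collections import Counter


def find_duplicate_names(field_names, vector_names):
    """Return names that occur more than once across both groups.

    Hypothetical helper, not a Zvec API. Scalar and vector names are
    counted together because they share a single namespace.
    """
    counts = Counter(list(field_names) + list(vector_names))
    # Counter keeps first-seen order, so duplicates come back in that order.
    return [name for name, n in counts.items() if n > 1]


print(find_duplicate_names(["id", "title"], ["embedding"]))      # []
print(find_duplicate_names(["id", "embedding"], ["embedding"]))  # ['embedding']
```

Running a check like this before building the schema gives a list of every clashing name at once, rather than stopping at the first error.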

Constructor

CollectionSchema(
    name: str,
    fields: Optional[Union[FieldSchema, list[FieldSchema]]] = None,
    vectors: Optional[Union[VectorSchema, list[VectorSchema]]] = None
)

Parameters

name
str
required
Name of the collection. Must be a non-empty string.
fields
FieldSchema | list[FieldSchema]
One or more scalar field definitions. Can be a single FieldSchema instance or a list of them. Defaults to None.
vectors
VectorSchema | list[VectorSchema]
One or more vector field definitions. Can be a single VectorSchema instance or a list of them. Defaults to None.

Raises

  • TypeError: If fields or vectors are of unsupported types
  • ValueError: If any field or vector name is duplicated
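The "single instance or list" convention for fields and vectors, together with the TypeError case, can be pictured as a small normalization step. The sketch below is an assumption about the general pattern, not Zvec's actual code: `as_list` and `StandInField` are hypothetical names, and the stand-in class takes the place of FieldSchema so the snippet runs without the library installed.

```python
from dataclasses import dataclass


@dataclass
class StandInField:
    """Hypothetical stand-in for FieldSchema; only carries a name."""
    name: str


def as_list(value, expected_type):
    """Normalize None, one instance, or a list of instances into a list.

    Hypothetical helper, not a Zvec API. Anything else raises TypeError.
    """
    if value is None:
        return []
    if isinstance(value, expected_type):
        return [value]
    if isinstance(value, list) and all(isinstance(v, expected_type) for v in value):
        return list(value)  # copy so the caller's list is not shared
    raise TypeError(
        f"expected {expected_type.__name__} or list[{expected_type.__name__}], "
        f"got {type(value).__name__}"
    )


print(as_list(None, StandInField))                # []
print(as_list(StandInField("id"), StandInField))  # [StandInField(name='id')]
```

Under this pattern, passing a bare string such as "id" in place of a schema object would fall through to the TypeError branch.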

Properties

name
str
The name of the collection (read-only).
fields
list[FieldSchema]
All scalar (non-vector) fields in the schema (read-only).
vectors
list[VectorSchema]
All vector fields in the schema (read-only).

Methods

field()

Retrieve a scalar field by name.
def field(name: str) -> Optional[FieldSchema]
name
str
required
Name of the field to retrieve.
Returns: The FieldSchema if found, otherwise None.

vector()

Retrieve a vector field by name.
def vector(name: str) -> Optional[VectorSchema]
name
str
required
Name of the vector field to retrieve.
Returns: The VectorSchema if found, otherwise None.
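Both lookups return None for an unknown name instead of raising, so code that treats a missing field as an error has to check the result itself. One possible wrapper is sketched below. `require` is a hypothetical helper, not part of Zvec, and a plain dict's `get` method stands in for `schema.field` or `schema.vector` so the example runs on its own.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def require(lookup: Callable[[str], Optional[T]], name: str) -> T:
    """Call lookup(name) and raise KeyError instead of returning None.

    Hypothetical helper, not a Zvec API.
    """
    found = lookup(name)
    if found is None:
        raise KeyError(f"no field named {name!r}")
    return found


# Stand-in for a schema: dict.get also returns None when the key is missing.
stand_in_fields = {"id": "INT64", "title": "STRING"}

print(require(stand_in_fields.get, "id"))  # INT64
```

With a real schema, the same helper would presumably be called as `require(schema.field, "id")` or `require(schema.vector, "text_embedding")`, assuming those methods behave as documented above.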

Examples

Basic collection with single fields

from zvec import CollectionSchema, FieldSchema, VectorSchema, DataType

# Create a simple collection schema
schema = CollectionSchema(
    name="my_collection",
    fields=FieldSchema("id", DataType.INT64),
    vectors=VectorSchema("embedding", DataType.VECTOR_FP32, dimension=128)
)

print(schema.name)  # "my_collection"
print(len(schema.fields))  # 1
print(len(schema.vectors))  # 1

Collection with multiple fields

from zvec import CollectionSchema, FieldSchema, VectorSchema, DataType
from zvec.model.param import HnswIndexParam

# Define multiple scalar fields
fields = [
    FieldSchema("id", DataType.INT64),
    FieldSchema("title", DataType.STRING),
    FieldSchema("author", DataType.STRING, nullable=True),
    FieldSchema("created_at", DataType.INT64)
]

# Define multiple vector fields
vectors = [
    VectorSchema(
        "text_embedding",
        DataType.VECTOR_FP32,
        dimension=384,
        index_param=HnswIndexParam(m=16, ef_construction=200)
    ),
    VectorSchema(
        "image_embedding",
        DataType.VECTOR_FP16,
        dimension=512
    )
]

schema = CollectionSchema(
    name="documents",
    fields=fields,
    vectors=vectors
)

print(schema)

Accessing schema fields

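# Uses the `schema` object built in the previous example.

# Retrieve a specific field; field() returns None if the name is unknown
id_field = schema.field("id")
if id_field is not None:
    print(f"Field: {id_field.name}, Type: {id_field.data_type}")

# Retrieve a specific vector; vector() returns None if the name is unknown
text_vector = schema.vector("text_embedding")
if text_vector is not None:
    print(f"Vector: {text_vector.name}, Dimension: {text_vector.dimension}")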

# Iterate over all fields
for field in schema.fields:
    print(f"Scalar field: {field.name}")

# Iterate over all vectors
for vector in schema.vectors:
    print(f"Vector field: {vector.name}")

Using with create_and_open

import zvec

schema = zvec.CollectionSchema(
    name="products",
    fields=[
        zvec.FieldSchema("product_id", zvec.DataType.STRING),
        zvec.FieldSchema("price", zvec.DataType.FLOAT),
        zvec.FieldSchema("category", zvec.DataType.STRING)
    ],
    vectors=zvec.VectorSchema(
        "description_embedding",
        zvec.DataType.VECTOR_FP32,
        dimension=768
    )
)

# Create collection with this schema
collection = zvec.create_and_open(
    path="./product_db",
    schema=schema
)
