insert()
Insert new documents into the collection. Documents must have unique IDs and conform to the collection schema.
Signature
def insert(self, docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]
Parameters
docs
Union[Doc, list[Doc]]
required
One or more documents to insert. Each document must:
- Have a unique ID not already in the collection
- Include all required fields defined in the schema
- Have vectors matching the schema dimensions
Returns
Union[Status, list[Status]]
- If a single Doc is provided: returns a single Status object
- If a list is provided: returns a list[Status] with one status per document
Each Status indicates success or failure for that document.
Basic Example
import zvec
from zvec import Doc
# Insert a single document
doc = Doc(
    id="doc_001",
    vectors={"embedding": [0.1, 0.2, 0.3, 0.4]},
    fields={"title": "Getting Started", "category": "tutorial"}
)
status = collection.insert(doc)
if status.ok():
    print("Document inserted successfully")
else:
    print(f"Insert failed: {status.message()}")
Batch Insertion
For better performance, insert multiple documents at once:
docs = [
    Doc(
        id=f"doc_{i}",
        vectors={"embedding": [0.1 * i, 0.2 * i, 0.3 * i, 0.4 * i]},
        fields={"title": f"Document {i}", "score": i * 10}
    )
    for i in range(1000)
]
statuses = collection.insert(docs)
# Check results
success_count = sum(1 for s in statuses if s.ok())
print(f"Successfully inserted {success_count}/{len(docs)} documents")
Schema Validation
Documents must adhere to the collection's schema:
- Vector fields: Must match the defined dimension and data type
- Scalar fields: Must match the defined data type (INT64, STRING, etc.)
- Nullable fields: Can be omitted or set to None
- Non-nullable fields: Must be provided
# Schema definition
schema = zvec.CollectionSchema(
    name="articles",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, dimension=128),
    fields=[
        zvec.FieldSchema("title", zvec.DataType.STRING, nullable=False),
        zvec.FieldSchema("views", zvec.DataType.INT64, nullable=True)
    ]
)
# Valid document
doc = Doc(
    id="1",
    vectors={"embedding": [0.1] * 128},
    fields={"title": "Required field", "views": 100}  # "views" is nullable and could be omitted
)
upsert()
Insert new documents or update existing ones by ID. If a document with the given ID exists, it will be updated; otherwise, it will be inserted.
Signature
def upsert(self, docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]
Parameters
docs
Union[Doc, list[Doc]]
required
One or more documents to upsert. Documents with existing IDs will be updated; new IDs will be inserted.
Returns
Union[Status, list[Status]]
- Single Status if one document was provided
- list[Status] if multiple documents were provided
Example
# First insert
doc = Doc(id="user_123", vectors={"profile_emb": [0.1, 0.2]}, fields={"name": "Alice"})
collection.upsert(doc)
# Update the same document (overwrites all fields)
updated_doc = Doc(
    id="user_123",
    vectors={"profile_emb": [0.3, 0.4]},
    fields={"name": "Alice Smith"}
)
collection.upsert(updated_doc)
upsert() replaces the entire document. To update specific fields only, use update() instead.
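The difference between the two semantics can be illustrated with plain dictionaries (no zvec involved; update() here stands for any field-level merge):

existing = {"name": "Alice", "views": 100}
incoming = {"name": "Alice Smith"}

# upsert-style: the stored document is replaced wholesale,
# so any field absent from the new document is dropped
replaced = incoming

# update-style: only the supplied fields change, the rest survive
merged = {**existing, **incoming}

print(replaced)  # {'name': 'Alice Smith'}              -- 'views' is gone
print(merged)    # {'name': 'Alice Smith', 'views': 100}

In other words, an upsert must carry every field you want to keep, including vectors.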
Error Handling
Common Insert Errors
from zvec import StatusCode
status = collection.insert(doc)
if not status.ok():
    if status.code() == StatusCode.ALREADY_EXISTS:
        print("Document ID already exists")
    elif status.code() == StatusCode.INVALID_ARGUMENT:
        print("Invalid document format or schema mismatch")
    else:
        print(f"Error: {status.message()}")
Handling Batch Failures
docs = [...] # List of documents
statuses = collection.insert(docs)
# Find failed inserts
failed_docs = [
    (doc, status) for doc, status in zip(docs, statuses) if not status.ok()
]
if failed_docs:
    print(f"{len(failed_docs)} documents failed to insert:")
    for doc, status in failed_docs:
        print(f"  ID {doc.id}: {status.message()}")
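Transient failures can often simply be retried. Below is a generic retry sketch; it is not part of zvec and only assumes that the insert callable returns one status per document and that each status has an ok() method, as in the examples above:

def insert_with_retry(insert_fn, docs, max_attempts=3):
    """Insert docs, retrying only those whose status is not ok.

    insert_fn: callable taking a list of docs and returning a list of statuses
    Returns the docs that still failed after max_attempts rounds.
    """
    remaining = list(docs)
    for _ in range(max_attempts):
        if not remaining:
            break
        statuses = insert_fn(remaining)
        # Keep only the documents whose insert did not succeed
        remaining = [d for d, s in zip(remaining, statuses) if not s.ok()]
    return remaining

Usage would look like still_failed = insert_with_retry(collection.insert, docs). Note that retrying is only sensible for transient errors; a schema mismatch will fail identically every time, so inspecting status codes first (as shown above) is worthwhile.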
Performance Tips
Batch your inserts: Inserting 1000 documents in a single call is much faster than making 1000 individual calls.
# Good: Batch insertion
docs = [create_doc(i) for i in range(10000)]
collection.insert(docs)
# Bad: Individual insertions
for i in range(10000):
    collection.insert(create_doc(i))
Flush periodically: Call collection.flush() after large batches to ensure durability.
BATCH_SIZE = 1000
for batch_start in range(0, len(all_docs), BATCH_SIZE):
    batch = all_docs[batch_start:batch_start + BATCH_SIZE]
    collection.insert(batch)
    if (batch_start + BATCH_SIZE) % 10000 == 0:
        collection.flush()  # Flush every 10k documents
collection.flush()  # Final flush for any remaining documents
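The slicing pattern in the loop above can be factored into a small reusable helper (plain Python, no zvec dependency):

def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

With it, the batching loop reads as: for batch in chunked(all_docs, 1000): collection.insert(batch). The final, possibly smaller batch is yielded as well, so no documents are skipped.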
See Also