Overview

The Doc class represents a document in Zvec together with its metadata, vectors, and scalar fields. It is used both for inserting new documents and for representing search results.

Class Definition

class Doc:
    """Represents a document with optional metadata, fields, and vectors."""

Attributes

id (str, required)
Unique identifier for the document. Must be unique within a collection.
score (Optional[float], default: None)
Relevance score from search queries. Automatically populated by query operations. Higher scores indicate better matches.
vectors (Optional[dict[str, VectorType]], default: None)
Named vector embeddings associated with the document. Keys must match vector field names defined in the collection schema. Values can be:
  • Python lists: [0.1, 0.2, 0.3]
  • NumPy arrays: np.array([0.1, 0.2, 0.3]) (automatically converted to lists)
  • Dicts mapping index to weight, for sparse vector fields: {0: 0.5, 10: 0.3} (as in the Multi-Vector Documents example)
fields (Optional[dict[str, Any]], default: None)
Scalar metadata fields (e.g., title, category, timestamp). Keys must match field names in the collection schema. Supported value types depend on the field schema (strings, numbers, booleans, arrays).
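
The array-to-list conversion described for vectors can be pictured with a small helper. This is a minimal sketch, not Zvec's actual code: normalize_vectors is a hypothetical name, and it relies only on the tolist() method that NumPy arrays expose. The standard library's array.array, which has the same method, stands in for NumPy so the example runs without extra dependencies.

```python
from array import array
from typing import Any


def normalize_vectors(vectors: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `vectors` with array-like values as plain lists.

    Hypothetical helper for illustration; not part of the zvec API.
    """
    normalized: dict[str, Any] = {}
    for name, value in vectors.items():
        if hasattr(value, "tolist"):
            # NumPy arrays and array.array both expose tolist()
            normalized[name] = value.tolist()
        elif isinstance(value, dict):
            # Sparse vectors (index -> weight) are copied unchanged
            normalized[name] = dict(value)
        else:
            # Lists, tuples, and other sequences become plain lists
            normalized[name] = list(value)
    return normalized


vectors = normalize_vectors({
    "dense": array("d", [0.5, 0.25, 0.125]),
    "sparse": {0: 0.5, 10: 0.3},
})
print(vectors["dense"])  # [0.5, 0.25, 0.125]
```

Normalizing once at construction time means every later read sees the same plain Python types, whatever the caller passed in.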

Methods

vector()

Get a specific vector by name.
def vector(self, name: str) -> Optional[list[float]]
name (str, required)
Name of the vector field to retrieve.
Returns: The vector as a list of floats, or None if not found.

field()

Get a specific scalar field by name.
def field(self, name: str) -> Optional[Any]
name (str, required)
Name of the field to retrieve.
Returns: The field value, or None if not found.
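
The "None if not found" behaviour of both accessors can be sketched with a stand-in class. DocSketch below is hypothetical and exists only to illustrate the documented semantics; it is not the real zvec.Doc and says nothing about how Zvec implements these methods.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DocSketch:
    """Stand-in for illustration only; not the real zvec.Doc."""

    id: str
    vectors: Optional[dict[str, list[float]]] = None
    fields: Optional[dict[str, Any]] = None

    def vector(self, name: str) -> Optional[list[float]]:
        # `or {}` covers documents created without any vectors
        return (self.vectors or {}).get(name)

    def field(self, name: str) -> Optional[Any]:
        # `or {}` covers documents created without any fields
        return (self.fields or {}).get(name)


doc = DocSketch(id="doc_1", fields={"title": "Intro", "views": 0})
print(doc.field("title"))       # Intro
print(doc.field("missing"))     # None
print(doc.vector("embedding"))  # None

# A stored value such as 0 is falsy but still present, so test
# against None instead of relying on truthiness.
views = doc.field("views")
has_views = views is not None
```

With accessors of this shape, a missing name and a document created without any vectors or fields look the same to the caller: both yield None.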

Usage Examples

Creating Documents for Insertion

import zvec
import numpy as np

# Simple document with vector only
doc1 = zvec.Doc(
    id="doc_1",
    vectors={"embedding": [0.1, 0.2, 0.3, 0.4]}
)

# Document with metadata fields
doc2 = zvec.Doc(
    id="doc_2",
    vectors={"embedding": [0.2, 0.3, 0.4, 0.1]},
    fields={
        "title": "Getting Started with Zvec",
        "category": "tutorial",
        "views": 1250
    }
)

# Using NumPy arrays (automatically converted)
doc3 = zvec.Doc(
    id="doc_3",
    vectors={"embedding": np.array([0.3, 0.1, 0.2, 0.4])},
    fields={"title": "Advanced Topics"}
)

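# Insert into a collection (`collection` is assumed to have been created or
# opened elsewhere, with a schema that defines these vector and scalar fields)
collection.insert([doc1, doc2, doc3])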

Working with Search Results

# Query returns Doc objects with scores
results = collection.query(
    zvec.VectorQuery("embedding", vector=[0.4, 0.3, 0.3, 0.1]),
    topk=3
)

# Access result properties
for doc in results:
    print(f"ID: {doc.id}")
    print(f"Score: {doc.score}")
    print(f"Title: {doc.field('title')}")
    print(f"Vector: {doc.vector('embedding')}")
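
Since score is typed Optional[float], code that post-processes results may want to guard against None before comparing. The sketch below uses SimpleNamespace objects as stand-ins for Doc results, and MIN_SCORE is a hypothetical threshold; it assumes, as stated above, that higher scores indicate better matches.

```python
from types import SimpleNamespace

# Stand-ins for query results; real results would be zvec Doc objects.
results = [
    SimpleNamespace(id="a", score=0.91),
    SimpleNamespace(id="b", score=None),  # e.g. a Doc that never went through a query
    SimpleNamespace(id="c", score=0.42),
]

MIN_SCORE = 0.5  # hypothetical cut-off, chosen only for this example

# Keep documents that have a score and clear the threshold
strong = [
    doc.id
    for doc in results
    if doc.score is not None and doc.score >= MIN_SCORE
]
print(strong)  # ['a']
```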

Multi-Vector Documents

# Document with multiple vectors
doc = zvec.Doc(
    id="doc_multi",
    vectors={
        "dense": [0.1, 0.2, 0.3, 0.4],
        "sparse": {0: 0.5, 10: 0.3, 50: 0.2}  # Sparse format
    },
    fields={"content": "Multi-vector document"}
)

collection.insert(doc)

Important Notes

Immutability: Doc objects are immutable. All attributes are set during initialization and cannot be modified afterward.
Automatic Conversion: NumPy arrays in the vectors dict are automatically converted to Python lists for JSON serialization and immutability.
ID Uniqueness: Document IDs must be unique within a collection. Inserting a document with an existing ID fails unless you use upsert().
Type Hints: Use type hints when creating documents to catch errors early:
from zvec import Doc

doc: Doc = Doc(id="example", vectors={"emb": [0.1, 0.2]})
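
The immutability note above implies that changing a document means building a new one. The sketch below models this with a frozen dataclass. That choice is an assumption made for illustration: FrozenDoc is a hypothetical stand-in, and the real Doc may enforce immutability differently and raise a different exception.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class FrozenDoc:
    """Stand-in for illustration only; not the real zvec.Doc."""

    id: str
    fields: Optional[dict[str, Any]] = None


original = FrozenDoc(id="doc_2", fields={"views": 1250})

try:
    original.id = "other"  # attribute assignment is rejected
except FrozenInstanceError:
    rejected = True
else:
    rejected = False

# To change a document, build a new object with the same id and the
# modified fields; the original stays untouched.
updated = replace(original, fields={**(original.fields or {}), "views": 1251})
print(rejected, original.fields["views"], updated.fields["views"])  # True 1250 1251
```

In a real workflow the rebuilt document would then be written back to the collection, for example with upsert(), since a plain insert with the same ID would fail.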
