Namespaces

Namespaces partition data within a single collection or index, enabling logical separation without managing multiple collections. Use namespaces to separate development from production data, version document sets, or organize by content type.

Overview

Namespaces provide isolated logical partitions within a vector database. Each namespace contains its own subset of vectors and can be queried independently or in combination with other namespaces.

Namespaces are ideal when you need data isolation but want to avoid the overhead of managing separate collections or indexes.

Use cases

Environment separation

Separate dev, staging, and production data in the same index

Dataset versioning

Maintain multiple versions of embeddings or content

Content organization

Partition by document type, source, or category

A/B testing

Compare retrieval quality across different embedding models

Database strategies

Each vector database implements namespaces differently:

Database	Isolation Strategy	Implementation
Pinecone	Native namespace support	Index-level namespaces with automatic routing
Milvus	Partition-based isolation	Partition key field with filter expressions
Weaviate	Tenant mechanism	Native multi-tenancy with per-tenant shards
Qdrant	Collection-based	Separate collections with shared configuration
Chroma	Collection-based	Individual collections per namespace
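
To make the table concrete, the sketch below shows how one logical namespace name might translate into backend-specific targeting under each strategy. Only the database-to-strategy mapping comes from the table; `Target`, `resolve_target`, the `namespace` field name, and the `{base}_{namespace}` collection naming are hypothetical illustrations, not the library's API.

```python
from dataclasses import dataclass
from typing import Optional

# Strategy per database, taken from the table above.
STRATEGIES = {
    "pinecone": "native_namespace",
    "milvus": "partition_key",
    "weaviate": "tenant",
    "qdrant": "collection",
    "chroma": "collection",
}


@dataclass(frozen=True)
class Target:
    """Where a namespace's data lives in a given backend (illustrative only)."""
    collection: str
    namespace: Optional[str] = None    # native namespace, Pinecone-style
    tenant: Optional[str] = None       # tenant name, Weaviate-style
    filter_expr: Optional[str] = None  # filter expression, Milvus-style


def resolve_target(database: str, base_name: str, namespace: str) -> Target:
    """Map a logical namespace to the backend-specific way of addressing it."""
    strategy = STRATEGIES[database.lower()]
    if strategy == "native_namespace":
        return Target(collection=base_name, namespace=namespace)
    if strategy == "partition_key":
        # Assumes a partition key field called "namespace" (hypothetical name).
        return Target(collection=base_name, filter_expr=f'namespace == "{namespace}"')
    if strategy == "tenant":
        return Target(collection=base_name, tenant=namespace)
    # Collection-based: one collection per namespace (naming scheme is assumed).
    return Target(collection=f"{base_name}_{namespace}")
```

The point of the sketch is that calling code can keep using one namespace name while the storage-level isolation differs per database.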

Configuration

Namespace configuration varies by database:

pinecone:
  api_key: "your-api-key"
  index_name: "main-index"
  namespace: "production"  # Native namespace support

embedder:
  model: "sentence-transformers/all-MiniLM-L6-v2"
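
As a rough sketch, the same settings can be written as a plain Python dict and checked before a pipeline is built. Whether the pipeline accepts a dict, a YAML path, or both is not stated here, and the choice of which keys count as required is an assumption made for illustration; `missing_keys` is a hypothetical helper.

```python
# The settings from the YAML above, expressed as a dict.
config = {
    "pinecone": {
        "api_key": "your-api-key",
        "index_name": "main-index",
        "namespace": "production",
    },
    "embedder": {
        "model": "sentence-transformers/all-MiniLM-L6-v2",
    },
}

# Assumed minimum; adjust to what your pipeline actually needs.
REQUIRED_KEYS = {
    "pinecone": ("api_key", "index_name"),
    "embedder": ("model",),
}


def missing_keys(cfg: dict) -> list[str]:
    """Return dotted paths of required settings that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = cfg.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing
```

Running a check like this at startup surfaces a missing key before any network call is made.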

Usage example

from vectordb.langchain.namespaces.pinecone import (
    PineconeNamespacePipeline,
)

# `config` holds the settings shown under Configuration
pipeline = PineconeNamespacePipeline(config)

# Create namespace
result = pipeline.create_namespace("production")
print(f"Created namespace: {result.namespace}")

# Check existence
if pipeline.namespace_exists("production"):
    print("Namespace exists")

# List all namespaces
namespaces = pipeline.list_namespaces()
print(f"Available namespaces: {namespaces}")
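
The existence check and the create call above combine naturally into a "create if missing" helper, which is handy in setup scripts that may run more than once. The sketch below is hypothetical: `ensure_namespace` is not part of the library, it relies only on the two method names used in the example, and `InMemoryPipeline` is a stand-in so the code runs without a real index. How `create_namespace` behaves for a namespace that already exists is not documented here, which is why the helper checks first.

```python
from typing import Protocol


class NamespaceManager(Protocol):
    """The two calls this helper relies on, as used in the example above."""

    def namespace_exists(self, namespace: str) -> bool: ...
    def create_namespace(self, namespace: str): ...


def ensure_namespace(pipeline: NamespaceManager, namespace: str) -> bool:
    """Create the namespace only if it is missing. Returns True if it was created."""
    if pipeline.namespace_exists(namespace):
        return False
    pipeline.create_namespace(namespace)
    return True


class InMemoryPipeline:
    """Stand-in that tracks namespace names in a set (for demonstration only)."""

    def __init__(self) -> None:
        self._names: set[str] = set()

    def namespace_exists(self, namespace: str) -> bool:
        return namespace in self._names

    def create_namespace(self, namespace: str) -> str:
        self._names.add(namespace)
        return namespace


fake = InMemoryPipeline()
print(ensure_namespace(fake, "production"))  # True  (created)
print(ensure_namespace(fake, "production"))  # False (already present)
```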

Management operations

Get namespace statistics

stats = pipeline.get_namespace_stats("production")

print(f"Document count: {stats.document_count}")
print(f"Vector count: {stats.vector_count}")
print(f"Size (bytes): {stats.size_bytes}")
print(f"Created: {stats.created_at}")
print(f"Updated: {stats.updated_at}")

Delete namespace

# Delete namespace and all its data
result = pipeline.delete_namespace("staging")

if result.success:
    print(f"Deleted namespace: {result.namespace}")
else:
    print(f"Error: {result.error}")

Cross-namespace search

Query several namespaces in one call and compare per-namespace timing and result quality:

from vectordb.langchain.namespaces import PineconeNamespacePipeline

pipeline = PineconeNamespacePipeline(config)

# Query all namespaces
results = pipeline.query_cross_namespace(
    query="machine learning fundamentals",
    namespaces=None,  # None = all namespaces
    top_k=10
)

# Results include timing metrics per namespace
for comparison in results.timing_comparisons:
    print(f"Namespace: {comparison.namespace}")
    print(f"  Query time: {comparison.query_time_ms}ms")
    print(f"  Results: {comparison.result_count}")
    print(f"  Avg score: {comparison.avg_score:.3f}")
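
One way to picture a cross-namespace query is as a fan-out followed by a merge: each namespace is searched separately, and the per-namespace hits are combined into a single ranking. The sketch below illustrates only the merge step and is not the library's implementation. `Hit` and `merge_top_k` are hypothetical names, and the code assumes that a higher score means a closer match and that scores are comparable across namespaces (same embedding model and distance metric).

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    namespace: str
    doc_id: str
    score: float  # assumed: higher means more similar


def merge_top_k(per_namespace: dict[str, list[Hit]], top_k: int) -> list[Hit]:
    """Merge per-namespace result lists into one global top_k, best score first."""
    all_hits = (hit for hits in per_namespace.values() for hit in hits)
    return heapq.nlargest(top_k, all_hits, key=lambda hit: hit.score)


results = {
    "production": [Hit("production", "a", 0.91), Hit("production", "b", 0.72)],
    "staging": [Hit("staging", "c", 0.85)],
}
top = merge_top_k(results, top_k=2)
print([(h.namespace, h.doc_id) for h in top])
# [('production', 'a'), ('staging', 'c')]
```

If namespaces hold embeddings from different models, as in the A/B testing use case, raw scores are generally not comparable and a merge like this would be misleading; compare the namespaces side by side instead.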

Namespace isolation pipeline

Use dedicated search and indexing pipelines for namespace-scoped operations:

from vectordb.langchain.namespaces.search.pinecone import (
    PineconeNamespaceSearchPipeline,
)

# Create namespace-scoped search pipeline
search_pipeline = PineconeNamespaceSearchPipeline(
    config=config,
    namespace="production"
)

# All searches automatically scoped to namespace
results = search_pipeline.search(
    query="vector similarity",
    top_k=10
)

print(f"Namespace: {results['namespace']}")
for doc in results["documents"]:
    print(f"- {doc.page_content[:100]}")

Performance considerations

Pinecone namespaces

Pinecone provides native namespace support with minimal overhead. Namespaces share the same index resources and scale automatically. This makes Pinecone a good fit when you need very large numbers of namespaces (100,000 or more).

Milvus partitions

Milvus uses partition keys for isolation and supports millions of tenants. Queries that filter on the partition key scan only the relevant data. Enable the partition key field when you create the collection.

Weaviate tenants

Weaviate’s tenant mechanism provides strong isolation by giving each tenant its own shard, and tenants are managed automatically. It is a good fit when data separation requirements are strict.

Collection-based (Qdrant, Chroma)

Collection-based strategies create a separate collection for each namespace. This adds more overhead per namespace but gives the strongest isolation. It works best with fewer than 1,000 namespaces.

Best practices

Choose the right strategy

Use Pinecone or Milvus for high namespace counts (10,000+), and collection-based approaches when strong isolation matters more than scale.

Consistent naming

Use a clear, consistent naming convention such as {environment}_{version} or {team}_{dataset} (for example, prod_v2).
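
A small helper can enforce the convention at the point where names are created. This is a minimal sketch: `build_namespace` is a hypothetical function, and the allowed character set (lowercase letters, digits, and hyphens within each part) is an assumption, so check it against your database's own naming rules.

```python
import re

# Each part: lowercase letters and digits, optionally hyphen-separated.
# A single underscore joins the two parts. The character set is an assumption.
_PART = r"[a-z0-9]+(?:-[a-z0-9]+)*"
NAMESPACE_PATTERN = re.compile(rf"{_PART}_{_PART}")


def build_namespace(first: str, second: str) -> str:
    """Join two parts (e.g. environment and version) into one namespace name."""
    name = f"{first.strip().lower()}_{second.strip().lower()}"
    if not NAMESPACE_PATTERN.fullmatch(name):
        raise ValueError(f"invalid namespace name: {name!r}")
    return name


print(build_namespace("prod", "v2"))        # prod_v2
print(build_namespace("Search", "faq-en"))  # search_faq-en
```

Keeping the underscore reserved as the separator means a name can always be split back into its two parts.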

Clean up regularly

Delete unused namespaces to reduce storage costs and maintain index performance.
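
Cleanup is easier to automate when "unused" has a concrete definition. The sketch below picks namespaces that have not been updated within a given window and never proposes protected ones. It assumes the `updated_at` value from `get_namespace_stats` is a timezone-aware `datetime`, which is not confirmed here; `stale_namespaces` and the default protected set are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_namespaces(
    last_updated: dict[str, datetime],
    max_age: timedelta,
    now: datetime,
    protected: frozenset[str] = frozenset({"production"}),
) -> list[str]:
    """Return namespaces not updated within max_age, skipping protected ones."""
    cutoff = now - max_age
    return sorted(
        name
        for name, updated_at in last_updated.items()
        if name not in protected and updated_at < cutoff
    )


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
last_updated = {
    "production": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "staging": datetime(2024, 5, 25, tzinfo=timezone.utc),
    "dev_v1": datetime(2024, 2, 10, tzinfo=timezone.utc),
}
print(stale_namespaces(last_updated, timedelta(days=30), now))  # ['dev_v1']
```

In practice the `last_updated` mapping would be built from `list_namespaces()` and `get_namespace_stats(name).updated_at`, and the returned names reviewed before being passed to `delete_namespace`.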

Monitor per-namespace metrics

Track document counts and query latency per namespace to identify performance issues.
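
As a starting point, per-namespace timings can be compared against the median to spot outliers. The sketch below uses a stand-in `Timing` type that mirrors the fields printed in the cross-namespace search example (`namespace`, `query_time_ms`, `result_count`); `flag_outliers` and the factor of 2.0 are illustrative assumptions, not recommended thresholds.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Timing:
    """Stand-in mirroring the fields shown in the cross-namespace example."""
    namespace: str
    query_time_ms: float
    result_count: int


def flag_outliers(timings: list[Timing], factor: float = 2.0) -> list[str]:
    """Flag namespaces slower than factor x the median, or returning no results."""
    if not timings:
        return []
    typical = median(t.query_time_ms for t in timings)
    return [
        t.namespace
        for t in timings
        if t.query_time_ms > factor * typical or t.result_count == 0
    ]


timings = [
    Timing("production", 42.0, 10),
    Timing("staging", 38.0, 10),
    Timing("dev_v1", 130.0, 10),
    Timing("scratch", 35.0, 0),
]
print(flag_outliers(timings))  # ['dev_v1', 'scratch']
```

A relative threshold like this adapts to the index's normal latency, whereas a fixed millisecond limit has to be retuned as data grows.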

Related features

Multi-tenancy

Tenant-isolated indexing and retrieval

Metadata filtering

Structured constraints on retrieval

Hybrid search

Dense and sparse retrieval

Cost-optimized RAG

Efficient production pipelines
