Namespaces partition data within a single collection or index, enabling logical separation without managing multiple collections. Use namespaces to separate development from production data, version document sets, or organize by content type.

Overview

Namespaces provide isolated logical partitions within a vector database. Each namespace contains its own subset of vectors and can be queried independently or in combination with other namespaces.
Namespaces are ideal when you need data isolation but want to avoid the overhead of managing separate collections or indexes.
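
To make the idea concrete, here is a minimal in-memory sketch of namespace partitioning. It is not how any of the databases below store data; `NamespacedStore` and `cosine` are hypothetical names used only to show that the same vector ID can live in two namespaces and that a query can target one namespace or all of them.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedStore:
    """Toy vector store that keeps one bucket of vectors per namespace."""

    def __init__(self):
        # namespace -> {vector id -> vector}
        self._buckets = defaultdict(dict)

    def upsert(self, namespace, vec_id, vector):
        self._buckets[namespace][vec_id] = vector

    def query(self, vector, namespaces=None, top_k=3):
        # None means "search every namespace"; otherwise only the listed ones.
        targets = list(self._buckets) if namespaces is None else namespaces
        hits = []
        for ns in targets:
            for vec_id, stored in self._buckets.get(ns, {}).items():
                hits.append((cosine(vector, stored), ns, vec_id))
        hits.sort(reverse=True)  # best score first
        return hits[:top_k]


store = NamespacedStore()
store.upsert("production", "doc-1", [1.0, 0.0])
store.upsert("staging", "doc-1", [0.0, 1.0])  # same ID, different namespace

prod_hits = store.query([1.0, 0.0], namespaces=["production"])
all_hits = store.query([1.0, 0.0])
```

Here `prod_hits` only ever contains entries from `production`, while `all_hits` ranks entries from both namespaces together.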

Use cases

Environment separation

Separate dev, staging, and production data in the same index

Dataset versioning

Maintain multiple versions of embeddings or content

Content organization

Partition by document type, source, or category

A/B testing

Compare retrieval quality across different embedding models

Database strategies

Each vector database implements namespaces differently:
| Database | Isolation strategy | Implementation |
|----------|--------------------|----------------|
| Pinecone | Native namespace support | Index-level namespaces with automatic routing |
| Milvus | Partition-based isolation | Partition key field with filter expressions |
| Weaviate | Tenant mechanism | Native multi-tenancy with per-tenant shards |
| Qdrant | Collection-based | Separate collections with shared configuration |
| Chroma | Collection-based | Individual collections per namespace |
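
The strategies in the table can be read as different answers to "where does a logical namespace live?". The sketch below illustrates that mapping in plain Python. Everything in it is an assumption made for illustration: the `Scope` type, the `resolve_scope` function, the `namespace` partition field name, and the `{base}_{namespace}` collection naming are hypothetical and are not taken from the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """Where a logical namespace lives in a given backend (illustrative only)."""

    container: str                     # index or collection to target
    namespace: Optional[str] = None    # native namespace, if the backend has one
    tenant: Optional[str] = None       # tenant name for tenant-based isolation
    filter_expr: Optional[str] = None  # filter for partition-key isolation


def resolve_scope(database, base_name, namespace, partition_field="namespace"):
    """Map a logical namespace onto the isolation strategy of each database."""
    if database == "pinecone":
        # Native namespaces: same index, namespace passed alongside the request.
        return Scope(container=base_name, namespace=namespace)
    if database == "milvus":
        # Partition key: same collection, rows selected by a filter expression.
        return Scope(
            container=base_name,
            filter_expr=f'{partition_field} == "{namespace}"',
        )
    if database == "weaviate":
        # Multi-tenancy: same collection, one tenant per namespace.
        return Scope(container=base_name, tenant=namespace)
    if database in ("qdrant", "chroma"):
        # Collection-based: one collection per namespace (naming is assumed).
        return Scope(container=f"{base_name}_{namespace}")
    raise ValueError(f"Unsupported database: {database}")


milvus_scope = resolve_scope("milvus", "docs", "production")
chroma_scope = resolve_scope("chroma", "docs", "production")
```

For Milvus the namespace becomes a filter on a shared collection, whereas for Chroma it becomes part of the collection name, which is why the collection-based strategies carry more per-namespace overhead.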

Configuration

Namespace configuration varies by database. The following example configures Pinecone:
pinecone:
  api_key: "your-api-key"
  index_name: "main-index"
  namespace: "production"  # Native namespace support

embedder:
  model: "sentence-transformers/all-MiniLM-L6-v2"
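
When the same settings are reused across environments, only the `namespace` value needs to change. The sketch below mirrors the configuration above as a Python dictionary and derives a per-namespace copy. `config_for_namespace` is a hypothetical helper, and whether the pipelines accept a plain dictionary is an assumption rather than something stated here.

```python
import copy

# Mirrors the YAML configuration shown above.
BASE_CONFIG = {
    "pinecone": {
        "api_key": "your-api-key",
        "index_name": "main-index",
        "namespace": "production",
    },
    "embedder": {"model": "sentence-transformers/all-MiniLM-L6-v2"},
}


def config_for_namespace(base, namespace):
    """Return a copy of the config that targets a different namespace."""
    if not namespace or not namespace.strip():
        raise ValueError("namespace must be a non-empty string")
    config = copy.deepcopy(base)  # leave the base config untouched
    config["pinecone"]["namespace"] = namespace
    return config


staging_config = config_for_namespace(BASE_CONFIG, "staging")
```

Copying before modifying keeps the production settings intact, so a staging pipeline cannot write to the production namespace through a shared dictionary.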

Usage example

from vectordb.langchain.namespaces.pinecone import (
    PineconeNamespacePipeline,
)

# `config` holds the settings shown under Configuration
pipeline = PineconeNamespacePipeline(config)

# Create namespace
result = pipeline.create_namespace("production")
print(f"Created namespace: {result.namespace}")

# Check existence
if pipeline.namespace_exists("production"):
    print("Namespace exists")

# List all namespaces
namespaces = pipeline.list_namespaces()
print(f"Available namespaces: {namespaces}")
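
The existence check and the create call combine naturally into an idempotent setup step. The following is a sketch under stated assumptions: `InMemoryPipeline` is a stand-in written for this example so the code runs without a Pinecone account, and its return values are simplified compared with the result objects used above. `ensure_namespaces` is a hypothetical helper, not part of the library.

```python
class InMemoryPipeline:
    """Stand-in exposing the three methods used above; not the real pipeline."""

    def __init__(self):
        self._namespaces = set()

    def create_namespace(self, name):
        self._namespaces.add(name)
        return name  # simplified: the real call returns a result object

    def namespace_exists(self, name):
        return name in self._namespaces

    def list_namespaces(self):
        return sorted(self._namespaces)


def ensure_namespaces(pipeline, required):
    """Create every namespace in `required` that is missing; return those created."""
    created = []
    for name in required:
        if not pipeline.namespace_exists(name):
            pipeline.create_namespace(name)
            created.append(name)
    return created


pipeline = InMemoryPipeline()
first_run = ensure_namespaces(pipeline, ["dev", "staging", "production"])
second_run = ensure_namespaces(pipeline, ["dev", "staging", "production"])
```

The second call creates nothing, so the helper can run at every deployment without side effects.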

Management operations

Get namespace statistics

stats = pipeline.get_namespace_stats("production")

print(f"Document count: {stats.document_count}")
print(f"Vector count: {stats.vector_count}")
print(f"Size (bytes): {stats.size_bytes}")
print(f"Created: {stats.created_at}")
print(f"Updated: {stats.updated_at}")

Delete namespace

# Delete namespace and all its data
result = pipeline.delete_namespace("staging")

if result.success:
    print(f"Deleted namespace: {result.namespace}")
else:
    print(f"Error: {result.error}")
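
Because deleting a namespace removes all of its data, it can be worth wrapping the call in a guard. The sketch below is one possible approach, not a feature of the library: `guarded_delete`, `PROTECTED`, and the local `DeleteResult` class are hypothetical, and `DeleteResult` only mirrors the `success`, `namespace`, and `error` fields used above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeleteResult:
    """Local mirror of the result fields used above (illustrative only)."""

    success: bool
    namespace: str
    error: Optional[str] = None


# Namespaces that must never be deleted without explicit confirmation (assumed).
PROTECTED = frozenset({"production"})


def guarded_delete(delete_fn, namespace, protected=PROTECTED, confirm=False):
    """Call `delete_fn(namespace)` unless the namespace is protected."""
    if namespace in protected and not confirm:
        return DeleteResult(
            success=False,
            namespace=namespace,
            error=f"'{namespace}' is protected; pass confirm=True to delete it",
        )
    return delete_fn(namespace)


deleted = []


def fake_delete(namespace):
    """Stand-in for pipeline.delete_namespace so the sketch runs offline."""
    deleted.append(namespace)
    return DeleteResult(success=True, namespace=namespace)


blocked = guarded_delete(fake_delete, "production")
allowed = guarded_delete(fake_delete, "staging")
```

In real use, `delete_fn` would be `pipeline.delete_namespace`. The guard returns the same result shape, so the calling code does not need a separate error path.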

Cross-namespace queries

Query multiple namespaces at once and compare per-namespace performance:
from vectordb.langchain.namespaces import PineconeNamespacePipeline

pipeline = PineconeNamespacePipeline(config)

# Query all namespaces
results = pipeline.query_cross_namespace(
    query="machine learning fundamentals",
    namespaces=None,  # None = all namespaces
    top_k=10
)

# Results include timing metrics per namespace
for comparison in results.timing_comparisons:
    print(f"Namespace: {comparison.namespace}")
    print(f"  Query time: {comparison.query_time_ms}ms")
    print(f"  Results: {comparison.result_count}")
    print(f"  Avg score: {comparison.avg_score:.3f}")
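
A cross-namespace query typically ends with combining per-namespace hits into one ranked list. How `query_cross_namespace` does this internally is not described here, so the following is only a sketch of one common approach: a global top-k merge by score. `merge_top_k` and the `(score, doc_id)` tuples are hypothetical, and the merge only makes sense when all namespaces use the same embedding model and similarity metric.

```python
import heapq


def merge_top_k(per_namespace_hits, top_k):
    """Merge {namespace: [(score, doc_id), ...]} into one list, best score first."""
    tagged = (
        (score, namespace, doc_id)
        for namespace, hits in per_namespace_hits.items()
        for score, doc_id in hits
    )
    # nlargest keeps only the top_k best entries without sorting everything.
    return heapq.nlargest(top_k, tagged)


hits = {
    "production": [(0.92, "p1"), (0.71, "p2")],
    "staging": [(0.88, "s1"), (0.40, "s2")],
}
merged = merge_top_k(hits, top_k=3)
```

With the sample data, the merged list interleaves both namespaces: `p1`, then `s1`, then `p2`. When namespaces hold embeddings from different models, as in the A/B testing use case, scores are not directly comparable and results are better inspected per namespace.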

Namespace isolation pipeline

Use dedicated search and indexing pipelines for namespace-scoped operations:
from vectordb.langchain.namespaces.search.pinecone import (
    PineconeNamespaceSearchPipeline,
)

# Create namespace-scoped search pipeline
search_pipeline = PineconeNamespaceSearchPipeline(
    config=config,
    namespace="production"
)

# All searches automatically scoped to namespace
results = search_pipeline.search(
    query="vector similarity",
    top_k=10
)

print(f"Namespace: {results['namespace']}")
for doc in results["documents"]:
    print(f"- {doc.page_content[:100]}")

Performance considerations

Pinecone provides native namespace support with minimal overhead. Namespaces share the same index resources and scale automatically, which makes Pinecone the best fit for 100,000 or more namespaces.
Milvus uses partition keys for isolation and supports millions of tenants. Partition-based queries are optimized to scan only the relevant data. Enable the partition key field when you create the collection.
Weaviate’s tenant mechanism provides strong isolation by giving each tenant its own shard, and tenants are managed automatically. It is ideal when data must be strictly separated.
Collection-based strategies (Qdrant and Chroma) create a separate collection per namespace. They carry more overhead but give the strongest isolation, and work best with fewer than 1,000 namespaces.

Best practices

1. Choose the right strategy
   Use Pinecone or Milvus for high namespace counts (10,000+), and collection-based approaches when strong isolation matters most.

2. Consistent naming
   Use a clear namespace naming convention, such as {environment}_{version} or {team}_{dataset}.

3. Clean up regularly
   Delete unused namespaces to reduce storage costs and maintain index performance.

4. Monitor per-namespace metrics
   Track document counts and query latency per namespace to identify performance issues.
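
The naming and clean-up practices above can be sketched together. This example assumes the {environment}_{version} convention with versions written as `v1`, `v2`, and so on, and a fixed set of environment names; both are illustrative choices. `build_name`, `parse_name`, and `stale_namespaces` are hypothetical helpers.

```python
import re

# {environment}_{version}, e.g. "production_v2". The allowed environments
# and the "v<number>" version format are assumptions for this sketch.
NAME_PATTERN = re.compile(
    r"^(?P<environment>dev|staging|production)_(?P<version>v\d+)$"
)


def build_name(environment, version):
    """Build a namespace name and reject anything outside the convention."""
    name = f"{environment}_v{version}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"Invalid namespace name: {name!r}")
    return name


def parse_name(name):
    """Return {'environment': ..., 'version': ...} or None if the name does not match."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


def stale_namespaces(names, keep_latest=1):
    """List all but the newest `keep_latest` versions per environment."""
    by_env = {}
    for name in names:
        parsed = parse_name(name)
        if parsed is None:
            continue  # ignore names that do not follow the convention
        version_number = int(parsed["version"][1:])
        by_env.setdefault(parsed["environment"], []).append((version_number, name))
    stale = []
    for versions in by_env.values():
        versions.sort(reverse=True)  # newest version first
        stale.extend(name for _, name in versions[keep_latest:])
    return sorted(stale)


existing = ["production_v1", "production_v2", "production_v3", "staging_v1", "scratch"]
candidates = stale_namespaces(existing, keep_latest=1)
```

The returned names are only candidates for deletion. Names that do not follow the convention, such as `scratch`, are skipped rather than deleted, so they need a manual review.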

Multi-tenancy

Tenant-isolated indexing and retrieval

Metadata filtering

Structured constraints on retrieval

Hybrid search

Dense and sparse retrieval

Cost-optimized RAG

Efficient production pipelines
