VectorDB

What is VectorDB?

VectorDB is a unified, production-oriented toolkit for semantic search and retrieval-augmented generation (RAG) across five vector databases, with feature parity between Haystack and LangChain. It ships ready-to-run pipelines for dense, sparse, and hybrid retrieval, plus advanced RAG capabilities such as reranking, query enhancement, contextual compression, parent-child retrieval, and agentic retrieval loops. The design is configuration-driven, works with environment variables, and supports consistent benchmarking across databases and datasets.
Use VectorDB to build, compare, and deploy retrieval systems without re-implementing logic per backend.

Supported vector databases

Pinecone

Managed vector database with namespaces and native sparse-dense hybrid retrieval

Weaviate

Open-source vector search with BM25 hybrid retrieval, collections, and multi-tenancy

Qdrant

High-performance search with payload filtering and scalar or binary quantization

Milvus

Scalable vector database with partition-key isolation and hybrid retrieval

Chroma

Lightweight vector store for local development and rapid prototyping

Key features

Semantic search

Dense vector retrieval with metadata filters and optional answer generation

Hybrid search

Dense + sparse retrieval fused with RRF or weighted scoring

Reranking

Two-stage retrieval using cross-encoders for higher precision

Query enhancement

Multi-query, HyDE, and step-back prompting to improve recall

Contextual compression

Reduce retrieved context via reranking or LLM extraction

Parent document retrieval

Index chunks but return parent documents or context windows

Multi-tenancy

Tenant isolation using database-specific strategies, such as Pinecone namespaces or Milvus partition keys

Agentic RAG

Iterative retrieval loop with search, reflect, and refine steps
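
As an illustration of the RRF fusion mentioned under Hybrid search, here is a minimal, library-independent sketch of reciprocal rank fusion. It is not VectorDB's own implementation: the function name `reciprocal_rank_fusion`, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. `k` is a hypothetical default, not a value
    taken from VectorDB.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a sparse retriever.
dense_results = ["d1", "d2", "d3"]
sparse_results = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([dense_results, sparse_results])
print(fused)
```

Because RRF uses only rank positions, it does not require dense and sparse scores to be on the same scale, which is the usual reason to prefer it over weighted score fusion when the two retrievers score very differently.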

Built-in datasets and evaluation

VectorDB includes dataset loaders and standardized evaluation utilities so you can benchmark retrieval quality across databases and frameworks. Supported datasets:
  • TriviaQA - Open-domain question-answer pairs for general knowledge retrieval
  • ARC - Science reasoning questions requiring multi-hop inference
  • PopQA - Factoid questions about popular entities
  • FactScore - Atomic facts for verification and hallucination detection
  • Earnings Calls - Financial transcript Q&A for domain-specific RAG
Built-in evaluation metrics:
  • Recall@k
  • Precision@k
  • MRR (Mean Reciprocal Rank)
  • NDCG@k
  • Hit rate
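
For readers unfamiliar with these metrics, the sketch below shows how each is commonly computed for binary relevance judgments. It is an illustration only, not VectorDB's evaluation API: all function names are hypothetical, and it assumes each retrieved list contains distinct document IDs.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents found in the top k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k results that are relevant."""
    return len(set(retrieved[:k]) & set(relevant)) / k


def hit_at_k(retrieved, relevant, k):
    """1.0 if any relevant document is in the top k, else 0.0."""
    return 1.0 if set(retrieved[:k]) & set(relevant) else 0.0


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance NDCG: DCG of the top k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical query: four retrieved IDs, three relevant IDs.
retrieved = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
print(recall_at_k(retrieved, relevant, 3), ndcg_at_k(retrieved, relevant, 3))
```

Hit rate over a dataset is the mean of `hit_at_k` across queries, in the same way that MRR is the mean of `reciprocal_rank`.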

Get started

Installation

Install VectorDB using the uv package manager

Quickstart

Build your first RAG pipeline in minutes
