This module provides LangChain-based retrieval and RAG pipeline implementations across all five supported vector database backends. Every feature is organized as a self-contained directory with configuration files, indexing scripts, and search scripts for each backend.

What you get

  • Seventeen retrieval and RAG patterns implemented using LangChain’s retriever, chain, and document store abstractions
  • Full portability across Pinecone, Weaviate, Chroma, Milvus, and Qdrant with feature-specific notes on backend support
  • YAML-driven configuration with environment variable substitution so credentials stay out of code
  • Evaluation support via shared utils/evaluation.py metrics
  • Shared reusable components (components/) and helper factories (utils/) that all feature pipelines draw from

Vector database support

All pipelines support five backends:
  • Pinecone: Managed vector database with hybrid search capabilities
  • Weaviate: Open-source vector search with schema-based filtering
  • Chroma: Embedded database for prototyping and local use
  • Milvus: Scalable vector search with partition support
  • Qdrant: High-performance search with payload indexing

Module structure

Each feature directory follows the same layout:
feature_name/
├── configs/
│   ├── chroma_triviaqa.yaml
│   ├── milvus_triviaqa.yaml
│   ├── pinecone_triviaqa.yaml
│   ├── qdrant_triviaqa.yaml
│   ├── weaviate_triviaqa.yaml
│   └── (one config per backend × dataset combination)
├── indexing/
│   ├── chroma.py
│   ├── milvus.py
│   ├── pinecone.py
│   ├── qdrant.py
│   └── weaviate.py
├── search/
│   ├── chroma.py
│   ├── milvus.py
│   ├── pinecone.py
│   ├── qdrant.py
│   └── weaviate.py
└── README.md
Indexing scripts load a dataset, embed documents using EmbedderHelper, and upsert them into the target backend. Search scripts embed a query, retrieve candidates, apply post-retrieval processing, and optionally generate an answer using RAGHelper.
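The flow the scripts share can be sketched with stdlib-only stand-ins: a toy character-frequency embedder and an in-memory list take the place of EmbedderHelper and a real backend client, but the index-then-search contract is the same one the real scripts follow.

```python
import math

# Toy deterministic embedder standing in for EmbedderHelper:
# a normalized character-frequency vector.
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

index = []  # in-memory stand-in for the vector database backend

def index_documents(docs):            # role of indexing/<backend>.py
    for doc in docs:
        index.append((embed(doc), doc))

def search(query, top_k=1):           # role of search/<backend>.py
    qv = embed(query)
    scored = [(sum(a * b for a, b in zip(qv, dv)), doc) for dv, doc in index]
    return [doc for _, doc in sorted(scored, reverse=True)[:top_k]]

index_documents(["vector databases store embeddings",
                 "llamas live in the andes"])
print(search("embedding storage"))  # the embeddings document ranks first
```

The real scripts swap in HuggingFace embeddings and a Pinecone/Weaviate/Chroma/Milvus/Qdrant client, but keep this same shape.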

Feature catalog

Semantic search

Dense vector similarity search with HuggingFace embeddings

Hybrid search

Combine dense and sparse embeddings with Reciprocal Rank Fusion
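The fusion step operates on rank positions alone, so it can be sketched without any backend. The constant k=60 is the conventional default from the original RRF formulation; the module's actual fusion may weight the two lists differently.

```python
# Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per document;
# k dampens the dominance of the very top ranks.
def rrf_fuse(dense_ranking, sparse_ranking, k=60):
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)  # best first

dense = ["a", "b", "c"]   # dense-embedding ranking
sparse = ["b", "c", "d"]  # sparse/lexical ranking
print(rrf_fuse(dense, sparse))  # ['b', 'c', 'a', 'd']: docs in both lists rise
```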

Reranking

Two-stage retrieval with HuggingFace cross-encoder models

MMR diversity

Maximal Marginal Relevance for balancing relevance and diversity
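MMR greedily trades relevance against redundancy. A minimal sketch with precomputed similarity scores (the numbers are toy values, not the module's actual scoring):

```python
# Greedy MMR: pick the candidate maximizing
#   lam * relevance - (1 - lam) * (max similarity to anything already picked)
def mmr_select(query_sims, pairwise_sims, k=2, lam=0.5):
    selected, candidates = [], list(range(len(query_sims)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((pairwise_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query_sims = [0.9, 0.85, 0.3]        # similarity of each doc to the query
pairwise_sims = [[1.0, 0.95, 0.1],   # docs 0 and 1 are near-duplicates
                 [0.95, 1.0, 0.1],
                 [0.1, 0.1, 1.0]]
print(mmr_select(query_sims, pairwise_sims, k=2, lam=0.5))  # [0, 2]
```

With lam=0.5 the near-duplicate doc 1 is skipped in favor of the less relevant but diverse doc 2; lam=1.0 reduces to pure relevance ranking.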

Query enhancement

Multi-query, HyDE, and step-back prompting for better recall

Contextual compression

Compress retrieved documents to query-relevant fragments
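The contract can be sketched with a crude lexical-overlap compressor; the actual feature would use an embedding or LLM-based filter, but the shape is the same: document in, query-relevant fragments out.

```python
import re

# Extractive compression sketch: keep only sentences sharing a word
# with the query. Real compressors score relevance semantically.
def compress(document, query):
    query_terms = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", document)
    kept = [s for s in sentences
            if query_terms & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)

doc = ("Milvus supports partitions. The weather was sunny. "
       "Partitions scale retrieval.")
print(compress(doc, "How do Milvus partitions work?"))
# "Milvus supports partitions. Partitions scale retrieval."
```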

Agentic RAG

Multi-step iterative RAG with reflection and routing

Metadata filtering

Structured filter constraints applied at query time
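The semantics of exact-match filters can be sketched as a dict applied to document metadata; each backend compiles such constraints into its native syntax (Qdrant payload filters, Milvus boolean expressions, and so on). The dict shape here is illustrative, not the module's actual schema.

```python
# Every key in the filter must match the document's metadata exactly.
def matches(metadata, flt):
    return all(metadata.get(k) == v for k, v in flt.items())

docs = [
    {"id": 1, "meta": {"lang": "en", "year": 2023}},
    {"id": 2, "meta": {"lang": "de", "year": 2023}},
    {"id": 3, "meta": {"lang": "en", "year": 2021}},
]
hits = [d["id"] for d in docs if matches(d["meta"], {"lang": "en", "year": 2023})]
print(hits)  # [1]
```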

Multi-tenancy

Tenant-scoped indexing and retrieval with isolation

Namespaces

Logical data partitioning within shared indexes

Parent document retrieval

Index child chunks, return parent documents
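The child-to-parent indirection takes only a few lines to sketch; the substring "retrieval" and the parent_id field are toy stand-ins for vector search and the module's actual linkage.

```python
# Small child chunks are indexed for precise matching; a hit is
# resolved back to its larger parent document before being returned.
parents = {
    "p1": "Full chapter on vector indexes ...",
    "p2": "Full chapter on reranking ...",
}
children = [  # each chunk carries its parent's id
    {"text": "HNSW is a graph index", "parent_id": "p1"},
    {"text": "IVF partitions vectors", "parent_id": "p1"},
    {"text": "cross-encoders rescore pairs", "parent_id": "p2"},
]

def retrieve_parent(query_term):
    for chunk in children:                # toy substring "retrieval"
        if query_term in chunk["text"]:
            return parents[chunk["parent_id"]]
    return None

print(retrieve_parent("HNSW"))  # returns the p1 parent text
```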

JSON indexing

Structured fields from JSON preserved as metadata
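One common way to keep structured fields filterable is to flatten nested JSON into dotted metadata keys; a sketch of that idea (the module's actual key scheme may differ):

```python
import json

# Flatten nested objects into dotted keys so structured fields
# survive indexing as flat, filterable metadata.
def flatten(obj, prefix=""):
    out = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out

record = json.loads('{"title": "RAG", "source": {"site": "docs", "year": 2024}}')
print(flatten(record))
# {'title': 'RAG', 'source.site': 'docs', 'source.year': 2024}
```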

Embedding configuration

All LangChain feature pipelines read embedding configuration from YAML:
embeddings:
  model: "sentence-transformers/all-MiniLM-L6-v2"  # Required: full model path
  device: "cpu"                                    # Optional: "cpu" or "cuda"
  batch_size: 32                                   # Optional
For hybrid and sparse features, also include:
sparse:
  model: "naver/splade-cocondenser-ensembledistil"   # Required for sparse embedder

RAG configuration

Generation is controlled by the rag section:
rag:
  enabled: true
  model: "llama-3.3-70b-versatile"
  api_key: "${GROQ_API_KEY}"
  temperature: 0.7
  max_tokens: 2048
The LangChain RAGHelper uses ChatGroq for generation. Set enabled: false to run retrieval-only pipelines.
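The ${GROQ_API_KEY} placeholder suggests shell-style substitution from the environment; a sketch of that mechanism using string.Template (the module's own loader may resolve placeholders differently):

```python
import os
from string import Template

# Resolve ${VAR} placeholders in config values from the environment,
# so the YAML file never holds the raw credential.
def substitute_env(value):
    return Template(value).substitute(os.environ)

os.environ["GROQ_API_KEY"] = "sk-demo"  # stand-in for a real key
rag = {"enabled": True,
       "model": "llama-3.3-70b-versatile",
       "api_key": "${GROQ_API_KEY}"}
rag["api_key"] = substitute_env(rag["api_key"])
print(rag["api_key"])  # sk-demo
```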
Recommended workflow

  1. Baseline semantic search: run semantic_search on your target backend with a small dataset limit and verify the pipeline completes successfully.
  2. Measure baseline metrics: extract evaluation queries from the dataset and measure baseline retrieval metrics.
  3. Add improvements: add one improvement feature at a time; start with reranking (usually the highest single-step gain) or hybrid_indexing (for mixed query types).
  4. Add data isolation: once quality is stable, layer in multi_tenancy or namespaces.
  5. Optimize costs: use cost_optimized_rag to find acceptable quality-cost tradeoffs, and agentic_rag for complex multi-step reasoning tasks.

Feature selection guide

If you need…                          Use
Starting point and baseline           semantic_search
Both semantic and keyword precision   hybrid_indexing
Pure keyword/lexical precision        sparse_indexing
Better final ranking                  reranking
Relevant + diverse result set         mmr
Less redundant context                diversity_filtering
Structured constraints                metadata_filtering
JSON-native documents                 json_indexing
Better query recall                   query_enhancement
Shorter, cleaner context              contextual_compression
Token/cost budget control             cost_optimized_rag
Iterative multi-step reasoning        agentic_rag
Long docs with fragment search        parent_document_retrieval
Per-customer data isolation           multi_tenancy
Logical data segmentation             namespaces

LangChain vs Haystack

Both frameworks provide similar capabilities but with different design philosophies:
  • LangChain is component-oriented, built on chains and runnables; it uses Document objects and retriever interfaces. Haystack is pipeline-oriented, built on nodes and pipelines; it uses Document objects and node interfaces.
  • LangChain integrates natively with ChatGroq, OpenAI, and Anthropic via langchain-* packages. Haystack integrates via generator nodes and prompt builders.
  • LangChain uses HuggingFaceEmbeddings from langchain-huggingface. Haystack uses SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder.
Choose LangChain if you prefer chain composition and already use LangChain in your stack. Choose Haystack for pipeline-based workflows and deeper integration with Hugging Face models.

Next steps

Semantic search

Start with baseline dense vector retrieval

Hybrid search

Combine dense and sparse for robust retrieval

Components

Explore reusable LangChain components

Chains

Build agentic RAG with routing and reflection
