This module provides Haystack-based retrieval and RAG pipeline implementations across all five supported vector database backends. Every feature is organized as a self-contained directory with configuration files, indexing scripts, and search scripts for each backend.

What you get

  • Seventeen retrieval and RAG patterns implemented using Haystack’s pipeline and component abstractions
  • Full portability across Pinecone, Weaviate, Chroma, Milvus, and Qdrant (with feature-specific notes on backend support)
  • YAML-driven configuration with environment variable substitution so credentials stay out of code
  • Evaluation support via shared utils/evaluation.py metrics
  • Shared reusable components and helper factories that all feature pipelines draw from

Module structure

Each feature directory follows the same layout:
```
feature_name/
├── configs/
│   ├── chroma_triviaqa.yaml
│   ├── milvus_triviaqa.yaml
│   ├── pinecone_triviaqa.yaml
│   ├── qdrant_triviaqa.yaml
│   ├── weaviate_triviaqa.yaml
│   └── (one config per backend × dataset combination)
├── indexing/
│   ├── chroma.py
│   ├── milvus.py
│   ├── pinecone.py
│   ├── qdrant.py
│   └── weaviate.py
├── search/
│   ├── chroma.py
│   ├── milvus.py
│   ├── pinecone.py
│   ├── qdrant.py
│   └── weaviate.py
└── README.md
```

Feature catalog

  • Semantic search: dense vector similarity search, the baseline pattern
  • Hybrid search: combines dense and sparse embeddings with RRF fusion
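To make the fusion step concrete, here is a minimal pure-Python sketch of reciprocal rank fusion (RRF), independent of Haystack's built-in components; the function name and example ids are illustrative:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids with RRF.

    Each document scores sum(1 / (k + rank)) over every list it
    appears in; k=60 is the commonly used constant.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d1", "d2", "d3"]   # ranking from the dense retriever
sparse = ["d3", "d1", "d4"]  # ranking from the sparse retriever
fused = reciprocal_rank_fusion([dense, sparse])
```

Documents appearing high in both lists (here d1 and d3) rise to the top even though neither retriever agrees on an exact ordering.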

  • Components: reusable pipeline components for routing, compression, and query enhancement
  • Pipelines: pipeline architecture patterns and composition

Core retrieval patterns

| Feature | Description | Use when |
|---|---|---|
| semantic_search | Dense vector similarity search | Starting point and baseline |
| hybrid_indexing | Dense + sparse embeddings with RRF | Both semantic and keyword precision needed |
| sparse_indexing | Sparse-only retrieval | Pure keyword/lexical precision |
| reranking | Two-stage retrieval with cross-encoder | Better final ranking needed |
| mmr | Maximal Marginal Relevance | Relevant + diverse results |
| diversity_filtering | Similarity count filtering | Less redundant context |
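The idea behind the mmr feature can be sketched in plain Python. This is a simplified greedy Maximal Marginal Relevance selection over raw vectors, not the module's actual implementation:

```python
def mmr(query_vec, doc_vecs, lambda_=0.5, top_k=3):
    """Greedily select top_k documents, trading relevance to the
    query against similarity to already-selected documents."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb)

    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < top_k:
        def score(i):
            relevance = cos(query_vec, doc_vecs[i])
            # Penalize similarity to anything we already picked
            redundancy = max((cos(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Lowering lambda_ weights diversity more heavily, so a near-duplicate of an already-selected document loses to a less similar but novel one.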

Advanced retrieval

| Feature | Description | Use when |
|---|---|---|
| metadata_filtering | Structured constraints | Need to filter by attributes |
| json_indexing | JSON-native documents | Structured fields + semantic content |
| query_enhancement | Multi-query, HyDE, step-back | Better query recall |
| contextual_compression | Abstractive, extractive, relevance filtering | Shorter, cleaner context |
| parent_document_retrieval | Index children, retrieve parents | Long docs with fragment search |
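The "index children, retrieve parents" pattern can be illustrated with an in-memory sketch. The chunker and term-overlap scorer below are stand-ins (a real pipeline would embed the child chunks and use vector similarity), and all names are hypothetical:

```python
def chunk(text, size=40):
    """Split a parent document into fixed-size child chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_index(parents):
    """Index child chunks, each mapped back to its parent id."""
    index = []  # list of (child_text, parent_id)
    for pid, text in parents.items():
        for child in chunk(text):
            index.append((child, pid))
    return index

def retrieve_parents(index, parents, query, top_k=2):
    """Score children, but return the enclosing parent documents."""
    terms = set(query.lower().split())
    scored = sorted(index,
                    key=lambda c: -len(terms & set(c[0].lower().split())))
    seen, results = set(), []
    for child, pid in scored:
        if pid not in seen:
            seen.add(pid)
            results.append(parents[pid])
        if len(results) == top_k:
            break
    return results
```

Small chunks match precise query fragments, while the returned parents give the generator enough surrounding context.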

Production features

| Feature | Description | Use when |
|---|---|---|
| cost_optimized_rag | Token/compute budget control | Cost reduction needed |
| agentic_rag | Multi-step iterative RAG with self-reflection | Complex multi-hop questions |
| multi_tenancy | Tenant-scoped isolation | Per-customer data isolation |
| namespaces | Logical segmentation | Environment separation |

Embedding configuration

All Haystack feature pipelines read embedding configuration from the YAML config:
```yaml
embeddings:
  model: "sentence-transformers/all-MiniLM-L6-v2"  # Required
  device: "cpu"                                    # Optional
  batch_size: 32                                   # Optional
```
For hybrid and sparse features:
```yaml
sparse:
  model: "naver/splade-cocondenser-ensembledistil"
```

RAG configuration

Generation is controlled by the rag section:
```yaml
rag:
  enabled: true
  model: "llama-3.3-70b-versatile"
  api_key: "${GROQ_API_KEY}"
  api_base_url: "https://api.groq.com/openai/v1"
  temperature: 0.7
  max_tokens: 2048
```
Set enabled: false to run retrieval-only pipelines without generation.
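The ${GROQ_API_KEY} placeholder above is resolved from the environment at load time. A minimal sketch of that substitution step, assuming a strict loader that rejects unset variables (the module's actual loader may behave differently):

```python
import os
import re

def substitute_env(raw):
    """Replace ${VAR} placeholders in raw config text with values
    from the environment; raise on unset variables rather than
    silently inserting empty strings."""
    def repl(match):
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]
    return re.sub(r"\$\{(\w+)\}", repl, raw)

os.environ["GROQ_API_KEY"] = "demo-key"  # for illustration only
print(substitute_env('api_key: "${GROQ_API_KEY}"'))  # api_key: "demo-key"
```

Substituting before YAML parsing keeps credentials out of the config files themselves, as noted above.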
Recommended workflow

1. Run semantic search baseline. Start with semantic_search on your target backend with a small dataset limit (100-200 records) and verify that the pipeline loads, indexes, and retrieves successfully.
2. Measure retrieval quality. Use evaluation_queries() and evaluate_retrieval() from utils/evaluation.py to establish baseline metrics.
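The exact signatures in utils/evaluation.py aren't reproduced here, but retrieval metrics of this kind typically reduce to recall@k and MRR. A self-contained sketch of what such an evaluation computes:

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of relevant ids found in the top-k retrieved ids."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant hit (0.0 if none)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```

Recording these numbers for the baseline makes step 3 a measurable comparison rather than a guess.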
3. Add improvements incrementally. Add one improvement feature at a time (e.g., reranking or hybrid_indexing) and measure whether quality improves on your evaluation set.
4. Add production features. Once the retrieval baseline is strong, adopt multi_tenancy or namespaces for data isolation, and cost_optimized_rag for budget controls.
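Tenant-scoped isolation boils down to tagging every document with a tenant id and scoping every query to exactly one tenant. A toy sketch of that boundary (the real features apply the equivalent filter inside the vector store, and all names here are illustrative):

```python
# Every document carries a tenant_id in its metadata.
docs = [
    {"id": "d1", "tenant_id": "acme",   "text": "acme contract"},
    {"id": "d2", "tenant_id": "globex", "text": "globex invoice"},
    {"id": "d3", "tenant_id": "acme",   "text": "acme invoice"},
]

def search(docs, tenant_id, term):
    # Hard isolation boundary: filter by tenant before any matching.
    scoped = [d for d in docs if d["tenant_id"] == tenant_id]
    return [d["id"] for d in scoped if term in d["text"]]
```

Because the tenant filter is applied before retrieval, one tenant's query can never surface another tenant's documents.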
5. Handle complex queries. Use agentic_rag or query_enhancement for hard multi-hop questions where single-pass retrieval falls short.

Supported backends

  • Chroma: Local embedded vector database with SQLite persistence
  • Milvus: High-performance distributed vector database
  • Pinecone: Managed vector database with native sparse vector support
  • Qdrant: Vector database with advanced filtering and multitenancy
  • Weaviate: GraphQL-based vector database with semantic search
Each backend has consistent API patterns through Haystack’s abstraction layer.

Next steps

  • Semantic search: start with the baseline semantic search pattern
  • Hybrid search: learn about dense + sparse hybrid retrieval
  • Components: explore reusable pipeline components
  • Pipelines: understand pipeline architecture patterns
