# RAG Pipeline Optimization Tutorial
Learn how to optimize Retrieval-Augmented Generation (RAG) systems using GEPA's Generic RAG Adapter. This tutorial covers optimization across multiple vector stores, including ChromaDB, Weaviate, Qdrant, Milvus, and LanceDB.

## Overview

The Generic RAG Adapter enables you to:

- Optimize query reformulation, context synthesis, and answer generation prompts
- Switch between vector stores with a single flag
- Evaluate both retrieval quality (precision, recall, MRR) and generation quality (F1, BLEU, faithfulness)
- Deploy optimized prompts in production
## Setup Vector Store

Choose and initialize a vector store. ChromaDB is the easiest to start with; other options are covered in the Vector Store Comparison section below.
## Create Initial Prompts

Define baseline prompts to optimize; GEPA will evolve them into task-specific, highly effective prompts.
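For example, a set of baseline prompts for the three stages named in the Overview might look like the sketch below. The component keys are illustrative; check the adapter's documentation for the exact names it expects.

```python
# Hypothetical baseline prompts for the three optimizable RAG stages.
seed_prompts = {
    "query_reformulation": (
        "Rewrite the user question into a concise search query."
    ),
    "context_synthesis": (
        "Summarize the retrieved passages, keeping only facts "
        "relevant to the question."
    ),
    "answer_generation": (
        "Answer the question using only the synthesized context. "
        "If the context is insufficient, say so."
    ),
}
```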
## Run GEPA Optimization

Optimize the RAG prompts with GEPA. Typical improvements:
- Initial score: 0.35-0.50
- Optimized score: 0.60-0.80
- Improvement: +0.1 to +0.4 points
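A sketch of the optimization call is shown below. It assumes GEPA's top-level `gepa.optimize` entry point; the adapter construction, dataset format, and keyword names are illustrative and should be checked against the current GEPA documentation.

```python
# Illustrative baseline prompts (a real run would use the full prompt set).
baseline_prompts = {
    "query_reformulation": "Rewrite the user question into a search query.",
    "answer_generation": "Answer the question from the retrieved context.",
}

def optimize_rag_prompts(adapter, trainset, valset):
    """Run GEPA over the RAG prompts. `adapter` is assumed to be a
    Generic RAG Adapter instance wired to your vector store."""
    import gepa  # assumed installed: pip install gepa

    result = gepa.optimize(
        seed_candidate=baseline_prompts,
        trainset=trainset,
        valset=valset,
        adapter=adapter,
        max_metric_calls=150,  # evaluation budget; tune to your compute
    )
    return result.best_candidate
```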
## Complete Working Example

Use the unified RAG optimization script.

## Vector Store Comparison
### ChromaDB

Best for: local development, prototyping

- ✅ No Docker required
- ✅ Simple setup
- ✅ Local persistent storage
- Use: `--vector-store chromadb`
### LanceDB

Best for: serverless deployments

- ✅ No Docker required
- ✅ Developer-friendly
- ✅ Columnar format performance
- Use: `--vector-store lancedb`
### Qdrant

Best for: high performance, metadata filtering

- ✅ No Docker (in-memory mode)
- ✅ Advanced metadata filtering
- ✅ Payload search
- Use: `--vector-store qdrant`
### Weaviate

Best for: production, hybrid search

- ⚠️ Requires Docker
- ✅ Hybrid semantic + keyword search
- ✅ Production-ready clustering
- Use: `--vector-store weaviate`
## Evaluation Metrics

The RAG adapter tracks comprehensive metrics.

### Retrieval Quality
- Precision: Fraction of retrieved documents that are relevant
- Recall: Fraction of relevant documents that were retrieved
- F1 Score: Harmonic mean of precision and recall
- MRR: Mean Reciprocal Rank for ranking quality
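As a reference, the four retrieval metrics above can be computed from a ranked retrieval list and a set of gold document IDs; this is a minimal sketch, not the adapter's internal implementation.

```python
def retrieval_metrics(retrieved, relevant):
    """Compute precision, recall, F1, and MRR.

    retrieved: ranked list of retrieved doc ids
    relevant:  set of gold (relevant) doc ids
    """
    hits = [d for d in retrieved if d in relevant]
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    # MRR: reciprocal rank of the first relevant document
    mrr = 0.0
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            mrr = 1.0 / rank
            break
    return {"precision": precision, "recall": recall, "f1": f1, "mrr": mrr}
```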
### Generation Quality
- Token F1: Token overlap with the ground truth
- BLEU Score: N-gram similarity measure
- Answer Relevance: How well the answer relates to the context
- Faithfulness: How well the answer is supported by the context
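Of these, token F1 is the simplest to illustrate; a SQuAD-style sketch (not the adapter's exact implementation) is:

```python
from collections import Counter

def token_f1(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between a generated answer and the ground truth."""
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    # Multiset intersection counts each shared token at most min(count) times
    overlap = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(overlap.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```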
## Combined Score

The combined score is a weighted blend of the retrieval and generation metrics, configured via `rag_config`.
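As an illustration, a weighting configuration might look like the fragment below. The `retrieval_weight` key is an assumed counterpart to the `generation_weight` option mentioned under Troubleshooting; check the adapter's documentation for the exact schema.

```yaml
rag_config:
  retrieval_weight: 0.4   # weight on precision/recall/MRR (assumed key name)
  generation_weight: 0.6  # weight on F1/BLEU/faithfulness
```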
## Optimizable Components

You can optimize multiple RAG components simultaneously.

## Advanced Configuration
### Hybrid Search (Weaviate)

### Metadata Filtering

## Production Deployment
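For deployment, the optimized prompts can be persisted and reloaded at serving time. This is a minimal sketch; the file name and prompt structure are illustrative, not a fixed GEPA format.

```python
import json
from pathlib import Path

def save_prompts(prompts: dict, path: str = "optimized_prompts.json") -> None:
    """Persist the optimized prompt candidate as JSON."""
    Path(path).write_text(json.dumps(prompts, indent=2))

def load_prompts(path: str = "optimized_prompts.json") -> dict:
    """Load previously optimized prompts for use in the serving pipeline."""
    return json.loads(Path(path).read_text())
```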
## Troubleshooting

### Vector store connection errors

- ChromaDB: no external dependencies; should work out of the box.
- Weaviate: ensure Docker is running.
- Qdrant: in-memory mode requires no setup; for server mode, start a local server (e.g. `docker run -p 6333:6333 qdrant/qdrant`).
### Low retrieval scores

- Increase `top_k` to retrieve more documents
- Check that `relevant_doc_ids` in the training data are correct
- Ensure documents are properly indexed in the vector store
- Try different retrieval strategies (similarity vs. hybrid)
### Poor generation quality

- Verify ground truth answers are high quality
- Increase `generation_weight` in the config
- Use a stronger LLM for generation
- Add more diverse training examples
## Next Steps

- **RAG Adapter API**: complete API reference for the Generic RAG Adapter
- **Vector Store Guide**: detailed setup for all supported vector stores
- **Agent Architecture**: optimize entire agent systems beyond RAG
- **Production Examples**: real-world RAG deployments using GEPA