What is RAG?
RAG workflows:
- Ingest documents into a vector database
- Retrieve relevant chunks based on user queries
- Augment LLM prompts with retrieved context
- Generate responses grounded in your data
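The augment step in the list above can be sketched in a few lines. This is a minimal illustration rather than code from the repository: `build_prompt`, its template wording, and the `max_chars` budget are all hypothetical.

```python
def build_prompt(question: str, chunks: list[str], max_chars: int = 2000) -> str:
    """Place retrieved chunks ahead of the question so the model answers from them."""
    context_parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_chars:
            break  # stay within the (assumed) context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "Which database stores the embeddings?",
    ["Embeddings are stored in LanceDB.", "Agents query the store at answer time."],
)
print(prompt)
```

Numbering the chunks lets the model cite which passage supported an answer, and the character budget keeps the prompt from growing without bound when many chunks are retrieved.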
Basic RAG Pattern
Simple RAG with Agno
From rag_apps/agentic_rag/main.py:
How It Works
- Ingestion: knowledge_base.load() downloads the URLs, chunks the content, and generates embeddings
- Storage: Embeddings are stored in a LanceDB vector database
- Retrieval: When the user submits a query, relevant chunks are retrieved via vector search
- Generation: The agent uses the retrieved context to generate a response grounded in those chunks
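The storage and retrieval steps above can be illustrated without any external service. The sketch below is a toy stand-in, not the Agno or LanceDB API: `embed` uses word counts instead of a real embedding model, and `InMemoryStore` is a hypothetical class that plays the role of the vector database.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Hypothetical stand-in for a vector database such as LanceDB."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        # Ingestion + storage: embed each chunk and keep it next to its text.
        for chunk in chunks:
            self.rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[tuple[float, str]]:
        # Retrieval: score every stored chunk against the query, best first.
        q = embed(query)
        scored = [(cosine(q, vec), chunk) for chunk, vec in self.rows]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


store = InMemoryStore()
store.add([
    "LanceDB is an embedded vector database.",
    "Streamlit builds interactive web apps.",
    "Vector search finds chunks similar to the query.",
])
hits = store.search("What is a vector database?", k=1)
print(hits[0][1])
```

The generation step is left out on purpose: the retrieved text would be passed to the LLM as context.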
Contextual AI RAG
Contextual AI provides advanced RAG with document understanding and contextual embeddings. From rag_apps/contextual_ai_rag/main.py:
Enhanced RAG with Multiple Models
Agentic RAG with Web Search
Combine RAG with web search for up-to-date information. From rag_apps/agentic_rag_with_web_search/:
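As an illustration only (this is not the code from that directory), one simple way to combine the two sources is to fall back to web search when the knowledge base returns no confident match. In an agentic setup the model usually decides which tool to call; the fixed `min_score` threshold here is a simplifying assumption, and `retrieve_with_fallback`, `fake_knowledge`, and `fake_web` are hypothetical names.

```python
from typing import Callable

Hit = tuple[float, str]  # (similarity score, passage text)


def retrieve_with_fallback(
    query: str,
    search_knowledge: Callable[[str], list[Hit]],
    search_web: Callable[[str], list[str]],
    min_score: float = 0.5,
) -> tuple[str, list[str]]:
    """Return (source, passages): local chunks when confident, web results otherwise."""
    hits = [h for h in search_knowledge(query) if h[0] >= min_score]
    if hits:
        return "knowledge_base", [text for _, text in hits]
    return "web", search_web(query)


# Stub retrievers so the sketch runs without network access.
def fake_knowledge(query: str) -> list[Hit]:
    if "MCP" in query:
        return [(0.82, "MCP servers expose tools to clients.")]
    return [(0.1, "unrelated")]


def fake_web(query: str) -> list[str]:
    return [f"web result for: {query}"]


print(retrieve_with_fallback("What is MCP?", fake_knowledge, fake_web))
print(retrieve_with_fallback("Latest release notes?", fake_knowledge, fake_web))
```

Returning the source label alongside the passages makes it easy to tell the user whether an answer came from indexed documents or from a live search.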
LlamaIndex RAG
LlamaIndex provides powerful document processing and indexing. From rag_apps/llamaIndex_starter/:
Advanced LlamaIndex Features
PDF RAG with OCR
Process PDFs with images, charts, and complex layouts.
Gemma OCR RAG
From rag_apps/gemma_ocr/:
Chat with Code
RAG optimized for code repositories. From rag_apps/chat_with_code/:
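As an illustration only (this is not the code from that directory), one way to adapt RAG to source code is to chunk along syntactic boundaries instead of fixed character windows. The sketch below uses Python's standard `ast` module to emit one chunk per top-level function or class; `chunk_python_source` and the chunk fields are hypothetical.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                # Exact source text of the definition, suitable for embedding.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


sample = '''import os

def load(path):
    return open(path).read()

class Index:
    def add(self, doc):
        pass
'''

for chunk in chunk_python_source(sample):
    print(chunk["kind"], chunk["name"], chunk["start_line"])
```

Keeping the symbol name and starting line with each chunk lets the answer point back to a specific place in the repository.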
Streamlit RAG UI
Build interactive RAG applications with Streamlit.
Best Practices
1. Chunking Strategy
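Overlapping fixed-size windows are one common chunking approach: the overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks. A minimal sketch, assuming character-based sizes; `chunk_text` and its defaults are hypothetical and should be tuned to your documents and embedding model.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij" * 5, chunk_size=20, overlap=5)
print(len(chunks), [len(c) for c in chunks])
```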
2. Metadata Enrichment
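Attaching metadata to each chunk at ingestion time makes it possible to filter results before or after vector search. The sketch below is illustrative: the `Chunk` dataclass, the `enrich` and `filter_by` helpers, and the field names (`source`, `section`, `position`) are assumptions, not a schema from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def enrich(chunks: list[str], source: str, section: str) -> list[Chunk]:
    """Wrap raw chunk text with the metadata needed for filtering and citation."""
    return [
        Chunk(text=t, metadata={"source": source, "section": section, "position": i})
        for i, t in enumerate(chunks)
    ]


def filter_by(chunks: list[Chunk], **criteria) -> list[Chunk]:
    """Keep only chunks whose metadata matches every given key/value pair."""
    return [c for c in chunks if all(c.metadata.get(k) == v for k, v in criteria.items())]


docs = enrich(["Install with pip.", "Run the server."], source="README.md", section="setup")
docs += enrich(["Call the API."], source="guide.md", section="usage")
print([c.text for c in filter_by(docs, source="README.md")])
```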
3. Hybrid Search
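Hybrid search combines keyword and vector results. One widely used way to merge the two ranked lists is reciprocal rank fusion (RRF), sketched below; this is an assumed choice of fusion method, and the document IDs are made up for the example. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one list, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a keyword (BM25-style) index
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from embedding similarity
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF only uses ranks, it needs no score normalisation between the keyword and vector retrievers.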
4. Re-ranking
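Re-ranking retrieves a generous candidate set cheaply, then reorders it with a more precise scorer before passing the top few passages to the LLM. In the sketch below the scorer is a toy term-coverage function so the example runs on its own; in practice `score_fn` would typically wrap a cross-encoder or a hosted re-ranking model. All names here are hypothetical.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_coverage(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query terms found in the passage."""
    q = tokenize(query)
    return len(q & tokenize(passage)) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float] = term_coverage,
    top_n: int = 2,
) -> list[str]:
    """Reorder first-stage candidates by score_fn and keep the best top_n."""
    ordered = sorted(candidates, key=lambda p: score_fn(query, p), reverse=True)
    return ordered[:top_n]


candidates = [
    "Qdrant is a vector database.",
    "LanceDB runs embedded with no server.",
    "Pinecone is a managed vector database service.",
]
print(rerank("embedded database with no server", candidates, top_n=1))
```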
Real-World Examples
MCP Documentation RAG
Location: rag_apps/agentic_rag/
RAG over MCP documentation with Arize Phoenix observability.
Resume Optimizer
Location: rag_apps/resume_optimizer/
RAG for resume optimization against job descriptions.
Conference CFP Generator
Location: advance_ai_agents/conference_agnositc_cfp_generator/
RAG with vector search over conference data for CFP generation.
Vector Database Comparison
| Database | Best For | Pros | Cons |
|---|---|---|---|
| LanceDB | Local dev, prototypes | Embedded, no server, fast | Limited scale |
| Qdrant | Production, scale | Fast, scalable, open source | Requires hosting |
| Pinecone | Managed cloud | Fully managed, easy | Cost, vendor lock-in |
| Weaviate | ML features | Rich features, GraphQL | Complex setup |
| ChromaDB | Simplicity | Easy API, local/cloud | Less mature |
Next Steps
Memory Systems
Add persistent memory to RAG agents
MCP Integration
Connect RAG to external data sources
Multi-Agent Patterns
Combine RAG with multi-agent workflows
Best Practices
Production RAG patterns and optimization