Overview
Portkey enhances LlamaIndex applications with:

- Multi-Provider Support: Route to 250+ LLMs seamlessly
- Reliability: Automatic fallbacks and retries
- Performance: Smart caching for embeddings and completions
- Observability: Full logging and tracing for RAG pipelines
- Cost Optimization: Track and reduce token usage
Installation
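Assuming a pip-based setup, install the two packages as published on PyPI:

```shell
pip install -U llama-index portkey-ai
```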
Quick Start
LlamaIndex works with Portkey through the OpenAI-compatible interface:
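A minimal sketch, assuming you have a Portkey API key and a virtual key configured for OpenAI (the key values and virtual key name below are placeholders):

```python
# Point LlamaIndex's OpenAI LLM at the Portkey gateway.
from llama_index.llms.openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

llm = OpenAI(
    model="gpt-4o",
    api_key="placeholder",  # the real provider key lives in the Portkey virtual key
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        virtual_key="openai-virtual-key",  # placeholder virtual key
    ),
)

response = llm.complete("What is retrieval-augmented generation?")
print(response)
```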
Complete RAG Setup

Build a complete RAG application with Portkey:
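One way this can look, as a sketch: both the LLM and the embedding model are routed through the gateway, so every call is logged and cached centrally (`./data`, the model names, and the keys are placeholders):

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

headers = createHeaders(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # placeholder
)

# Route both the LLM and the embedding model through Portkey.
Settings.llm = OpenAI(
    model="gpt-4o",
    api_key="placeholder",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=headers,
)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key="placeholder",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=headers,
)

# Ingest, index, and query local documents.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("What are the key findings?"))
```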
Using Different Providers

Switch between providers easily:
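With virtual keys, switching providers is mostly a header change. One sketch: for non-OpenAI models, LlamaIndex's `OpenAILike` wrapper avoids OpenAI-specific model-name validation (all key names and models below are placeholders):

```python
from llama_index.llms.openai import OpenAI
from llama_index.llms.openai_like import OpenAILike
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

def portkey_headers(virtual_key: str) -> dict:
    # One Portkey API key, different virtual keys per provider.
    return createHeaders(api_key="YOUR_PORTKEY_API_KEY", virtual_key=virtual_key)

# OpenAI through Portkey
openai_llm = OpenAI(
    model="gpt-4o",
    api_key="placeholder",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers("openai-virtual-key"),
)

# Anthropic through the same OpenAI-compatible interface
anthropic_llm = OpenAILike(
    model="claude-3-5-sonnet-latest",
    is_chat_model=True,
    api_key="placeholder",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers("anthropic-virtual-key"),
)
```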
Advanced Routing

Fallback Configuration
Automatically fall back to backup providers:
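A sketch of a Portkey gateway config for fallbacks (virtual keys and model names are placeholders); it can be attached per request via `createHeaders(..., config=...)`:

```python
import json

# Try OpenAI first; if the call errors, retry it against Anthropic.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "openai-virtual-key"},
        {
            "virtual_key": "anthropic-virtual-key",
            "override_params": {"model": "claude-3-5-sonnet-latest"},
        },
    ],
}

# Serialized, this is what travels in the x-portkey-config header.
print(json.dumps(fallback_config))
```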
Load Balancing

Distribute traffic across multiple models:
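A sketch of a weighted load-balancing config (the 70/30 split and the virtual keys are placeholders):

```python
import json

# Send ~70% of traffic to one target and ~30% to another.
loadbalance_config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-virtual-key", "weight": 0.7},
        {
            "virtual_key": "anthropic-virtual-key",
            "weight": 0.3,
            "override_params": {"model": "claude-3-5-sonnet-latest"},
        },
    ],
}

print(json.dumps(loadbalance_config))
```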
Caching for Embeddings

Cache embeddings to reduce costs and improve performance:
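Portkey's gateway-level cache is enabled in the same config object; a sketch (`max_age` is in seconds, and the virtual key is a placeholder):

```python
import json

# "simple" serves exact-match repeats from cache; "semantic" also
# serves near-identical requests.
cache_config = {
    "cache": {"mode": "simple", "max_age": 86400},  # cache hits for 24 hours
    "virtual_key": "openai-virtual-key",  # placeholder
}

print(json.dumps(cache_config))
```

Embedding requests are highly repetitive during re-indexing, so even a simple exact-match cache can remove a large share of embedding spend.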
Chat Engine with Portkey

Build conversational applications:
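A sketch, assuming `Settings.llm` and `Settings.embed_model` are already routed through Portkey as in the RAG setup above (the document text is a stand-in):

```python
from llama_index.core import Document, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    [Document(text="Portkey routes LlamaIndex traffic to 250+ LLMs.")]
)

# The chat engine condenses follow-up questions using conversation history
# before retrieving context.
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")

print(chat_engine.chat("What does Portkey do?"))
print(chat_engine.chat("How many providers was that?"))  # uses chat history
```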
Streaming Responses

Enable streaming for real-time responses:
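A sketch: with `streaming=True`, the query engine yields tokens as they arrive instead of waiting for the full answer (the one-document index is a stand-in for a real corpus):

```python
from llama_index.core import Document, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    [Document(text="Streaming returns tokens incrementally.")]
)

query_engine = index.as_query_engine(streaming=True)
streaming_response = query_engine.query("Summarize the key findings.")

# Print tokens as they stream back through the gateway.
for token in streaming_response.response_gen:
    print(token, end="", flush=True)
print()
```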
Multi-Document Agents

Build agents that reason over multiple documents:
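One sketch using `ReActAgent`, as exposed in LlamaIndex's 0.10/0.11-era API (newer releases move agents to a workflow-based API); the tool names and paths are placeholders:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# One query-engine tool per document collection (placeholder paths).
tools = []
for name, path in [("handbook", "./data/handbook"), ("reports", "./data/reports")]:
    docs = SimpleDirectoryReader(path).load_data()
    engine = VectorStoreIndex.from_documents(docs).as_query_engine()
    tools.append(
        QueryEngineTool.from_defaults(
            query_engine=engine,
            name=name,
            description=f"Answers questions about the {name} documents.",
        )
    )

# The agent decides which document tool(s) to consult for each question,
# using the Portkey-routed Settings.llm.
agent = ReActAgent.from_tools(tools, verbose=True)
print(agent.chat("Compare what the handbook and the reports say about onboarding."))
```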
Observability and Monitoring

Track your RAG pipeline performance:
- Query latency
- Token usage per query
- Cache hit rates
- Error rates
- Cost per query
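Per-request trace IDs and metadata make these metrics filterable in the Portkey dashboard. A sketch of the raw headers involved (`createHeaders` can build the same thing; all values are placeholders):

```python
import json

# Extra Portkey headers attached to each request.
observability_headers = {
    "x-portkey-api-key": "YOUR_PORTKEY_API_KEY",
    "x-portkey-virtual-key": "openai-virtual-key",
    "x-portkey-trace-id": "rag-session-42",  # groups related calls into one trace
    "x-portkey-metadata": json.dumps(
        {"_user": "user-123", "app": "docs-qa", "index_version": "v3"}
    ),
}

print(observability_headers["x-portkey-trace-id"])
```

Using one trace ID per user session ties the embedding call, retrieval, and completion together in a single trace.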
Best Practices
Cache Embeddings
Enable caching for embeddings to avoid recomputing them for the same content.
Use Fallbacks for Production
Always configure fallback providers for your RAG pipeline (see Advanced Routing) so a single provider outage doesn't break retrieval or generation.
Track Query Performance
Add metadata to your requests so you can identify which queries are slow or expensive in the Portkey dashboard.
Optimize Chunk Sizes
Monitor token usage to optimize your chunking strategy and reduce costs.
Error Handling
Implement robust error handling:
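Portkey's gateway retries and fallbacks absorb transient provider errors, but application code should still guard the query path. A sketch (the helper name is ours):

```python
import logging

logger = logging.getLogger("rag")

def safe_query(query_engine, question: str,
               default: str = "Sorry, I couldn't answer that right now.") -> str:
    """Run a query, returning a fallback answer instead of raising."""
    try:
        return str(query_engine.query(question))
    except Exception:
        logger.exception("RAG query failed for %r", question)
        return default
```

Pairing this with a retry block in the gateway config (e.g. `{"retry": {"attempts": 3}}`) means transient errors are retried at the gateway before an exception ever reaches application code.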
Resources

Need help? Join our Discord community for support.