Overview
Embeddings convert text into numerical vectors for similarity search, retrieval, and RAG applications. LiteLLM provides a unified interface for embeddings across OpenAI, Cohere, HuggingFace, and more.Quick Start
Basic Usage
- Single Text
- Multiple Texts
Providers
- OpenAI
- Cohere
- HuggingFace
- Ollama
- Azure OpenAI
Latest embedding models with high quality.
Dimensions Control
Some providers allow controlling output dimensions.Encoding Format
Batch Processing
Process large datasets efficiently.Similarity Search
RAG (Retrieval Augmented Generation)
Async Embeddings
Parallel Processing
Caching
Cache embeddings to reduce API calls.Usage Tracking
Error Handling
Model Comparison
| Model | Provider | Dimensions | Max Tokens | Use Case |
|---|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | 8191 | General purpose, fast |
| text-embedding-3-large | OpenAI | 3072 | 8191 | High quality |
| embed-english-v3.0 | Cohere | 1024 | - | Search, classification |
| all-MiniLM-L6-v2 | HuggingFace | 384 | 256 | Fast, local |
| bge-large-en-v1.5 | HuggingFace | 1024 | 512 | High quality |
| nomic-embed-text | Ollama | 768 | - | Local, privacy |
Best Practices
Model Selection
Model Selection
- Use
text-embedding-3-smallfor most use cases - Use
text-embedding-3-largefor highest quality - Use Cohere for specialized search applications
- Use Ollama for privacy-sensitive applications
Performance
Performance
- Batch texts when possible (up to 100-2000 depending on provider)
- Use async for concurrent requests
- Cache embeddings for frequently used texts
- Consider using smaller dimensions if storage is a concern
Cost Optimization
Cost Optimization
- Use smaller models when quality difference is minimal
- Reduce dimensions to save storage and compute
- Cache embeddings to avoid re-computing
- Batch process to reduce API overhead
Quality
Quality
- Normalize text before embedding
- Keep consistent text format across corpus
- Use same model for queries and documents
- Test multiple models for your specific use case
Advanced Patterns
- Hybrid Search
- Embedding Store
Combine embeddings with keyword search.