## Overview
Embeddings transform text into high-dimensional vectors:

- Semantic Search: Find similar content by meaning, not just keywords
- Document Retrieval: Retrieve relevant documents for RAG
- Clustering: Group similar documents together
- Similarity Comparison: Measure how similar two texts are
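To make these ideas concrete, here is a minimal sketch of similarity-by-meaning using cosine similarity. The 4-dimensional vectors and words below are made up for illustration; real embedding models produce vectors with hundreds to thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for real embeddings.
cat = [0.8, 0.6, 0.1, 0.0]
kitten = [0.7, 0.7, 0.2, 0.0]
invoice = [0.0, 0.1, 0.9, 0.8]

# Semantically related texts end up with closer vectors:
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```

The same comparison underlies semantic search, retrieval, and clustering: all of them rank or group texts by vector similarity rather than keyword overlap.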
## Supported Embedding Providers
Flowise supports multiple embedding providers:

### OpenAI Embeddings
Industry-standard embeddings with excellent performance.

Available Models:

- `text-embedding-3-large`: 3072 dimensions, best quality
- `text-embedding-3-small`: 1536 dimensions, faster and cheaper
- `text-embedding-ada-002`: 1536 dimensions, previous generation
Pricing:

- `text-embedding-3-large`: $0.13 per 1M tokens
- `text-embedding-3-small`: $0.02 per 1M tokens
- `text-embedding-ada-002`: $0.10 per 1M tokens
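Using the prices listed above, a small helper can estimate what embedding a corpus will cost (the 10M-token corpus size below is just an example):

```python
# Prices per 1M tokens, taken from the list above.
PRICE_PER_1M = {
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
    "text-embedding-ada-002": 0.10,
}

def estimate_cost(total_tokens: int, model: str) -> float:
    """Estimated embedding cost in USD for a given token count."""
    return total_tokens / 1_000_000 * PRICE_PER_1M[model]

# Embedding a 10M-token corpus:
print(estimate_cost(10_000_000, "text-embedding-3-small"))  # roughly $0.20
print(estimate_cost(10_000_000, "text-embedding-3-large"))  # roughly $1.30
```

At this price gap, `text-embedding-3-small` is usually the default choice unless retrieval quality measurably improves with the larger model.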
### Azure OpenAI Embeddings
OpenAI embeddings through Azure:

- Same models as OpenAI
- Enterprise security and compliance
- Private network deployment
- Regional data residency
### Cohere Embeddings
Multilingual embeddings with strong performance.

Available Models:

- `embed-english-v3.0`: English optimized
- `embed-multilingual-v3.0`: 100+ languages
- `embed-english-light-v3.0`: Faster, lower cost

Key Features:
- Compression support
- Input type specification (search_document, search_query)
- Fine-tuning capabilities
### HuggingFace Embeddings
Open-source embeddings for privacy and cost control.

Popular Models:

- `sentence-transformers/all-MiniLM-L6-v2`: 384 dimensions, fast
- `sentence-transformers/all-mpnet-base-v2`: 768 dimensions, high quality
- `intfloat/e5-large-v2`: 1024 dimensions, state-of-the-art

Deployment Options:
- HuggingFace Inference API (cloud)
- Self-hosted (via HuggingFace Inference Endpoints)
- Local execution
### Google Vertex AI Embeddings
Google Cloud’s embedding service.

Available Models:

- `textembedding-gecko@003`: 768 dimensions
- `textembedding-gecko-multilingual@001`: Multilingual support
- `text-embedding-preview-0409`: Preview models

Key Features:
- Google Cloud integration
- Enterprise-grade SLAs
- Global deployment
### Ollama Embeddings
Run embeddings locally with Ollama.

Available Models:

- `nomic-embed-text`: 768 dimensions, high quality
- `mxbai-embed-large`: 1024 dimensions
- `all-minilm`: 384 dimensions, lightweight

Key Features:
- Complete privacy (local execution)
- No API costs
- Offline capability
- Custom model support
### Additional Providers
- Mistral AI: European AI with competitive pricing
- Voyage AI: Optimized for retrieval tasks
- Jina AI: Multimodal embeddings (text + images)
- Together AI: Multiple open-source models
- AWS Bedrock: Amazon’s embedding service
- IBM Watsonx: Enterprise AI platform
## Using Embeddings in Flowise
### In Document Stores
Embeddings are essential for document stores: every chunk must be converted into a vector before it can be stored and searched.

### In Chatflows
Add embeddings directly to your chatflow:

- Drag an Embeddings node onto the canvas
- Configure the embeddings provider
- Connect to a Vector Store node
- Connect document loaders to the vector store
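Conceptually, the nodes above wire up an embed-store-retrieve pipeline. A minimal sketch of that pipeline, with a hypothetical `fake_embed` function standing in for a real provider node:

```python
import math

def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embeddings provider: a tiny normalized
    bag-of-letters vector. A real node calls OpenAI, Cohere, Ollama, etc."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# "Vector store": embed each loaded document and keep (vector, text) pairs.
docs = ["shipping policy for returns", "gpu setup instructions"]
store = [(fake_embed(d), d) for d in docs]

def retrieve(query: str) -> str:
    """Return the stored document most similar to the query."""
    q = fake_embed(query)
    return max(store, key=lambda p: sum(a * b for a, b in zip(p[0], q)))[1]

print(retrieve("how do i return a shipment"))
```

The chatflow does the same thing declaratively: the Embeddings node supplies the embed function, the Vector Store node holds the (vector, document) pairs, and retrieval runs the similarity ranking.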
## Configuring Embedding Providers
### OpenAI Embeddings
Model selection:

- `text-embedding-3-small` for most cases
- `text-embedding-3-large` for best quality
- `text-embedding-ada-002` for compatibility

### Cohere Embeddings
### HuggingFace Embeddings
### Ollama Embeddings
Make sure Ollama is running locally and the model is downloaded:
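For example, assuming the `nomic-embed-text` model (substitute any embedding model you prefer):

```shell
# Pull an embedding model (one-time download).
ollama pull nomic-embed-text

# Start the server if it is not already running (default port 11434).
ollama serve

# Sanity-check: request an embedding over the local REST API.
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}'
```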
## Embedding Dimensions
Different models produce different vector dimensions:

| Provider | Model | Dimensions | Use Case |
|---|---|---|---|
| OpenAI | text-embedding-3-small | 1536 | General purpose |
| OpenAI | text-embedding-3-large | 3072 | High accuracy |
| Cohere | embed-english-v3.0 | 1024 | English content |
| HuggingFace | all-MiniLM-L6-v2 | 384 | Fast retrieval |
| HuggingFace | all-mpnet-base-v2 | 768 | Quality balance |
| Ollama | nomic-embed-text | 768 | Local deployment |
| Google Vertex AI | textembedding-gecko | 768 | GCP integration |
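Because dimensions differ per model, a vector store built with one model cannot be queried with vectors from another. A small guard can catch this early; the `MODEL_DIMS` map below is an illustrative sketch built from the table above, not a Flowise API:

```python
# Expected dimensions per model, taken from the table above.
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "embed-english-v3.0": 1024,
    "all-MiniLM-L6-v2": 384,
    "nomic-embed-text": 768,
}

def check_vector(vector: list[float], model: str) -> list[float]:
    """Reject vectors whose length doesn't match the index's model."""
    expected = MODEL_DIMS[model]
    if len(vector) != expected:
        raise ValueError(
            f"dimension mismatch: got {len(vector)}, "
            f"{model} produces {expected}"
        )
    return vector

check_vector([0.0] * 384, "all-MiniLM-L6-v2")    # fine
# check_vector([0.0] * 768, "all-MiniLM-L6-v2")  # would raise ValueError
```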
### Dimension Reduction
Some providers support dimension reduction:

Benefits:

- Lower storage costs
- Faster similarity search
- Reduced memory usage

Trade-offs:

- Slightly lower accuracy
- Cannot increase dimensions later without re-embedding
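For models trained to support shortened outputs (for example, OpenAI's `text-embedding-3` models expose a `dimensions` parameter), reduction amounts to truncating the vector and re-normalizing. A sketch of that scheme, with the caveat that applying it to models not trained for truncation degrades quality badly:

```python
import math

def reduce_dims(vector: list[float], target: int) -> list[float]:
    """Truncate to the first `target` components and re-normalize to
    unit length. You cannot recover the dropped components later."""
    if target > len(vector):
        raise ValueError("cannot increase dimensions after the fact")
    head = vector[:target]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

v = [0.5, 0.5, 0.5, 0.5]  # toy 4-d unit vector
r = reduce_dims(v, 2)     # 2 components, still unit length
```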
## Best Practices
### Choosing an Embedding Model
Consider these factors:

For Production:

- OpenAI `text-embedding-3-small`: Best balance of cost/performance
- Cohere `embed-english-v3.0`: Great for English content
- Azure OpenAI: Enterprise requirements

For Development:

- HuggingFace models: No API costs
- Ollama: Local testing

For Privacy:

- Ollama: Complete data privacy
- Self-hosted HuggingFace: Control your infrastructure

For Multilingual:

- Cohere `embed-multilingual-v3.0`: 100+ languages
- Google Vertex AI: Strong multilingual support
### Input Optimization
Text Preprocessing: normalize whitespace and strip markup or boilerplate before embedding, so the vector reflects content rather than noise.

### Cost Optimization

Batch Processing: Embed multiple texts in a single request rather than one call per chunk; this reduces request overhead and helps stay under rate limits.

### Embedding Quality
#### Testing Embeddings
Test semantic similarity: embed pairs of texts and confirm that related pairs score higher than unrelated ones.

#### Evaluation Metrics
Retrieval Quality:

- Precision@K: Relevant docs in top K results
- Recall@K: Proportion of relevant docs retrieved
- MRR: Mean reciprocal rank of first relevant result
- NDCG: Normalized discounted cumulative gain
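The first two metrics are straightforward to compute by hand. A sketch with hypothetical document ids:

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1/rank of the first relevant result (0.0 if none retrieved)."""
    for i, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1 / i
    return 0.0

retrieved = ["d3", "d1", "d7", "d2"]  # ranked results, hypothetical ids
relevant = {"d1", "d2"}

print(precision_at_k(retrieved, relevant, 4))  # 0.5
print(reciprocal_rank(retrieved, relevant))    # 0.5 (first hit at rank 2)
```

MRR is the mean of `reciprocal_rank` over a set of test queries; NDCG additionally weights each relevant hit by a logarithmic position discount.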
## Troubleshooting
### Embeddings Fail to Generate
- Verify API key is valid
- Check rate limits
- Ensure text is not empty
- Review text length (max tokens)
### Poor Retrieval Quality
- Try different embedding models
- Adjust chunk size and overlap
- Improve text preprocessing
- Add metadata for filtering
### High Costs
- Switch to smaller models
- Enable caching
- Increase batch size
- Consider self-hosted options
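Caching is easy to sketch: keep a map from text to vector and only call the provider on a miss. The in-memory dict and the `embed_fn` stand-in below are illustrative; production setups typically persist the cache to Redis or disk.

```python
cache: dict[str, list[float]] = {}
calls = 0

def embed_fn(text: str) -> list[float]:
    """Stand-in for a paid provider call; counts invocations."""
    global calls
    calls += 1
    return [float(len(text))]  # dummy vector

def cached_embed(text: str) -> list[float]:
    """Only call the provider for texts not seen before."""
    if text not in cache:
        cache[text] = embed_fn(text)
    return cache[text]

cached_embed("hello")
cached_embed("hello")  # served from cache; no second provider call
print(calls)           # 1
```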
### Dimension Mismatch
- Ensure all vectors were generated with the same model
- Verify dimension configuration
- Recreate the vector store if the model changed
