Embeddings convert text into vector representations that capture semantic meaning. Flowise supports embeddings from major providers and local models.
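Once texts are vectors, "semantic meaning" is compared numerically, most commonly with cosine similarity. A minimal sketch in plain TypeScript (not a Flowise API, just the underlying math):

```typescript
// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

console.log(cosineSimilarity([1, 0], [1, 0])) // → 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])) // → 0 (orthogonal)
```

Vector stores use exactly this kind of comparison (or dot product / Euclidean distance) to rank stored documents against a query embedding.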

Cloud Embedding Providers

OpenAI

text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

Azure OpenAI

OpenAI embeddings via Azure

Cohere

embed-english, embed-multilingual models

Google Gemini

Gemini embedding models

Google Vertex AI

Enterprise embedding models

AWS Bedrock

Titan and Cohere embeddings

Voyage AI

Specialized domain embeddings

Jina AI

Multilingual embedding models

Mistral AI

mistral-embed model

Self-Hosted & Local Embeddings

Ollama

Run embedding models locally

HuggingFace Inference

Access thousands of embedding models

LocalAI

Self-hosted embedding API

Together AI

Open-source embedding models

IBM Watsonx

Enterprise embedding models

Configuration Examples

OpenAI Embeddings

// OpenAI embedding configuration
{
  modelName: "text-embedding-3-small",
  stripNewLines: true,
  batchSize: 512,
  dimensions: 1536
}
Available Models:
  • text-embedding-3-small (1536 dims) - Best value
  • text-embedding-3-large (3072 dims) - Highest quality
  • text-embedding-ada-002 (1536 dims) - Legacy
Credential Setup:
  1. Get API key from platform.openai.com
  2. Add credential in Flowise:
    • Credential Type: OpenAI API
    • API Key: sk-...
Code Example:
// From OpenAIEmbedding.ts
const obj: Partial<OpenAIEmbeddingsParams> = {
  openAIApiKey,
  modelName: "text-embedding-3-small"
}

if (stripNewLines) obj.stripNewLines = stripNewLines
if (batchSize) obj.batchSize = parseInt(batchSize, 10)
if (dimensions) obj.dimensions = parseInt(dimensions, 10)

const model = new OpenAIEmbeddings(obj)

Azure OpenAI Embeddings

// Azure OpenAI configuration
{
  modelName: "text-embedding-ada-002",
  azureOpenAIApiDeploymentName: "my-embedding-deployment",
  azureOpenAIApiVersion: "2024-02-15-preview",
  batchSize: 512
}
Credential Setup:
  • Azure OpenAI API Key
  • Azure OpenAI API Instance Name
  • Azure OpenAI API Deployment Name
  • Azure OpenAI API Version

Cohere Embeddings

// Cohere configuration
{
  modelName: "embed-english-v3.0",
  inputType: "search_query" // or "search_document"
}
Available Models:
  • embed-english-v3.0 (1024 dims)
  • embed-multilingual-v3.0 (1024 dims)
  • embed-english-light-v3.0 (384 dims)
Input Types:
  • search_query - For queries
  • search_document - For documents being indexed
  • classification - For classification tasks
  • clustering - For clustering tasks
Get API key from cohere.ai

Ollama Embeddings (Local)

// Ollama configuration
{
  baseUrl: "http://localhost:11434",
  modelName: "nomic-embed-text"
}
Setup Steps:
  1. Install Ollama:
    curl -fsSL https://ollama.ai/install.sh | sh
    
  2. Pull embedding model:
    ollama pull nomic-embed-text
    
  3. Available models:
    • nomic-embed-text (768 dims) - Recommended
    • mxbai-embed-large (1024 dims)
    • all-minilm (384 dims)
  4. Test locally:
    curl http://localhost:11434/api/embeddings \
      -d '{"model": "nomic-embed-text", "prompt": "test"}'
    

HuggingFace Embeddings

// HuggingFace Inference configuration
{
  modelName: "sentence-transformers/all-MiniLM-L6-v2",
  apiKey: "hf_..."
}
Popular Models:
  • sentence-transformers/all-MiniLM-L6-v2 (384 dims)
  • sentence-transformers/all-mpnet-base-v2 (768 dims)
  • BAAI/bge-small-en-v1.5 (384 dims)
  • BAAI/bge-large-en-v1.5 (1024 dims)
Get API key from huggingface.co

Google Gemini Embeddings

// Google Generative AI configuration
{
  modelName: "embedding-001"
}
Get API key from makersuite.google.com

Voyage AI Embeddings

// Voyage AI configuration
{
  modelName: "voyage-2",
  inputType: "document" // or "query"
}
Available Models:
  • voyage-2 (1024 dims) - General purpose
  • voyage-code-2 (1536 dims) - Code embeddings
  • voyage-large-2 (1536 dims) - Highest quality

Mistral Embeddings

// Mistral AI configuration
{
  modelName: "mistral-embed"
}
Get API key from console.mistral.ai

Together AI Embeddings

// Together AI configuration
{
  modelName: "togethercomputer/m2-bert-80M-8k-retrieval"
}

Advanced Configuration

Custom Dimensions

OpenAI's text-embedding-3 models support a custom dimensions parameter:
// Reduce dimensions for storage efficiency
{
  modelName: "text-embedding-3-large",
  dimensions: 1024 // Original: 3072
}
Benefits:
  • Smaller storage requirements
  • Faster similarity search
  • Minimal quality loss

Batch Processing

Process multiple texts efficiently:
{
  batchSize: 512, // Process 512 texts per API call
  stripNewLines: true // Remove newlines before embedding
}
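If a provider rejects large payloads, you can also batch client-side before calling embedDocuments. A sketch of the idea (chunk and embedInBatches are our own helpers, not Flowise or LangChain APIs):

```typescript
// Split an array of texts into batches of at most `batchSize` items.
function chunk<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize))
  }
  return batches
}

// Embed batches sequentially to stay under payload and rate limits,
// preserving the original document order in the output.
async function embedInBatches(
  embed: (texts: string[]) => Promise<number[][]>,
  texts: string[],
  batchSize = 512
): Promise<number[][]> {
  const out: number[][] = []
  for (const batch of chunk(texts, batchSize)) {
    out.push(...(await embed(batch)))
  }
  return out
}
```

Pass your embedding client's embedDocuments method as the `embed` callback.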

Custom Base URLs

Use custom endpoints:
// OpenAI-compatible endpoint
{
  basePath: "https://api.custom-endpoint.com/v1"
}

Timeout Configuration

{
  timeout: 60000 // 60 seconds
}

Embedding Dimensions Guide

Provider      Model                     Dimensions   Use Case
OpenAI        text-embedding-3-small    1536         Balanced
OpenAI        text-embedding-3-large    3072         Best quality
Cohere        embed-english-v3.0        1024         General
Ollama        nomic-embed-text          768          Local
HuggingFace   all-MiniLM-L6-v2          384          Fast
Voyage        voyage-2                  1024         Specialized
Mistral       mistral-embed             1024         Multilingual

Choosing an Embedding Model

Best for Quality

  • OpenAI text-embedding-3-large - Highest MTEB score
  • Voyage voyage-large-2 - Domain-specialized
  • Cohere embed-english-v3.0 - Strong retrieval

Best for Speed

  • OpenAI text-embedding-3-small - Fast and cheap
  • Ollama nomic-embed-text - Local, no latency
  • HuggingFace all-MiniLM-L6-v2 - Small model

Best for Cost

  • Ollama - Free, run locally
  • HuggingFace - Free inference API
  • OpenAI text-embedding-3-small - $0.02 per 1M tokens

Best for Privacy

  • Ollama - 100% local
  • LocalAI - Self-hosted
  • HuggingFace (self-hosted) - Deploy yourself

Best for Multilingual

  • Cohere embed-multilingual-v3.0 - 100+ languages
  • Mistral mistral-embed - Multilingual support
  • HuggingFace paraphrase-multilingual - 50+ languages

Performance Optimization

Batch Size Optimization

// Optimize batch size for typical input length (pick one)
{ batchSize: 512 } // Short texts (< 500 chars)
{ batchSize: 256 } // Medium texts (500-2000 chars)
{ batchSize: 128 } // Long texts (> 2000 chars)

Caching Embeddings

Cache frequently used embeddings:
// Use a cache to avoid re-computing embeddings for the same text
// (LangChain JS API; exact import paths vary by version)
const underlying = new OpenAIEmbeddings()
const embeddings = CacheBackedEmbeddings.fromBytesStore(
  underlying,
  new InMemoryStore(),
  { namespace: underlying.modelName }
)

Dimensionality Reduction

// Use reduced dimensions for large datasets
{
  modelName: "text-embedding-3-large",
  dimensions: 1024 // Reduced from 3072
}
Reduction typically preserves most retrieval quality while:
  • Reducing storage by 67%
  • Speeding up search by 3x
  • Lowering costs
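With OpenAI's v3 models the API handles reduction when you pass the dimensions parameter. If you instead truncate vectors yourself, re-normalize them to unit length so cosine and dot-product search still behave. A sketch of that manual step (our own helper, not a Flowise API):

```typescript
// Truncate an embedding to `dims` components and re-normalize to
// unit length, so dot-product / cosine comparisons stay meaningful.
function truncateAndNormalize(vec: number[], dims: number): number[] {
  const cut = vec.slice(0, dims)
  const norm = Math.sqrt(cut.reduce((sum, x) => sum + x * x, 0))
  return cut.map((x) => x / norm)
}

console.log(truncateAndNormalize([3, 4, 12], 2)) // → [0.6, 0.8]
```

Note that plain truncation only works well for models trained to tolerate it (such as the text-embedding-3 family); for other models, re-embed at the target dimension instead.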

Code Examples

Basic Usage

// Embed a single query
const queryEmbedding = await embeddings.embedQuery(
  "What is machine learning?"
)

// Embed multiple documents
const docEmbeddings = await embeddings.embedDocuments([
  "Document 1 text",
  "Document 2 text",
  "Document 3 text"
])

With Vector Store

// Upsert documents with embeddings
await vectorStore.addDocuments([
  { pageContent: "text", metadata: { source: "doc.pdf" } }
])

// Search with query
const results = await vectorStore.similaritySearch(
  "search query",
  4 // top K
)

Custom Embedding Function

// Implement custom embeddings by extending the base Embeddings class
class CustomEmbeddings extends Embeddings {
  async embedDocuments(texts: string[]): Promise<number[][]> {
    // Your custom logic; must return one vector per input text
    return Promise.all(texts.map((text) => this.embedQuery(text)))
  }

  async embedQuery(text: string): Promise<number[]> {
    // Your custom logic, e.g. call your own embedding service
    const embedding: number[] = await callYourEmbeddingApi(text) // hypothetical helper
    return embedding
  }
}

Troubleshooting

Dimension Mismatch

Error: Vector dimension mismatch
Expected: 1536, Got: 768
Solution: Ensure embedding model dimensions match vector store:
// Check your embedding dimensions
OpenAI text-embedding-3-small: 1536
Ollama nomic-embed-text: 768

// Update vector store index accordingly
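A small runtime guard (our own helper, not a Flowise API) can catch mismatches before an upsert fails deep inside the vector store:

```typescript
// Throw early if embedding output doesn't match the vector store index.
function assertDimensions(vectors: number[][], expected: number): void {
  for (const v of vectors) {
    if (v.length !== expected) {
      throw new Error(
        `Vector dimension mismatch. Expected: ${expected}, Got: ${v.length}`
      )
    }
  }
}
```

Call it on the output of embedDocuments with the dimension your index was created with (e.g. 1536 for a text-embedding-3-small index).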

Rate Limits

// Handle rate limits with retries
{
  timeout: 60000,
  maxRetries: 3,
  batchSize: 100 // Smaller batches
}
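Beyond these options, a simple exponential-backoff wrapper (a sketch, not a Flowise built-in) can smooth out transient 429 responses:

```typescript
// Retry an async call with exponential backoff on failure.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt < maxRetries) {
        // Wait 500ms, 1s, 2s, ... before the next attempt.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt))
      }
    }
  }
  throw lastError
}
```

Wrap the embedding call itself, e.g. `withRetries(() => embeddings.embedDocuments(batch))`.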

Connection Issues

# Test OpenAI embeddings
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "test", "model": "text-embedding-3-small"}'

# Test Ollama
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "test"}'

Out of Memory

// Reduce batch size for large documents
{
  batchSize: 50 // Smaller batches
}

Best Practices

  1. Match dimensions - Embedding and vector store must match
  2. Batch processing - Use appropriate batch sizes
  3. Cache embeddings - Avoid re-computing same texts
  4. Choose right model - Balance quality vs. cost vs. speed
  5. Test locally first - Use Ollama for development
  6. Monitor costs - Track API usage and optimize

Next Steps

Vector Stores

Store embeddings in vector databases

Document Loaders

Load documents to embed
