Overview
Generate embeddings for text using any of LiteLLM’s supported embedding providers. Returns responses in OpenAI format.
Function Signature
def embedding(
    model: str,
    input: Union[str, List[str]] = [],
    # Optional params
    dimensions: Optional[int] = None,
    encoding_format: Optional[str] = None,
    timeout: float = 600,
    # API configuration
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    api_type: Optional[str] = None,
    # LiteLLM specific
    caching: bool = False,
    user: Optional[str] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs
) -> EmbeddingResponse
Parameters
Required Parameters
model
str
required
The embedding model to use. Examples:
text-embedding-3-small (OpenAI)
text-embedding-ada-002 (OpenAI)
amazon.titan-embed-text-v1 (Bedrock)
textembedding-gecko@003 (Vertex AI)
embed-english-v3.0 (Cohere)
input
Union[str, List[str]]
required
Input text to embed. Can be a single string or an array of strings.
# Single string
input = "The quick brown fox"
# Multiple strings
input = ["First text", "Second text", "Third text"]
Optional Parameters
dimensions
Optional[int]
Number of dimensions for the output embeddings. Only supported by some models (e.g., text-embedding-3 and later).
dimensions = 512  # Reduce from the default 1536
encoding_format
Optional[str]
Format to return embeddings in. Options:
"float": Array of floats (default)
"base64": Base64-encoded string
timeout
float
Request timeout in seconds (default: 600, i.e. 10 minutes).
user
Optional[str]
A unique identifier for your end user, for abuse monitoring.
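When encoding_format="base64" is requested, OpenAI-compatible endpoints return each embedding as a base64-encoded buffer of little-endian float32 values rather than a JSON array. A minimal decoding sketch (the helper name is illustrative, not part of LiteLLM):

```python
import base64
import struct

def decode_base64_embedding(data: str) -> list:
    # Unpack a base64 string into little-endian float32 values,
    # the packing used by OpenAI-compatible embedding endpoints.
    raw = base64.b64decode(data)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a synthetic vector (no API call is made)
vec = [0.1, -0.2, 0.3]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_base64_embedding(encoded)
```

The decoded values match the originals to float32 precision.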
API Configuration
api_key
Optional[str]
API key for the provider. If not provided, LiteLLM falls back to environment variables.
api_base
Optional[str]
Base URL for the API endpoint.
api_version
Optional[str]
API version to use (provider-specific).
api_type
Optional[str]
API type (e.g., "azure" for Azure OpenAI).
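When api_key is omitted, credentials are read from environment variables. The variable names below follow common provider conventions; check each provider's page for the exact names it expects:

```python
import os

# Placeholder values shown for illustration only
os.environ["OPENAI_API_KEY"] = "sk-..."           # OpenAI (and azure/ models use AZURE_API_KEY)
os.environ["COHERE_API_KEY"] = "your-cohere-key"  # Cohere
os.environ["AWS_REGION_NAME"] = "us-east-1"       # Bedrock region
```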
LiteLLM Specific
caching
bool
Whether to cache the response (default False); requires a configured litellm.cache.
custom_llm_provider
Optional[str]
Override automatic provider detection. Example: custom_llm_provider="bedrock"
metadata
Optional[dict]
Additional metadata to tag the request.
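A sketch of enabling the caching flag, assuming LiteLLM's in-memory Cache from litellm.caching (verify the import path against the caching docs for your installed version):

```python
import litellm
from litellm.caching import Cache

# Configure an in-memory cache; identical embedding requests made with
# caching=True are then served from the cache instead of the provider.
litellm.cache = Cache()

# response = litellm.embedding(
#     model="text-embedding-3-small",
#     input="Cached text",
#     caching=True,
# )
```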
Response
EmbeddingResponse
object
Object type, always "list".
data
List of embedding objects. Each object contains:
object: Object type, always "embedding".
embedding: The embedding vector as an array of floats.
index: Index of the embedding in the list.
model
Model used to generate the embeddings.
usage
Token usage information, including the number of tokens in the input.
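The fields above follow the OpenAI embedding response format. A static illustration of the shape (made-up values, no API call):

```python
# Illustrative OpenAI-format embedding response
response_shape = {
    "object": "list",                          # always "list"
    "data": [
        {
            "object": "embedding",             # always "embedding"
            "embedding": [0.12, -0.45, 0.78],  # the vector
            "index": 0,                        # position in the input batch
        }
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 4, "total_tokens": 4},
}
```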
Usage Examples
Basic Embedding
import litellm

response = litellm.embedding(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)
print(response.data[0].embedding)       # [0.123, -0.456, ...]
print(len(response.data[0].embedding))  # 1536
Batch Embeddings
import litellm

texts = [
    "First document to embed",
    "Second document to embed",
    "Third document to embed"
]
response = litellm.embedding(
    model="text-embedding-3-small",
    input=texts
)
for i, embedding_obj in enumerate(response.data):
    print(f"Document {i}: {len(embedding_obj.embedding)} dimensions")
Async Embeddings
import litellm
import asyncio

async def main():
    response = await litellm.aembedding(
        model="text-embedding-3-small",
        input="Async embedding example"
    )
    print(response.data[0].embedding)

asyncio.run(main())
Custom Dimensions
import litellm

# Reduce embedding dimensions for smaller storage
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Sample text",
    dimensions=512  # Instead of the default 1536
)
print(len(response.data[0].embedding))  # 512
Multiple Providers
import litellm

# OpenAI
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Hello world"
)

# Cohere
response = litellm.embedding(
    model="embed-english-v3.0",
    input="Hello world"
)

# AWS Bedrock
response = litellm.embedding(
    model="amazon.titan-embed-text-v1",
    input="Hello world"
)

# Azure OpenAI
response = litellm.embedding(
    model="azure/text-embedding-ada-002",
    input="Hello world",
    api_key="your-azure-key",
    api_base="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-01"
)

# Vertex AI
response = litellm.embedding(
    model="textembedding-gecko@003",
    input="Hello world"
)
Semantic Search Example
import litellm
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents to search
documents = [
    "Python is a programming language",
    "JavaScript is used for web development",
    "Machine learning is a subset of AI"
]

# Get embeddings for all documents
response = litellm.embedding(
    model="text-embedding-3-small",
    input=documents
)
doc_embeddings = [item.embedding for item in response.data]

# Query
query = "What is Python?"
query_response = litellm.embedding(
    model="text-embedding-3-small",
    input=query
)
query_embedding = query_response.data[0].embedding

# Find the most similar document
similarities = [
    cosine_similarity(query_embedding, doc_emb)
    for doc_emb in doc_embeddings
]
most_similar_idx = np.argmax(similarities)
print(f"Most similar: {documents[most_similar_idx]}")
print(f"Similarity: {similarities[most_similar_idx]:.4f}")
Provider-Specific Examples
Cohere Input Types
import litellm

response = litellm.embedding(
    model="embed-english-v3.0",
    input="Sample text",
    input_type="search_document"  # or "search_query", "classification"
)
Vertex AI Multimodal Embeddings
import litellm

response = litellm.embedding(
    model="multimodalembedding@001",
    input="Sample text"
)
Error Handling
import litellm
from litellm import AuthenticationError, RateLimitError

try:
    response = litellm.embedding(
        model="text-embedding-3-small",
        input="Sample text"
    )
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
Supported Providers
LiteLLM supports embeddings from:
OpenAI : text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
Azure OpenAI : All OpenAI embedding models
Cohere : embed-english-v3.0, embed-multilingual-v3.0
AWS Bedrock : amazon.titan-embed-text-v1, cohere.embed-*
Google Vertex AI : textembedding-gecko, text-embedding-004
Hugging Face : All embedding models
Voyage AI : voyage-2, voyage-code-2
Together AI : togethercomputer/m2-bert-80M-*
And many more!
See Embedding Providers for the complete list.