Overview

Generate embeddings for text using any of LiteLLM’s supported embedding providers. Returns responses in OpenAI format.

Function Signature

def embedding(
    model: str,
    input: Union[str, List[str]] = [],
    # Optional params
    dimensions: Optional[int] = None,
    encoding_format: Optional[str] = None,
    timeout: float = 600,
    # API configuration
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    api_type: Optional[str] = None,
    # LiteLLM specific
    caching: bool = False,
    user: Optional[str] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs
) -> EmbeddingResponse

Parameters

Required Parameters

model
string
required
The embedding model to use. Examples:
  • text-embedding-3-small (OpenAI)
  • text-embedding-ada-002 (OpenAI)
  • amazon.titan-embed-text-v1 (Bedrock)
  • textembedding-gecko@003 (Vertex AI)
  • embed-english-v3.0 (Cohere)
input
Union[str, List[str]]
required
Input text to embed. Can be a single string or array of strings.
# Single string
input="The quick brown fox"

# Multiple strings
input=["First text", "Second text", "Third text"]

Optional Parameters

dimensions
int
Number of dimensions for the output embeddings. Supported only by some models (e.g., the text-embedding-3 family).
dimensions=512  # Reduce from default 1536
encoding_format
string
default:"float"
Format to return embeddings in. Options:
  • "float": Array of floats
  • "base64": Base64 encoded string
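With `encoding_format="base64"`, the embedding arrives as a base64-encoded binary buffer rather than a JSON array. A minimal decoding sketch, assuming the buffer is packed as little-endian float32 values (the layout OpenAI-compatible endpoints use):

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list:
    """Decode a base64-encoded embedding into a list of floats.

    Assumes little-endian float32 packing, as returned by
    OpenAI-compatible endpoints.
    """
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip demo with a hand-built buffer:
buf = struct.pack("<3f", 0.1, -0.2, 0.3)
vec = decode_base64_embedding(base64.b64encode(buf).decode())
print(vec)  # three float32 values close to [0.1, -0.2, 0.3]
```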
timeout
float
default:"600"
Request timeout in seconds (default 10 minutes).
user
string
Unique identifier for your end-user, for abuse monitoring.

API Configuration

api_key
string
API key for the provider. If not provided, uses environment variables.
api_base
string
Base URL for the API endpoint.
api_version
string
API version to use (provider-specific).
api_type
string
API type (e.g., "azure" for Azure OpenAI).

LiteLLM Specific

caching
bool
default:"false"
Enable response caching.
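Caching returns a stored response for a repeated request instead of calling the provider again. Conceptually it behaves like memoization keyed on the (model, input) pair; the sketch below illustrates that idea with a hypothetical `embed_fn` stand-in, and is not LiteLLM's actual cache implementation:

```python
from typing import Callable, Dict, List, Tuple

def make_cached_embedder(embed_fn: Callable[[str, str], List[float]]):
    """Wrap an embedding function with a simple in-memory cache.

    The cache key is the (model, input) pair, mirroring the repeated
    work that `caching=True` avoids.
    """
    cache: Dict[Tuple[str, str], List[float]] = {}

    def cached(model: str, text: str) -> List[float]:
        key = (model, text)
        if key not in cache:
            cache[key] = embed_fn(model, text)
        return cache[key]

    return cached

# Demo with a stub "provider" that counts how often it is called:
calls = {"n": 0}
def fake_embed(model, text):
    calls["n"] += 1
    return [float(len(text))]

embed = make_cached_embedder(fake_embed)
embed("text-embedding-3-small", "hello")
embed("text-embedding-3-small", "hello")  # served from cache
print(calls["n"])  # 1
```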
custom_llm_provider
string
Override automatic provider detection. Example: custom_llm_provider="bedrock"
metadata
dict
Additional metadata to tag the request.

Response

EmbeddingResponse

object
string
Object type, always "list".
data
List[Embedding]
List of embedding objects.
model
string
Model used for embeddings.
usage
Usage
Token usage information.
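The response mirrors OpenAI's embedding format. A hand-built illustration of that shape (all values are placeholders), which can be useful when inspecting raw JSON:

```python
# OpenAI-format embedding response shape (placeholder values):
response_json = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, -0.2, 0.3]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 5, "total_tokens": 5},
}

first_vector = response_json["data"][0]["embedding"]
tokens_used = response_json["usage"]["total_tokens"]
print(len(first_vector), tokens_used)  # 3 5
```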

Usage Examples

Basic Embedding

import litellm

response = litellm.embedding(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)

print(response.data[0].embedding)  # [0.123, -0.456, ...]
print(len(response.data[0].embedding))  # 1536

Batch Embeddings

import litellm

texts = [
    "First document to embed",
    "Second document to embed",
    "Third document to embed"
]

response = litellm.embedding(
    model="text-embedding-3-small",
    input=texts
)

for i, embedding_obj in enumerate(response.data):
    print(f"Document {i}: {len(embedding_obj.embedding)} dimensions")

Async Embeddings

import litellm
import asyncio

async def main():
    response = await litellm.aembedding(
        model="text-embedding-3-small",
        input="Async embedding example"
    )
    print(response.data[0].embedding)

asyncio.run(main())

Custom Dimensions

import litellm

# Reduce embedding dimensions for smaller storage
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Sample text",
    dimensions=512  # Instead of default 1536
)

print(len(response.data[0].embedding))  # 512
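For models that do not accept a `dimensions` parameter, a common workaround is to truncate the vector client-side and re-normalize it so cosine similarity still behaves. This works well for models trained for truncation (such as the text-embedding-3 family); for other models it is a lossy approximation. A minimal sketch:

```python
import math
from typing import List

def truncate_and_normalize(vec: List[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale to unit length.

    Re-normalizing keeps dot products usable as cosine similarity
    after truncation.
    """
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

vec = [3.0, 4.0, 12.0]
short = truncate_and_normalize(vec, 2)
print(short)  # [0.6, 0.8] — unit length
```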

Multiple Providers

import litellm

# OpenAI
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Hello world"
)

# Cohere
response = litellm.embedding(
    model="embed-english-v3.0",
    input="Hello world"
)

# AWS Bedrock
response = litellm.embedding(
    model="amazon.titan-embed-text-v1",
    input="Hello world"
)

# Azure OpenAI
response = litellm.embedding(
    model="azure/text-embedding-ada-002",
    input="Hello world",
    api_key="your-azure-key",
    api_base="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-01"
)

# Vertex AI
response = litellm.embedding(
    model="textembedding-gecko@003",
    input="Hello world"
)

Semantic Search Example

import litellm
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents to search
documents = [
    "Python is a programming language",
    "JavaScript is used for web development",
    "Machine learning is a subset of AI"
]

# Get embeddings for all documents
response = litellm.embedding(
    model="text-embedding-3-small",
    input=documents
)

doc_embeddings = [item.embedding for item in response.data]

# Query
query = "What is Python?"
query_response = litellm.embedding(
    model="text-embedding-3-small",
    input=query
)
query_embedding = query_response.data[0].embedding

# Find most similar document
similarities = [
    cosine_similarity(query_embedding, doc_emb) 
    for doc_emb in doc_embeddings
]

most_similar_idx = np.argmax(similarities)
print(f"Most similar: {documents[most_similar_idx]}")
print(f"Similarity: {similarities[most_similar_idx]:.4f}")

Provider-Specific Examples

Cohere with Input Type

import litellm

response = litellm.embedding(
    model="embed-english-v3.0",
    input="Sample text",
    input_type="search_document"  # or "search_query", "classification"
)

Vertex AI Multimodal Embeddings

import litellm

response = litellm.embedding(
    model="multimodalembedding@001",
    input="Sample text"
)

Error Handling

import litellm
from litellm import AuthenticationError, RateLimitError

try:
    response = litellm.embedding(
        model="text-embedding-3-small",
        input="Sample text"
    )
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
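Transient failures such as rate limits are often worth retrying with exponential backoff. A minimal retry wrapper sketch; the `flaky` stub and delay values are illustrative only, and in practice you would catch `RateLimitError` rather than bare `Exception`:

```python
import time
from typing import Callable

def with_retries(fn: Callable[[], object], max_attempts: int = 3,
                 base_delay: float = 0.01):
    """Call `fn`, retrying on exception with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, ...

# Demo with a stub that fails twice, then succeeds:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"

print(with_retries(flaky))  # ok
```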

Supported Providers

LiteLLM supports embeddings from:
  • OpenAI: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
  • Azure OpenAI: All OpenAI embedding models
  • Cohere: embed-english-v3.0, embed-multilingual-v3.0
  • AWS Bedrock: amazon.titan-embed-text-v1, cohere.embed-*
  • Google Vertex AI: textembedding-gecko, text-embedding-004
  • Hugging Face: All embedding models
  • Voyage AI: voyage-2, voyage-code-2
  • Together AI: togethercomputer/m2-bert-80M-*
  • And many more!
See Embedding Providers for the complete list.
