Overview
Generate embeddings for text using any of LiteLLM’s supported embedding providers. Returns responses in OpenAI format.
Function Signature
def embedding(
    model: str,
    input: Union[str, List[str]] = [],
    # Optional params
    dimensions: Optional[int] = None,
    encoding_format: Optional[str] = None,
    timeout: float = 600,
    # API configuration
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    api_type: Optional[str] = None,
    # LiteLLM specific
    caching: bool = False,
    user: Optional[str] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs
) -> EmbeddingResponse
Parameters
Required Parameters
model
str
required
The embedding model to use. Examples:
text-embedding-3-small (OpenAI)
text-embedding-ada-002 (OpenAI)
amazon.titan-embed-text-v1 (Bedrock)
textembedding-gecko@003 (Vertex AI)
embed-english-v3.0 (Cohere)
input
Union[str, List[str]]
required
Input text to embed. Can be a single string or an array of strings.
# Single string
input = "The quick brown fox"
# Multiple strings
input = ["First text", "Second text", "Third text"]
Optional Parameters
dimensions
Optional[int]
Number of dimensions for the output embeddings. Only supported by some models (e.g., text-embedding-3 and later).
dimensions = 512  # Reduce from the default 1536
encoding_format
Optional[str]
Format to return embeddings in. Options:
"float": Array of floats (default)
"base64": Base64-encoded string
timeout
float
Request timeout in seconds (default: 600, i.e. 10 minutes).
user
Optional[str]
A unique identifier for your end user, for abuse monitoring.
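When encoding_format="base64" is requested, OpenAI-compatible endpoints return each embedding as a base64-encoded buffer of little-endian float32 values rather than a JSON array. A minimal decoding sketch (the helper name is illustrative, not part of LiteLLM):

```python
import base64
import struct

def decode_base64_embedding(data: str) -> list:
    # Unpack a base64 string into little-endian float32 values,
    # the packing used by OpenAI-compatible embedding endpoints.
    raw = base64.b64decode(data)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a synthetic vector (no API call is made)
vec = [0.1, -0.2, 0.3]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_base64_embedding(encoded)
```

The decoded values match the originals to float32 precision.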
API Configuration
api_key
Optional[str]
API key for the provider. If not provided, LiteLLM falls back to environment variables.
api_base
Optional[str]
Base URL for the API endpoint.
api_version
Optional[str]
API version to use (provider-specific).
api_type
Optional[str]
API type (e.g., "azure" for Azure OpenAI).
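When api_key is omitted, credentials are read from environment variables. The variable names below follow common provider conventions; check each provider's page for the exact names it expects:

```python
import os

# Placeholder values shown for illustration only
os.environ["OPENAI_API_KEY"] = "sk-..."           # OpenAI (and azure/ models use AZURE_API_KEY)
os.environ["COHERE_API_KEY"] = "your-cohere-key"  # Cohere
os.environ["AWS_REGION_NAME"] = "us-east-1"       # Bedrock region
```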
LiteLLM Specific
caching
bool
Whether to cache the response (default False); requires a configured litellm.cache.
custom_llm_provider
Optional[str]
Override automatic provider detection. Example: custom_llm_provider="bedrock"
metadata
Optional[dict]
Additional metadata to tag the request.
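A sketch of enabling the caching flag, assuming LiteLLM's in-memory Cache from litellm.caching (verify the import path against the caching docs for your installed version):

```python
import litellm
from litellm.caching import Cache

# Configure an in-memory cache; identical embedding requests made with
# caching=True are then served from the cache instead of the provider.
litellm.cache = Cache()

# response = litellm.embedding(
#     model="text-embedding-3-small",
#     input="Cached text",
#     caching=True,
# )
```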
Response
EmbeddingResponse
object
Object type, always "list".
data
List of embedding objects. Each object contains:
object: Object type, always "embedding".
embedding: The embedding vector as an array of floats.
index: Index of the embedding in the list.
model
Model used to generate the embeddings.
usage
Token usage information, including the number of tokens in the input.
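The fields above follow the OpenAI embedding response format. A static illustration of the shape (made-up values, no API call):

```python
# Illustrative OpenAI-format embedding response
response_shape = {
    "object": "list",                          # always "list"
    "data": [
        {
            "object": "embedding",             # always "embedding"
            "embedding": [0.12, -0.45, 0.78],  # the vector
            "index": 0,                        # position in the input batch
        }
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 4, "total_tokens": 4},
}
```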
Usage Examples
Basic Embedding
import litellm

response = litellm.embedding(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)
print(response.data[0].embedding)       # [0.123, -0.456, ...]
print(len(response.data[0].embedding))  # 1536
Batch Embeddings
import litellm

texts = [
    "First document to embed",
    "Second document to embed",
    "Third document to embed"
]
response = litellm.embedding(
    model="text-embedding-3-small",
    input=texts
)
for i, embedding_obj in enumerate(response.data):
    print(f"Document {i}: {len(embedding_obj.embedding)} dimensions")
Async Embeddings
import litellm
import asyncio

async def main():
    response = await litellm.aembedding(
        model="text-embedding-3-small",
        input="Async embedding example"
    )
    print(response.data[0].embedding)

asyncio.run(main())
Custom Dimensions
import litellm

# Reduce embedding dimensions for smaller storage
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Sample text",
    dimensions=512  # Instead of the default 1536
)
print(len(response.data[0].embedding))  # 512
Multiple Providers
import litellm

# OpenAI
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Hello world"
)

# Cohere
response = litellm.embedding(
    model="embed-english-v3.0",
    input="Hello world"
)

# AWS Bedrock
response = litellm.embedding(
    model="amazon.titan-embed-text-v1",
    input="Hello world"
)

# Azure OpenAI
response = litellm.embedding(
    model="azure/text-embedding-ada-002",
    input="Hello world",
    api_key="your-azure-key",
    api_base="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-01"
)

# Vertex AI
response = litellm.embedding(
    model="textembedding-gecko@003",
    input="Hello world"
)
Semantic Search Example
import litellm
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents to search
documents = [
    "Python is a programming language",
    "JavaScript is used for web development",
    "Machine learning is a subset of AI"
]

# Get embeddings for all documents
response = litellm.embedding(
    model="text-embedding-3-small",
    input=documents
)
doc_embeddings = [item.embedding for item in response.data]

# Query
query = "What is Python?"
query_response = litellm.embedding(
    model="text-embedding-3-small",
    input=query
)
query_embedding = query_response.data[0].embedding

# Find the most similar document
similarities = [
    cosine_similarity(query_embedding, doc_emb)
    for doc_emb in doc_embeddings
]
most_similar_idx = np.argmax(similarities)
print(f"Most similar: {documents[most_similar_idx]}")
print(f"Similarity: {similarities[most_similar_idx]:.4f}")
Provider-Specific Examples
Cohere Input Types
import litellm

response = litellm.embedding(
    model="embed-english-v3.0",
    input="Sample text",
    input_type="search_document"  # or "search_query", "classification"
)
Vertex AI Multimodal Embeddings
import litellm

response = litellm.embedding(
    model="multimodalembedding@001",
    input="Sample text"
)
Error Handling
import litellm
from litellm import AuthenticationError, RateLimitError

try:
    response = litellm.embedding(
        model="text-embedding-3-small",
        input="Sample text"
    )
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
Supported Providers
LiteLLM supports embeddings from:
OpenAI : text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
Azure OpenAI : All OpenAI embedding models
Cohere : embed-english-v3.0, embed-multilingual-v3.0
AWS Bedrock : amazon.titan-embed-text-v1, cohere.embed-*
Google Vertex AI : textembedding-gecko, text-embedding-004
Hugging Face : All embedding models
Voyage AI : voyage-2, voyage-code-2
Together AI : togethercomputer/m2-bert-80M-*
And many more!
See Embedding Providers for the complete list.