Method Signature

client.embeddings.create(
    input: Union[str, List[str], List[int], List[List[int]]],
    model: Union[str, EmbeddingModel],
    dimensions: Optional[int] = None,
    encoding_format: Literal["float", "base64"] = "base64",
    user: Optional[str] = None
) -> CreateEmbeddingResponse

Parameters

input
Union[str, List[str], List[int], List[List[int]]]
required
Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or an array of token arrays.

Constraints:
  • Must not exceed the max input tokens for the model (8,192 tokens for all embedding models)
  • Cannot be an empty string
  • Any array must contain 2048 or fewer entries (i.e., at most 2048 inputs per request)
  • Maximum of 300,000 tokens summed across all inputs in a single request
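Catching a constraint violation client-side avoids a wasted API round trip. A minimal sketch of such a pre-check (the helper name `validate_embedding_input` is hypothetical, not part of the SDK; the token-count limits additionally require a tokenizer such as tiktoken and are not checked here):

```python
from typing import List, Union

MAX_ARRAY_ENTRIES = 2048  # maximum number of inputs per request

def validate_embedding_input(batch: Union[str, List[str]]) -> None:
    """Raise ValueError if the input violates the documented constraints.

    Only the structural constraints are checked; per-input and per-request
    token limits require tokenizing the text first.
    """
    items = [batch] if isinstance(batch, str) else batch
    if len(items) > MAX_ARRAY_ENTRIES:
        raise ValueError(f"Batch has {len(items)} entries; max is {MAX_ARRAY_ENTRIES}")
    for i, text in enumerate(items):
        if text == "":
            raise ValueError(f"Input {i} is an empty string")

validate_embedding_input(["First document", "Second document"])  # passes silently
```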
model
Union[str, EmbeddingModel]
required
ID of the model to use. You can use the List models API to see all available models, or see the Model overview for descriptions.

Popular models:
  • text-embedding-3-small
  • text-embedding-3-large
  • text-embedding-ada-002
dimensions
int
The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.
encoding_format
Literal['float', 'base64']
default:"base64"
The format to return the embeddings in. Can be either float or base64. Note: the SDK automatically decodes base64 embeddings to float arrays for convenience.
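For illustration, the base64 payload is a base64-encoded array of packed little-endian float32 values, so the decoding the SDK performs can be sketched in a few lines of standard-library Python (the helper name `decode_base64_embedding` is ours, not the SDK's):

```python
import base64
import struct

def decode_base64_embedding(data: str) -> list:
    """Decode a base64 embedding payload into a list of floats.

    Assumes the payload is a packed array of little-endian float32
    values, which is how the API encodes base64 embeddings.
    """
    raw = base64.b64decode(data)
    count = len(raw) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip check with values exactly representable in float32:
payload = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode()
print(decode_base64_embedding(payload))  # [0.5, -1.0, 2.0]
```

In practice you rarely need this, since the SDK hands you float lists either way; it is mainly useful when calling the HTTP API directly.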
user
str
A unique identifier representing your end-user, which can help OpenAI monitor and detect abuse.

Response

Returns a CreateEmbeddingResponse object:
class CreateEmbeddingResponse(BaseModel):
    data: List[Embedding]  # The list of embeddings generated by the model
    model: str  # The name of the model used to generate the embedding
    object: Literal["list"]  # Always "list"
    usage: Usage  # Token usage information

class Embedding(BaseModel):
    embedding: List[float]  # The embedding vector
    index: int  # The index of the embedding in the list
    object: Literal["embedding"]  # Always "embedding"

class Usage(BaseModel):
    prompt_tokens: int  # The number of tokens used by the prompt
    total_tokens: int  # The total number of tokens used

Examples

Single Text Embedding

from openai import OpenAI

client = OpenAI()

response = client.embeddings.create(
    input="The quick brown fox jumps over the lazy dog",
    model="text-embedding-3-small"
)

print(response.data[0].embedding)
print(f"Tokens used: {response.usage.total_tokens}")

Multiple Texts in One Request

response = client.embeddings.create(
    input=[
        "First document to embed",
        "Second document to embed",
        "Third document to embed"
    ],
    model="text-embedding-3-small"
)

for i, embedding_obj in enumerate(response.data):
    print(f"Embedding {i}: {len(embedding_obj.embedding)} dimensions")
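Once you have several embeddings back, the usual next step is comparing them. A minimal, dependency-free cosine-similarity sketch (since OpenAI embeddings are normalized to length 1, the dot product alone would give the same result, but the full formula is safe for any vectors):

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embedding vectors (range [-1, 1])."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

You would call this on pairs such as `response.data[0].embedding` and `response.data[1].embedding` from the request above.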

Using Custom Dimensions

response = client.embeddings.create(
    input="Text to embed",
    model="text-embedding-3-large",
    dimensions=256  # Reduce from default 3072 to 256
)

print(f"Embedding dimensions: {len(response.data[0].embedding)}")

Embedding Tokens Directly

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
tokens = encoding.encode("Text to embed")

response = client.embeddings.create(
    input=tokens,
    model="text-embedding-3-small"
)

Async Usage

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def main() -> None:
    response = await client.embeddings.create(
        input="Text to embed",
        model="text-embedding-3-small"
    )
    print(len(response.data[0].embedding))

asyncio.run(main())

Notes

  • The Python SDK requests embeddings in base64 format by default and decodes them to float lists, which reduces response payload size
  • If NumPy is installed, the SDK uses it for faster base64 decoding
  • See the Embeddings Guide for best practices and use cases
  • Use tiktoken to count tokens before sending requests