Parameters
The identifier of the multimodal embedding model to use.
Available models:
- voyage-multimodal-3: Third generation multimodal model
Returns
A multimodal embedding model instance that implements the AI SDK’s EmbeddingModelV3 interface.
- The model identifier passed during creation
- The provider identifier: "voyage.multimodal.embedding"
- Maximum number of inputs per API call: 128
- Whether parallel calls are supported: false
Input types
The multimodal embedding model accepts MultimodalEmbeddingInput, which can be:
- Single text: string (a single text string)
- Single image: string (a single image URL or base64-encoded image)
- Multiple texts: string[] (array of texts combined into one embedding)
- Multiple images: string[] (array of images combined into one embedding)
- Multimodal content: { text?: string[], image?: string[] } (mixed content with explicit structure)
- Object format (text): { text: string | string[] } (alternative format for text)
- Object format (image): { image: string | string[] } (alternative format for images)
- Pre-formatted content: { content: ContentItem[] } (pre-formatted content items)
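The accepted shapes can be written down as a TypeScript union. The following is a minimal sketch reconstructed from the list above, not the provider's actual type declaration; the exact definition exported by the package may differ, and the example values are made up for illustration.

```typescript
// Hypothetical reconstruction of the input types described above.
// The real declarations in the provider package may differ.
type ContentItem =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: string }
  | { type: 'image_base64'; image_base64: string };

type MultimodalEmbeddingInput =
  | string // single text, or single image URL / data URI
  | string[] // multiple texts or multiple images, combined into one embedding
  | { text?: string | string[]; image?: string | string[] } // object formats
  | { content: ContentItem[] }; // pre-formatted content items

// One illustrative value per form listed above.
const exampleInputs: MultimodalEmbeddingInput[] = [
  'A red bicycle',
  'https://example.com/image.jpg',
  ['first text', 'second text'],
  ['https://example.com/a.png', 'https://example.com/b.png'],
  { text: ['caption'], image: ['https://example.com/image.jpg'] },
  { text: 'caption' },
  { image: 'https://example.com/image.jpg' },
  { content: [{ type: 'text', text: 'caption' }] },
];
```

Note that a bare string is ambiguous on its own: whether it is treated as text or as an image depends on the image format rules.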
Image formats
Images can be provided as:
- URL: https://example.com/image.jpg (must have an image extension: .jpg, .jpeg, .png, .gif, .bmp, .webp, .svg)
- Base64: data:image/jpeg;base64,/9j/4AAQSkZJRg... (data URI with base64 encoding)
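To make the two image formats concrete, here is a small sketch of how a string could be classified as an image URL, a base64 data URI, or neither. The function name classifyImageInput is hypothetical, and the details (checking the URL path rather than the whole string, ignoring case and query strings) are assumptions; the provider's own detection logic may differ.

```typescript
// Extensions listed in the documentation above.
const IMAGE_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp', '.svg'];

type ImageKind = 'url' | 'base64' | 'not-image';

// Hypothetical helper: classify a string by the image formats described above.
function classifyImageInput(value: string): ImageKind {
  // Data URI with base64 encoding, e.g. data:image/jpeg;base64,...
  if (/^data:image\/[a-z0-9.+-]+;base64,/i.test(value)) {
    return 'base64';
  }
  try {
    const url = new URL(value); // throws for strings that are not absolute URLs
    if (url.protocol !== 'http:' && url.protocol !== 'https:') {
      return 'not-image';
    }
    // Assumption: only the path is checked, case-insensitively.
    const path = url.pathname.toLowerCase();
    return IMAGE_EXTENSIONS.some((ext) => path.endsWith(ext)) ? 'url' : 'not-image';
  } catch {
    return 'not-image'; // plain text
  }
}
```

Under these assumptions, a URL without one of the listed extensions is classified as 'not-image', which is the practical reason to keep the extension on image URLs.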
Content item structure
When using pre-formatted content, each item follows one of these structures:
- { type: 'text', text: string }
- { type: 'image_url', image_url: string }
- { type: 'image_base64', image_base64: string }
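The relationship between the multimodal content form and pre-formatted content items can be sketched as a small conversion. The helper toContentItems below is hypothetical and is not the provider's implementation; in particular, placing text items before image items and choosing image_base64 for any string starting with "data:image/" are assumptions.

```typescript
type ContentItem =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: string }
  | { type: 'image_base64'; image_base64: string };

// Hypothetical helper: expand { text, image } into explicit content items.
function toContentItems(input: { text?: string[]; image?: string[] }): ContentItem[] {
  const items: ContentItem[] = [];
  // Assumption: text items come first, in their original order.
  for (const text of input.text ?? []) {
    items.push({ type: 'text', text });
  }
  for (const image of input.image ?? []) {
    if (image.startsWith('data:image/')) {
      // Data URIs are passed as base64 images.
      items.push({ type: 'image_base64', image_base64: image });
    } else {
      // Everything else is treated as an image URL.
      items.push({ type: 'image_url', image_url: image });
    }
  }
  return items;
}

// Usage: one text and two images become three content items.
const items = toContentItems({
  text: ['A red bicycle'],
  image: ['https://example.com/image.jpg', 'data:image/jpeg;base64,/9j/4AAQSkZJRg'],
});
```

Building the items yourself is mainly useful when you need full control over the order or the type of each item.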
Usage examples
Generate text and image embedding
Combine text and images into a single embedding vector.
Generate multiple multimodal embeddings
Embed multiple multimodal inputs to generate separate embedding vectors.
Use text only
The multimodal model can also be used with text-only inputs.
Combine multiple texts in one embedding
You can combine multiple text strings into a single embedding.
Use image only
The multimodal model can also be used with image-only inputs.
Combine single text with multiple images
You can combine one text string with multiple images in a single embedding.
Use object format for text
You can use the object format with explicit text property.
Use object format for images
You can use the object format with explicit image property.
Use pre-formatted content items
You can use pre-formatted content items with explicit type specifications.
Use provider options
You can customize the embedding behavior using provider-specific options.
Provider options
You can pass Voyage-specific options through the providerOptions parameter:
The input type for the embeddings. Defaults to "query". When specified, Voyage automatically prepends a prompt to your inputs before vectorizing them, creating vectors more tailored for retrieval/search tasks.
- query: Prepends “Represent the query for retrieving supporting documents: ”
- document: Prepends “Represent the document for retrieval: ”
The data type for the resulting output embeddings. If not specified (defaults to null), the embeddings are represented as a list of floating-point numbers. If 'base64', the embeddings are represented as a Base64-encoded NumPy array of single-precision floats.
Whether to truncate the input to fit within the context length. Defaults to true.
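Two of the options above can be illustrated with a short sketch. The first helper shows the effect of the input type prompts; Voyage applies these prompts on its side, so this is only a model of the behavior, not something you need to call. The second helper decodes a 'base64' embedding into numbers, assuming the payload is the raw little-endian float32 buffer of the NumPy array with no file header. Both function names are hypothetical and not part of the provider.

```typescript
type InputType = 'query' | 'document';

// Prompts quoted from the option description above.
const INPUT_TYPE_PROMPTS: Record<InputType, string> = {
  query: 'Represent the query for retrieving supporting documents: ',
  document: 'Represent the document for retrieval: ',
};

// Hypothetical model of what Voyage does server-side for a given input type.
function withInputTypePrompt(text: string, inputType: InputType): string {
  return INPUT_TYPE_PROMPTS[inputType] + text;
}

// Hypothetical decoder for a base64-encoded embedding.
// Assumption: raw float32 values, little-endian, 4 bytes each, no header.
function decodeBase64Embedding(encoded: string): number[] {
  const binary = atob(encoded); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  if (bytes.length % 4 !== 0) {
    throw new Error('Byte length is not a multiple of 4; not a float32 buffer');
  }
  const view = new DataView(bytes.buffer);
  const values: number[] = [];
  for (let offset = 0; offset < bytes.length; offset += 4) {
    values.push(view.getFloat32(offset, true)); // true = little-endian
  }
  return values;
}
```

If the decoded vector length does not match the model's embedding dimension, the layout assumption above is likely wrong for your response and should be checked against the raw payload.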