Image embeddings convert visual content into dense vector representations that capture semantic information. You can use Voyage AI’s multimodal models to power visual search, image classification, and content-based recommendations.

Overview

Voyage AI’s voyage-multimodal-3 model supports generating embeddings from:
  • Single images
  • Multiple images combined into one embedding
  • Images in various formats (URLs and base64)
Image embeddings use the same multimodal model as text embeddings. Use imageEmbeddingModel() to create a model instance specifically for image inputs.

Basic usage

Generate an embedding for a single image:
import { createVoyage } from 'voyage-ai-provider';
import { embed } from 'ai';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embedding } = await embed({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  value: 'https://i.ibb.co/r5w8hG8/beach2.jpg',
});

console.log(`Generated embedding with ${embedding.length} dimensions`);

Image formats

You can provide images in two formats: URLs or base64-encoded strings.

Using image URLs

Provide direct URLs to publicly accessible images:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    'https://i.ibb.co/nQNGqL0/beach1.jpg',
    'https://i.ibb.co/r5w8hG8/beach2.jpg',
  ],
});

console.log(`Generated ${embeddings.length} embeddings`);
Using URLs is simpler and reduces payload size, but requires images to be publicly accessible.
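
For images that are not publicly reachable, base64 is the other supported format. The following is a minimal sketch of encoding raw image bytes. It assumes the provider accepts the data URL form (`data:<mime>;base64,<data>`) that the Voyage API documents for base64 images. `toImageDataUrl` is a hypothetical helper, not part of `voyage-ai-provider`.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: wraps raw image bytes in a base64 data URL.
// Assumption: the provider accepts data URLs wherever it accepts image URLs.
function toImageDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Two bytes standing in for real image data read from disk or an upload.
const sample = toImageDataUrl(new Uint8Array([72, 105]), 'image/png');
console.log(sample); // data:image/png;base64,SGk=
```

The resulting string would be passed as `value` (or inside `values`) in the same position as a URL.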

Multiple images per embedding

Combine multiple images into a single embedding to represent related visual content:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';
import type { ImageEmbeddingInput } from 'voyage-ai-provider';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany<ImageEmbeddingInput>({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    {
      image: [
        'https://i.ibb.co/nQNGqL0/beach1.jpg',
        'https://i.ibb.co/r5w8hG8/beach2.jpg',
      ],
    },
  ],
});

console.log(`Combined ${embeddings[0].length} dimension embedding`);
Combining multiple images creates a single embedding that represents all images together, useful for collections or multi-view representations.
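
When the images come from a flat list, they first need to be grouped so that each collection becomes one input. The sketch below shows one way to do that. `ProductPhoto` and `groupByProduct` are hypothetical names for illustration, and `GroupedImageInput` is a local mirror of the `{ image: [...] }` shape shown above rather than the provider's own type.

```typescript
// Local mirror of the multi-image object shape shown above; the real type
// is ImageEmbeddingInput from 'voyage-ai-provider'.
type GroupedImageInput = { image: string[] };

// Hypothetical record shape, for illustration only.
interface ProductPhoto {
  productId: string;
  url: string;
}

// Groups photo URLs by product so each product yields one combined input.
function groupByProduct(photos: ProductPhoto[]): Map<string, GroupedImageInput> {
  const groups = new Map<string, GroupedImageInput>();
  for (const { productId, url } of photos) {
    const group = groups.get(productId) ?? { image: [] };
    group.image.push(url);
    groups.set(productId, group);
  }
  return groups;
}

const grouped = groupByProduct([
  { productId: 'beach', url: 'https://i.ibb.co/nQNGqL0/beach1.jpg' },
  { productId: 'beach', url: 'https://i.ibb.co/r5w8hG8/beach2.jpg' },
]);
console.log(grouped.get('beach')?.image.length); // 2
```

A `Map` iterates in insertion order, so `[...grouped.values()]` can serve as `values` while `[...grouped.keys()]` labels the returned embeddings by index.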

Batch processing

Generate embeddings for multiple images efficiently:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';
import type { ImageEmbeddingInput } from 'voyage-ai-provider';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany<ImageEmbeddingInput>({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    { image: 'https://i.ibb.co/nQNGqL0/beach1.jpg' },
    { image: 'https://i.ibb.co/r5w8hG8/beach2.jpg' },
  ],
});

for (const [index, embedding] of embeddings.entries()) {
  console.log(`Image ${index + 1}: ${embedding.length} dimensions`);
}

Input format options

The image embedding model accepts several input formats:
1. Simple string format
Pass image URLs or base64 strings directly:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    'https://i.ibb.co/nQNGqL0/beach1.jpg',
    'https://i.ibb.co/r5w8hG8/beach2.jpg',
  ],
});

2. Object format with single image
Use the object format for clarity:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';
import type { ImageEmbeddingInput } from 'voyage-ai-provider';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany<ImageEmbeddingInput>({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    { image: 'https://i.ibb.co/nQNGqL0/beach1.jpg' },
    { image: 'https://i.ibb.co/r5w8hG8/beach2.jpg' },
  ],
});

3. Object format with multiple images
Combine multiple images into one embedding:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';
import type { ImageEmbeddingInput } from 'voyage-ai-provider';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embeddings } = await embedMany<ImageEmbeddingInput>({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    {
      image: [
        'https://i.ibb.co/nQNGqL0/beach1.jpg',
        'https://i.ibb.co/r5w8hG8/beach2.jpg',
      ],
    },
  ],
});
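
Application code that accepts all three shapes sometimes needs to treat them uniformly, for example to count how many images sit behind each embedding. The sketch below normalizes any of the formats above into a flat array. `ImageInput` and `imagesOf` are hypothetical local names, not exports of the provider.

```typescript
// Local mirror of the three shapes listed above: plain string,
// single-image object, and multi-image object.
type ImageInput = string | { image: string | string[] };

// Returns the images behind one input as a flat array.
function imagesOf(input: ImageInput): string[] {
  if (typeof input === 'string') return [input];
  return Array.isArray(input.image) ? input.image : [input.image];
}

const inputs: ImageInput[] = [
  'https://i.ibb.co/nQNGqL0/beach1.jpg',
  { image: 'https://i.ibb.co/r5w8hG8/beach2.jpg' },
  {
    image: [
      'https://i.ibb.co/nQNGqL0/beach1.jpg',
      'https://i.ibb.co/r5w8hG8/beach2.jpg',
    ],
  },
];

console.log(inputs.map((input) => imagesOf(input).length)); // [1, 1, 2]
```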

Configuration options

Customize image embedding behavior with provider options:
import { createVoyage } from 'voyage-ai-provider';
import { embed } from 'ai';
import type { VoyageMultimodalEmbeddingOptions } from 'voyage-ai-provider';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const { embedding } = await embed({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  value: 'https://i.ibb.co/r5w8hG8/beach2.jpg',
  providerOptions: {
    voyage: {
      inputType: 'document',
      truncation: true,
    } satisfies VoyageMultimodalEmbeddingOptions,
  },
});

Available options

inputType
Type of the input. Defaults to "query". When specified, Voyage automatically prepends a prompt before vectorizing:
  • query - “Represent the query for retrieving supporting documents: ”
  • document - “Represent the document for retrieval: ”
Use this for retrieval/search tasks to optimize embedding quality.

Output data type
The data type for output embeddings. Defaults to null.
  • null (default) - Embeddings as a list of floating-point numbers
  • base64 - Base64-encoded NumPy array of single-precision floats
See output data types FAQ for details.

truncation
Whether to truncate inputs to fit within the context length. Defaults to true. Set to false to raise an error instead of truncating when inputs exceed limits.
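
In a retrieval setup, the input type usually differs between the indexing step and the search step. The sketch below keeps that choice in one place. `optionsFor` and the `'index' | 'search'` stage names are hypothetical, and `ImageEmbeddingOptions` is a local mirror of only the two options used in the example above, not the provider's full type.

```typescript
// Local mirror of the two options used in the configuration example.
// The real type is VoyageMultimodalEmbeddingOptions from 'voyage-ai-provider'.
type ImageEmbeddingOptions = {
  inputType: 'query' | 'document';
  truncation: boolean;
};

// Hypothetical helper: picks provider options per pipeline stage.
function optionsFor(stage: 'index' | 'search'): { voyage: ImageEmbeddingOptions } {
  return {
    voyage: {
      // Stored images act as documents; the image being looked up is the query.
      inputType: stage === 'index' ? 'document' : 'query',
      // false makes oversized inputs fail loudly instead of being truncated.
      truncation: false,
    },
  };
}

console.log(optionsFor('index').voyage.inputType); // document
console.log(optionsFor('search').voyage.inputType); // query
```

The returned object has the same shape as the `providerOptions` value in the configuration example.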

Use cases

Visual search

Find similar images by comparing embeddings

Image classification

Categorize images based on semantic content

Content moderation

Detect inappropriate or harmful visual content

Recommendation systems

Suggest related images based on visual similarity
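
Visual search and recommendations both reduce to comparing embedding vectors. The sketch below ranks candidates by cosine similarity. It uses toy two-dimensional vectors in place of real embeddings, and `cosine` and `rankBySimilarity` are illustrative helpers (the `ai` package also exports a `cosineSimilarity` function that could replace the local one).

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the indices of `candidates`, most similar to `query` first.
function rankBySimilarity(query: number[], candidates: number[][]): number[] {
  return candidates
    .map((vector, index) => ({ index, score: cosine(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .map(({ index }) => index);
}

// Toy vectors standing in for real image embeddings.
const order = rankBySimilarity([1, 0], [[0, 1], [1, 0.1], [-1, 0]]);
console.log(order); // [1, 0, 2]
```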

Working with usage data

The embedding response includes token usage for images:
import { createVoyage } from 'voyage-ai-provider';
import { embedMany } from 'ai';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

const result = await embedMany({
  model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
  values: [
    'https://i.ibb.co/nQNGqL0/beach1.jpg',
    'https://i.ibb.co/r5w8hG8/beach2.jpg',
  ],
});

console.log(`Generated ${result.embeddings.length} embeddings`);
console.log(`Tokens used: ${result.usage?.tokens}`);
Image tokens are calculated based on pixel count and processing requirements.
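
A rough estimate can help when budgeting before a request is sent. The sketch below assumes a ratio of 560 pixels per token, which comes from Voyage's multimodal documentation rather than from this page, and assumes rounding up. Treat the result as an approximation and compare it with `result.usage?.tokens`. `estimateImageTokens` is a hypothetical helper.

```typescript
// ASSUMPTION: 560 pixels per token, taken from Voyage's multimodal docs,
// not from this page. Verify against the reported usage before relying on it.
const PIXELS_PER_TOKEN = 560;

// Approximate token count for one image of the given pixel dimensions.
function estimateImageTokens(width: number, height: number): number {
  return Math.ceil((width * height) / PIXELS_PER_TOKEN);
}

console.log(estimateImageTokens(1000, 1000)); // 1786
```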

Error handling

Handle errors when processing images:
import { createVoyage } from 'voyage-ai-provider';
import { embed } from 'ai';

const voyage = createVoyage({
  apiKey: process.env.VOYAGE_API_KEY,
});

try {
  const { embedding } = await embed({
    model: voyage.imageEmbeddingModel('voyage-multimodal-3'),
    value: 'https://example.com/image.jpg',
  });
  
  console.log('Image embedding generated successfully');
} catch (error) {
  console.error('Failed to generate image embedding:', error);
}
The maximum batch size is 128 embeddings per call. For large image collections, split into multiple requests.
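
Splitting a large collection can be sketched as a small chunking helper. `chunk` and `MAX_BATCH_SIZE` are illustrative names, and the `example.com` URLs are placeholders.

```typescript
// Limit stated above, kept as a constant so it is easy to adjust.
const MAX_BATCH_SIZE = 128;

// Splits `items` into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Placeholder URLs standing in for a real image collection.
const urls = Array.from({ length: 300 }, (_, i) => `https://example.com/image-${i}.jpg`);
const batches = chunk(urls);
console.log(batches.map((batch) => batch.length)); // [128, 128, 44]
```

Each batch can then be passed as `values` to a separate `embedMany` call, with the results concatenated in order.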

Best practices

1. Optimize image size
Resize large images before encoding to reduce processing time and costs while maintaining quality.

2. Use URLs when possible
Prefer image URLs over base64 encoding to reduce request payload size and improve performance.

3. Batch processing
Use embedMany to process multiple images in a single API call for better efficiency.

4. Handle different formats
Support both URL and base64 formats in your application to accommodate different use cases.
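
Supporting both formats is easier when inputs are checked before they reach the API. The sketch below classifies a string as a URL, a base64 image, or neither. It assumes base64 images arrive as data URLs with an image media type. `classifyImageSource` is a hypothetical helper, and the check is deliberately shallow (it does not decode or fetch anything).

```typescript
type ImageSourceKind = 'url' | 'base64' | 'invalid';

// Classifies a string before it is sent for embedding.
function classifyImageSource(value: string): ImageSourceKind {
  // Assumption: base64 images are data URLs such as data:image/png;base64,...
  if (/^data:image\/[a-z0-9.+-]+;base64,[A-Za-z0-9+/]+=*$/i.test(value)) {
    return 'base64';
  }
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:' ? 'url' : 'invalid';
  } catch {
    // new URL() throws on strings that are not absolute URLs.
    return 'invalid';
  }
}

console.log(classifyImageSource('https://example.com/image.jpg')); // url
console.log(classifyImageSource('data:image/png;base64,SGk=')); // base64
console.log(classifyImageSource('beach.jpg')); // invalid
```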

Next steps

Multimodal embeddings

Combine text and images in a single embedding

Text embeddings

Learn about text embedding models

Configuration

Customize provider settings

API Reference

Explore the complete API
