
Overview

OmniSearches supports multimodal search, allowing you to upload images alongside your text queries. This feature enables visual search scenarios where images provide crucial context that text alone cannot capture.
Multimodal search is available via the POST /api/search endpoint and through the web interface using the paperclip icon.

How It Works

The multimodal search feature combines image and text inputs to provide more contextual results:
  1. Image Upload: Upload up to 4 images (JPEG, PNG, GIF, or WebP)
  2. Context Integration: Images are encoded and sent to Gemini 2.0 Flash
  3. Combined Analysis: The AI analyzes both visual and textual information
  4. Enhanced Results: Receive answers that consider both image context and your query
Maximum file limit: 4 images per search. Each image is base64-encoded and sent with your query.
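The limits above can be enforced client-side before sending a request. A minimal pre-flight check, assuming the documented constraints (the function and variable names are illustrative, not part of the OmniSearches API):

```typescript
// Allowed MIME types, per the supported-formats list below.
const ALLOWED_MIME_TYPES = new Set([
  'image/jpeg',
  'image/png',
  'image/gif',
  'image/webp',
]);

// Returns an error message if the batch violates the documented limits,
// or null if the batch is valid.
function validateImages(images: { data: string; mimeType: string }[]): string | null {
  if (images.length > 4) {
    return 'Maximum 4 images allowed';
  }
  for (const img of images) {
    if (!ALLOWED_MIME_TYPES.has(img.mimeType)) {
      return `Unsupported format: ${img.mimeType}`;
    }
  }
  return null;
}
```

Running this before encoding saves a round trip when a batch would be rejected anyway.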

Supported Image Formats

  • JPEG (.jpg, .jpeg) - Standard photo format
  • PNG (.png) - Lossless format with transparency
  • GIF (.gif) - Animated or static graphics
  • WebP (.webp) - Modern web format
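If you are building the request outside the browser, you need to derive the MIME type yourself. A sketch mapping the extensions above to their MIME types (the helper name is illustrative):

```typescript
// Mapping from the file extensions listed above to their MIME types.
const EXT_TO_MIME: Record<string, string> = {
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  png: 'image/png',
  gif: 'image/gif',
  webp: 'image/webp',
};

// Returns the MIME type for a filename, or undefined for unsupported formats.
function mimeFromFilename(name: string): string | undefined {
  const ext = name.split('.').pop()?.toLowerCase();
  return ext ? EXT_TO_MIME[ext] : undefined;
}
```

In the browser, `File.type` already provides the MIME type, so a mapping like this is mainly useful in Node.js scripts.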

Usage Examples

Via Web Interface

  1. Navigate to the OmniSearches homepage
  2. Click the paperclip icon (📎) in the search bar
  3. Select up to 4 images from your device
  4. Preview your uploaded images
  5. Enter your text query
  6. Click Search to get results
You can remove individual images from the preview by clicking the X button on each image thumbnail.

Via API

Send images as base64-encoded strings in the user_images array:
POST /api/search
const response = await fetch('/api/search', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    query: 'What type of flower is this?',
    mode: 'default',
    user_images: [
      {
        data: 'base64_encoded_image_data_here',
        mimeType: 'image/jpeg'
      }
    ]
  })
});
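The same request body can be assembled with a small helper. A sketch using the endpoint and field names shown above (the helper itself is hypothetical, not part of the API):

```typescript
interface UserImage {
  data: string;     // base64-encoded image bytes, without the data URL prefix
  mimeType: string; // e.g. 'image/jpeg'
}

// Builds the JSON body for POST /api/search as documented above.
function buildSearchBody(query: string, userImages: UserImage[], mode = 'default'): string {
  return JSON.stringify({ query, mode, user_images: userImages });
}
```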

Using Fetch with File Input

Upload and Search
async function searchWithImages(query: string, files: File[]) {
  // Convert files to base64
  const imagePromises = files.map(file => {
    return new Promise<{ data: string; mimeType: string }>((resolve, reject) => {
      const reader = new FileReader();
      reader.onload = (e) => {
        const base64 = e.target?.result as string;
        // Remove data URL prefix
        const data = base64.split(',')[1];
        resolve({
          data,
          mimeType: file.type
        });
      };
      reader.onerror = reject;
      reader.readAsDataURL(file);
    });
  });

  const user_images = await Promise.all(imagePromises);

  const response = await fetch('/api/search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      query,
      mode: 'default',
      user_images
    })
  });

  return response.json();
}

Use Cases

  • Upload photos of plants, animals, objects, or landmarks to identify them and learn more. Example: “What species of bird is this?” + image of a bird
  • Search for products similar to ones you’ve photographed or saved. Example: “Where can I buy this style of furniture?” + image of furniture
  • Upload screenshots, diagrams, or documents for analysis and explanation. Example: “Explain this architecture diagram” + diagram image
  • Share error messages, UI issues, or hardware problems visually. Example: “How do I fix this error?” + screenshot of error
  • Analyze artwork, design patterns, or creative works for inspiration. Example: “What art movement is this?” + image of painting

Image Processing

When you upload images, the following process occurs:

Client-Side Processing

Images are processed entirely in the browser:
client/src/hooks/useImageUpload.ts
import { useImageStore } from '@/store/imageStore';

export function useImageUpload() {
  const { addImage } = useImageStore();

  const handleImageUpload = async (files: File[]) => {
    // Maximum 4 images
    if (files.length > 4) {
      throw new Error('Maximum 4 images allowed');
    }

    // Convert to base64
    for (const file of files) {
      const reader = new FileReader();
      reader.onload = (e) => {
        const base64 = e.target?.result as string;
        addImage({
          id: Math.random().toString(36),
          data: base64.split(',')[1],
          mimeType: file.type
        });
      };
      reader.readAsDataURL(file);
    }
  };

  return { handleImageUpload };
}
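The `base64.split(',')[1]` step above strips the data URL prefix (`data:<mime>;base64,`) that `FileReader.readAsDataURL` prepends. Pulled out as a standalone helper for clarity (the function name is illustrative):

```typescript
// Removes the 'data:<mime>;base64,' prefix produced by FileReader.readAsDataURL,
// leaving only the raw base64 payload the API expects in `data`.
function stripDataUrlPrefix(dataUrl: string): string {
  const comma = dataUrl.indexOf(',');
  return comma === -1 ? dataUrl : dataUrl.slice(comma + 1);
}
```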

Server-Side Integration

The server includes images in the chat history when creating a Gemini session:
server/routes.ts (Line 398-423)
if (user_images && user_images.length > 0) {
  chat = model.startChat({
    tools: [{ google_search: {} }],
    history: [
      {
        role: "user",
        parts: [
          ...user_images.map((img: UserImage) => ({
            inlineData: {
              data: img.data,
              mimeType: img.mimeType
            }
          })),
          { text: 'Use uploaded images to search for information' }
        ]
      }
    ]
  });
}

Limitations

  • Maximum 4 images per search
  • Each image must be under 10MB
  • Supported formats: JPEG, PNG, GIF, WebP
  • Images are not persisted server-side
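The 10MB per-image ceiling can also be checked client-side before base64 encoding (a minimal sketch; the constant and function names are illustrative):

```typescript
// Per-image size ceiling from the limitations above: 10 MB.
const MAX_IMAGE_BYTES = 10 * 1024 * 1024;

// Accepts browser File objects or anything with a numeric `size` in bytes.
function withinSizeLimit(file: { size: number }): boolean {
  return file.size <= MAX_IMAGE_BYTES;
}
```

Note that base64 encoding inflates the payload by roughly a third, so checking the raw file size before encoding is the cheaper place to enforce this.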

Best Practices

Clear Images

Use high-quality, well-lit images for best results

Relevant Context

Ensure images directly relate to your text query

Descriptive Queries

Combine images with clear, descriptive text

Appropriate Size

Resize large images to improve upload speed

Privacy & Security

Images are processed in real-time and are not stored on the server. All image data is:
  • Transmitted securely via HTTPS
  • Processed only for the current search session
  • Discarded after the response is generated
  • Not logged or saved to disk

Troubleshooting

  • Too many images: you can only upload 4 images at a time. Remove some images and try again.
  • Unsupported file type: ensure your files are JPEG, PNG, GIF, or WebP format. Other file types are not supported.
  • Results don’t reflect the image: check that your query references the images. Try rephrasing like “Based on this image…” or “What is shown in this photo?”
  • Slow uploads: large images take longer to upload. Consider resizing images to under 2MB for faster uploads.
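For the resize tip above, the target dimensions can be computed with a small helper before re-encoding on a canvas (a sketch; capping the longest side at around 1600px usually keeps a JPEG well under 2MB, though the exact size depends on content and quality settings):

```typescript
// Computes downscaled dimensions so the longest side is at most maxDim,
// preserving aspect ratio. Never upscales smaller images.
function fitWithin(width: number, height: number, maxDim = 1600): [number, number] {
  const scale = Math.min(1, maxDim / Math.max(width, height));
  return [Math.round(width * scale), Math.round(height * scale)];
}
```

The resulting dimensions can be fed to a canvas `drawImage` call and re-encoded with `canvas.toBlob` before upload.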

Search Modes

Choose the right mode for your multimodal search

API Reference

Complete API documentation for image uploads
