Overview
OmniSearches supports multimodal search, allowing you to upload images alongside your text queries. This feature enables visual search scenarios where images provide crucial context that text alone cannot capture.
Multimodal search is available via the POST /api/search endpoint and through the web interface using the paperclip icon.
How It Works
The multimodal search feature combines image and text inputs to provide more contextual results:
- Image Upload: Upload up to 4 images (JPEG, PNG, GIF, or WebP)
- Context Integration: Images are encoded and sent to Gemini 2.0 Flash
- Combined Analysis: The AI analyzes both visual and textual information
- Enhanced Results: Receive answers that consider both image context and your query
Maximum file limit: 4 images per search. Each image is base64-encoded and sent with your query.
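The limits above (at most 4 images, four accepted formats) can be checked before anything is uploaded. The following is a minimal sketch, not the OmniSearches implementation: the function name `validateImages`, the `ImageLike` type, and the wording of the messages are hypothetical, and the MIME types are the standard ones for the listed formats.

```typescript
// Limits taken from this page: at most 4 images, in one of four formats.
const MAX_IMAGES = 4;
const ALLOWED_MIME_TYPES = ["image/jpeg", "image/png", "image/gif", "image/webp"];

// Minimal structural type (hypothetical); a browser File object satisfies it.
interface ImageLike {
  name: string;
  type: string; // MIME type, e.g. "image/png"
}

// Returns a list of human-readable problems.
// An empty list means the selection passes both checks.
function validateImages(files: ImageLike[]): string[] {
  const problems: string[] = [];
  if (files.length > MAX_IMAGES) {
    problems.push(`Too many files: ${files.length} selected, maximum is ${MAX_IMAGES}`);
  }
  for (const file of files) {
    if (ALLOWED_MIME_TYPES.indexOf(file.type) === -1) {
      problems.push(`Only image files are allowed: ${file.name}`);
    }
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a UI show every rejected file at once.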
Supported Image Formats
- JPEG (.jpg, .jpeg): Standard photo format
- PNG (.png): Lossless format with transparency
- GIF (.gif): Animated or static graphics
- WebP (.webp): Modern web format
Usage Examples
Via Web Interface
- Navigate to the OmniSearches homepage
- Click the paperclip icon (📎) in the search bar
- Select up to 4 images from your device
- Preview your uploaded images
- Enter your text query
- Click Search to get results
Via API
Send images as base64-encoded strings in the user_images array:
POST /api/search
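The original request example is not reproduced on this page, so here is a hedged sketch of how a request body might be assembled. Only the `user_images` field name comes from the text above. The `query` field name is a hypothetical placeholder, and it is an assumption that the endpoint accepts raw base64 strings rather than full data URLs. The encoder is written out by hand so the sketch does not depend on any particular runtime.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard padded base64 (RFC 4648) over raw bytes.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    // Bytes past the end of the input are treated as zero and padded with "=".
    const b0 = bytes[i] || 0;
    const b1 = bytes[i + 1] || 0;
    const b2 = bytes[i + 2] || 0;
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += i + 1 < bytes.length
      ? BASE64_ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6))
      : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(b2 & 0x3f) : "=";
  }
  return out;
}

// `user_images` is the field named on this page.
// `query` is a hypothetical name for the text field.
interface SearchRequestBody {
  query: string;
  user_images: string[];
}

function buildSearchBody(query: string, images: Uint8Array[]): SearchRequestBody {
  if (images.length > 4) {
    throw new Error("At most 4 images per search");
  }
  return { query, user_images: images.map((bytes) => toBase64(bytes)) };
}
```

Check the API Reference for the actual field names before relying on this shape.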
Using Fetch with File Input
Upload and Search
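The original upload-and-search snippet is missing here, so the following is an illustrative sketch only. It posts a text query and already-encoded images to /api/search as JSON. The function name `searchWithImages`, the `HttpPost` type, the `query` field, and the JSON content type are assumptions. The HTTP function is passed in as a parameter so the sketch can run outside a browser.

```typescript
// Minimal shape of the parts of fetch used here (hypothetical helper types).
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpPost = (
  url: string,
  init: { method: string; headers: { [name: string]: string }; body: string }
) => Promise<HttpResponse>;

// Posts a text query plus base64 image strings and resolves with the parsed JSON.
async function searchWithImages(
  post: HttpPost,
  query: string,
  userImages: string[]
): Promise<unknown> {
  const response = await post("/api/search", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // `query` is a hypothetical field name; `user_images` comes from this page.
    body: JSON.stringify({ query, user_images: userImages }),
  });
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

In a browser, the global `fetch` can typically be passed as `post`, and the bytes of each file chosen in a file input can be read with `file.arrayBuffer()` before encoding.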
Use Cases
Visual Identification
Upload photos of plants, animals, objects, or landmarks to identify them and learn more.
Example: “What species of bird is this?” + image of a bird
Product Research
Search for products similar to ones you’ve photographed or saved.
Example: “Where can I buy this style of furniture?” + image of furniture
Document Analysis
Upload screenshots, diagrams, or documents for analysis and explanation.
Example: “Explain this architecture diagram” + diagram image
Troubleshooting
Share error messages, UI issues, or hardware problems visually.
Example: “How do I fix this error?” + screenshot of error
Art & Design
Analyze artwork, design patterns, or creative works for inspiration.
Example: “What art movement is this?” + image of painting
Image Processing
When you upload images, the following process occurs:
Client-Side Processing
Images are processed entirely in the browser:
client/src/hooks/useImageUpload.ts
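The source of the hook is not reproduced on this page. As an illustration only, assume the browser encodes each file with the standard `FileReader.readAsDataURL` API, which yields a string of the form `data:<mime>;base64,<payload>`. The hypothetical helper below splits such a string into its MIME type and base64 payload. Whether the real hook does this is an assumption.

```typescript
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Splits "data:<mime>;base64,<payload>" into its MIME type and payload.
// Returns null for anything that is not a base64 data URL.
function parseDataUrl(dataUrl: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (match === null) {
    return null;
  }
  return { mimeType: match[1] || "", base64: match[2] || "" };
}
```

Keeping the MIME type alongside the payload is useful when a later step needs to know the image format.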
Server-Side Integration
The server includes images in the chat history when creating a Gemini session:
server/routes.ts (Line 398-423)
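The referenced lines of server/routes.ts are not shown here. As a rough sketch of the idea, the function below builds one user turn that holds the query text followed by each image as an inline-data part. The local `Part` and `Content` types mirror the text and `inlineData` part shapes used by the Gemini API. The function name `buildUserTurn` and the `UploadedImage` type are hypothetical, and the actual server code may structure the history differently.

```typescript
// Local types mirroring the text / inlineData part shapes used by the Gemini API.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface Content {
  role: "user" | "model";
  parts: Part[];
}

// Hypothetical representation of one uploaded image on the server.
interface UploadedImage {
  mimeType: string; // e.g. "image/jpeg"
  base64: string; // payload without any "data:...;base64," prefix
}

// Builds a single user turn: the query text first, then one part per image.
function buildUserTurn(query: string, images: UploadedImage[]): Content {
  const parts: Part[] = [{ text: query }];
  for (const image of images) {
    parts.push({ inlineData: { mimeType: image.mimeType, data: image.base64 } });
  }
  return { role: "user", parts };
}
```

A turn built this way could be placed in the history array supplied when the chat session is created.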
Limitations
Best Practices
Clear Images
Use high-quality, well-lit images for best results
Relevant Context
Ensure images directly relate to your text query
Descriptive Queries
Combine images with clear, descriptive text
Appropriate Size
Resize large images to improve upload speed
Privacy & Security
Images are processed in real-time and are not stored on the server. All image data is:
- Transmitted securely via HTTPS
- Processed only for the current search session
- Discarded after the response is generated
- Not logged or saved to disk
Troubleshooting
Upload fails with 'too many files' error
You can upload at most 4 images per search. Remove some images and try again.
'Only image files are allowed' error
Ensure your files are JPEG, PNG, GIF, or WebP format. Other file types are not supported.
Images not appearing in search results
Check that your query references the images. Try rephrasing like “Based on this image…” or “What is shown in this photo?”
Slow upload times
Large images take longer to upload. Consider resizing images to under 2MB for faster uploads.
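Because images are sent base64-encoded, the data actually transmitted is about one third larger than the file on disk: every 3 bytes become 4 characters. The sketch below shows that arithmetic and a simple size check. The helper names are hypothetical, and reading "2MB" as 2 × 1024 × 1024 bytes is an assumption.

```typescript
const TWO_MB = 2 * 1024 * 1024; // 2,097,152 bytes

// Length in characters of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// True when the raw file is no larger than the suggested 2 MB.
function isWithinSuggestedSize(byteLength: number): boolean {
  return byteLength <= TWO_MB;
}
```

For example, a file of exactly 2,097,152 bytes encodes to 2,796,204 base64 characters.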
Related Features
Search Modes
Choose the right mode for your multimodal search
API Reference
Complete API documentation for image uploads