## Default Model

Invoice OCR defaults to `google/gemini-2.0-flash` when `OPENROUTER_MODEL` is not set.
### Why Gemini 2.0 Flash?

- Fast processing (typically 1-3 seconds per invoice)
- Excellent accuracy for structured documents
- Cost-effective at ~$0.003 per invoice
- Native multimodal support (no image-to-text conversion step)
- Large context window (1M tokens)
## Model Comparison

### Recommended Models

#### `google/gemini-2.0-flash`

Best for: Production use, high volume

- Speed: Fast (1-3s)
- Accuracy: High
- Cost: ~$0.003/invoice
- Context: 1M tokens

Use it for:

- Processing hundreds of invoices
- Real-time API integrations
- Cost-sensitive applications
#### `openai/gpt-4o`

Best for: Maximum accuracy

- Speed: Medium (3-5s)
- Accuracy: Highest
- Cost: ~$0.03/invoice
- Context: 128k tokens

Use it for:

- Complex invoices with handwriting
- Multi-page PDFs with tables
- Critical financial documents
#### `anthropic/claude-3.5-sonnet`

Best for: Complex reasoning

- Speed: Medium (3-5s)
- Accuracy: Very High
- Cost: ~$0.04/invoice
- Context: 200k tokens

Use it for:

- Invoices with complex tax calculations
- Multi-currency documents
- Detailed reconciliation needs
#### `openai/gpt-4o-mini`

Best for: Budget-conscious development

- Speed: Very Fast (under 2s)
- Accuracy: Good
- Cost: ~$0.001/invoice
- Context: 128k tokens

Use it for:

- Development and testing
- Simple invoices
- High-volume, low-stakes use cases
## Performance Matrix

| Model | Speed | Accuracy | Cost (avg) | Best For |
|---|---|---|---|---|
| `google/gemini-2.0-flash` | ⚡⚡⚡ | ⭐⭐⭐⭐ | $0.002 | Production default |
| `openai/gpt-4o` | ⚡⚡ | ⭐⭐⭐⭐⭐ | $0.020 | Maximum accuracy |
| `openai/gpt-4o-mini` | ⚡⚡⚡⚡ | ⭐⭐⭐ | $0.001 | Budget/testing |
| `anthropic/claude-3.5-sonnet` | ⚡⚡ | ⭐⭐⭐⭐⭐ | $0.025 | Complex reasoning |
| `anthropic/claude-3-opus` | ⚡ | ⭐⭐⭐⭐⭐ | $0.040 | Highest accuracy |
| `anthropic/claude-3-haiku` | ⚡⚡⚡⚡ | ⭐⭐⭐ | $0.001 | Fast & cheap |
Costs are approximate and based on typical invoice processing (1-2 page documents with 500-2000 tokens of output). Actual costs vary based on document complexity, page count, and output verbosity.
## Choosing a Model

### By Use Case

#### Production API (High Volume)

Recommended: `google/gemini-2.0-flash`

Why:

- Processes 1000+ invoices/day cost-effectively
- Fast enough for real-time user experiences
- High accuracy for standard invoices
- Reliable and well-supported

`.env.local`:
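Using the `OPENROUTER_MODEL` variable described above:

```bash
OPENROUTER_MODEL=google/gemini-2.0-flash
```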
#### Critical Financial Documents

Recommended: `openai/gpt-4o` or `anthropic/claude-3.5-sonnet`

Why:

- Highest accuracy for complex invoices
- Better handling of edge cases
- More robust number extraction
- Strong reasoning for tax calculations

`.env.local`:
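Set `OPENROUTER_MODEL` to either recommended model:

```bash
OPENROUTER_MODEL=openai/gpt-4o
# or: OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
```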
#### Development & Testing

Recommended: `openai/gpt-4o-mini`

Why:

- Very low cost during development
- Fast iteration cycles
- Good enough for testing workflows
- Upgrade to a production model when deploying

`.env.local`:
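Via `OPENROUTER_MODEL`:

```bash
OPENROUTER_MODEL=openai/gpt-4o-mini
```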
#### Complex Multi-Page Invoices

Recommended: `anthropic/claude-3.5-sonnet`

Why:

- 200k token context window
- Excellent multi-page reasoning
- Strong table extraction
- Detailed reconciliation capabilities

`.env.local`:
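Via `OPENROUTER_MODEL`:

```bash
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
```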
#### Scanned/Low-Quality Images

Recommended: `openai/gpt-4o`

Why:

- Best OCR capabilities
- Handles blurry or rotated images
- Good with handwritten notes
- Robust to image quality issues

`.env.local`:
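Via `OPENROUTER_MODEL`:

```bash
OPENROUTER_MODEL=openai/gpt-4o
```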
### By Document Type
| Document Type | Recommended Model | Reason |
|---|---|---|
| Standard tax invoices | gemini-2.0-flash | Fast, accurate, cost-effective |
| GST invoices (India) | gemini-2.0-flash | Native handling of complex schemas |
| Multi-page invoices | claude-3.5-sonnet | Large context, strong reasoning |
| Handwritten invoices | gpt-4o | Best OCR capabilities |
| Scanned PDFs | gpt-4o | Robust to image quality |
| Simple receipts | gpt-4o-mini | Overkill to use expensive models |
| International invoices | claude-3.5-sonnet | Multi-language, multi-currency |
## Cost Optimization

### Strategies

#### Use Cheaper Models for Simple Invoices

Route simple invoices to `gpt-4o-mini` and complex ones to `gpt-4o`.

#### Use Gemini for Production

`gemini-2.0-flash` offers the best cost/performance ratio:

- 5-10x cheaper than GPT-4o
- Similar accuracy for structured documents
- Faster processing
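The routing strategy above can be sketched as a small helper. This is illustrative only: the `InvoiceJob` shape and the page-count thresholds are assumptions, not part of the project's API.

```typescript
// Sketch of cost-based model routing. The InvoiceJob shape and the
// thresholds below are illustrative, not part of the project's API.
interface InvoiceJob {
  pages: number;
  failedValidation: boolean; // a previous cheap pass failed validation
}

function chooseModel(job: InvoiceJob): string {
  // Retry failures on the most accurate model.
  if (job.failedValidation) return "openai/gpt-4o";
  // Long documents benefit from a large-context, strong-reasoning model.
  if (job.pages > 5) return "anthropic/claude-3.5-sonnet";
  // Single-page, simple documents can go to the cheapest model.
  if (job.pages === 1) return "openai/gpt-4o-mini";
  // Default: best cost/performance ratio.
  return "google/gemini-2.0-flash";
}
```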
#### Monitor Usage

Track costs per model in the OpenRouter dashboard:
- Go to openrouter.ai/activity
- Filter by model
- Identify expensive requests
- Optimize or switch models
### Cost Estimation

Example: Processing 1000 invoices/month

| Model | Cost per Invoice | Monthly Cost |
|---|---|---|
| `gpt-4o-mini` | $0.001 | $1 |
| `gemini-2.0-flash` | $0.002 | $2 |
| `gemini-pro` | $0.005 | $5 |
| `gpt-4o` | $0.020 | $20 |
| `claude-3.5-sonnet` | $0.025 | $25 |
| `claude-3-opus` | $0.040 | $40 |
Hybrid approach: Use `gemini-2.0-flash` for 90% of invoices and `gpt-4o` for the 10% that fail validation. This gives you high accuracy at ~$4/month for 1000 invoices.

## Per-Request Model Override

You can override the default model on a per-request basis:

- Use different models for different invoice types
- Retry failed extractions with a more powerful model
- A/B test models for accuracy
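A sketch of what building such a request might look like from a client. The `model` field in the request body is an assumption here; check the API Reference for the exact request shape your deployment expects.

```typescript
// Sketch: per-request model override. The `model` request field is an
// assumption — consult the API reference for the exact request shape.
function buildExtractionRequest(fileBase64: string, model?: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // When `model` is omitted, the server falls back to OPENROUTER_MODEL.
    body: JSON.stringify(model ? { file: fileBase64, model } : { file: fileBase64 }),
  };
}

// Example: retry a failed extraction with a more powerful model.
const retry = buildExtractionRequest("JVBERi0...", "openai/gpt-4o");
```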
## Model Availability

Check available models in real-time:

### Via OpenRouter Dashboard

Browse the full catalog at openrouter.ai/models.
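The same catalog is also available programmatically via OpenRouter's public `GET https://openrouter.ai/api/v1/models` endpoint, which returns an object with a `data` array of models. A minimal sketch of filtering it for the providers discussed here:

```typescript
// Sketch: list OpenRouter model IDs matching a provider prefix.
// Assumes the /api/v1/models response shape: { data: [{ id: string, ... }] }.
interface ModelList {
  data: { id: string }[];
}

function filterModels(list: ModelList, prefix: string): string[] {
  return list.data.map((m) => m.id).filter((id) => id.startsWith(prefix));
}

// Usage against the live API (requires network access):
// const list: ModelList = await (await fetch("https://openrouter.ai/api/v1/models")).json();
// filterModels(list, "google/");
```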
## PDF-Specific Considerations

### PDF Engine Selection

The PDF parsing engine can impact model performance:

- `pdf-text`: use with any model for digital PDFs
- `mistral-ocr`: use with `gemini-2.0-flash` or `gpt-4o` for scanned PDFs
- `native`: let the model provider handle PDF parsing

`.env.local`:
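If the engine is selected via `.env.local`, the entry might look like the following. The variable name `PDF_ENGINE` is purely illustrative (the actual name isn't shown on this page), so verify it against the Environment Variables docs.

```bash
# Hypothetical variable name — verify against the Environment Variables docs
PDF_ENGINE=mistral-ocr
```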
### Multi-Page PDFs

Models with larger context windows handle multi-page invoices better:

| Pages | Recommended Model | Context Needed |
|---|---|---|
| 1-2 | Any model | ~4k tokens |
| 3-5 | gemini-2.0-flash | ~20k tokens |
| 6-10 | claude-3.5-sonnet | ~50k tokens |
| 10+ | gemini-2.0-flash | ~100k+ tokens |
## Troubleshooting

### Model not found error

Error: "Model not found" or "Invalid model"

Solutions:

- Check model ID spelling (case-sensitive)
- Verify the model is available at openrouter.ai/models
- Try the `/api/models` endpoint to see available vision models
- Use a known-good model like `google/gemini-2.0-flash`
### Poor extraction quality

Problem: Model not extracting data correctly

Solutions:

- Try a more powerful model (`gpt-4o` or `claude-3.5-sonnet`)
- Check image quality (resolution, rotation, clarity)
- Use the `mistral-ocr` PDF engine for scanned documents
- Review the raw OCR output via `/api/ocr` to diagnose issues
### Slow processing times

Problem: API requests taking too long

Solutions:

- Switch to a faster model (`gpt-4o-mini`, `gemini-2.0-flash`)
- Reduce PDF page count (extract relevant pages only)
- Use the `pdf-text` engine instead of `mistral-ocr`
- Check OpenRouter status: status.openrouter.ai
### High costs

Problem: API costs higher than expected

Solutions:

- Switch the default model to `gemini-2.0-flash` or `gpt-4o-mini`
- Implement caching for repeated PDFs
- Monitor usage per model in the OpenRouter dashboard
- Set credit limits on API keys
## Next Steps

- Environment Variables: configure `OPENROUTER_MODEL` in `.env.local`
- API Reference: learn how to override models per-request
- Quick Start: start processing invoices
- OpenRouter Dashboard: monitor usage and costs
