Overview
Invoice OCR offers three OCR modes, each optimized for a different use case. Understanding when to use each mode helps you balance accuracy, performance, and debugging needs.
- Raw Text: Plain text extraction for debugging
- Structured: Legacy schema with basic reconciliation
- Structured V4: Advanced India GST with multi-pass reconciliation
Raw Text Mode
When to Use
- Debugging: Verify what the model actually sees before structuring
- Quick inspection: Check if text is being extracted correctly from images/PDFs
- Custom parsing: Build your own structured extraction pipeline
API Endpoint
Request Format
- Image
- PDF
- PDF URL
Response
Example Usage
Raw OCR mode preserves line breaks and spacing but does not structure the data. Use this to verify extraction quality before attempting structured parsing.
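As an illustration of the custom-parsing use case, the sketch below scans raw OCR text for GSTIN-shaped tokens. It assumes you already hold the raw text as a Python string; `find_gstins` and the sample invoice text are hypothetical, and the pattern checks only the GSTIN layout, not its check character.

```python
import re

# GSTIN layout: 2-digit state code, 10-character PAN, entity code, 'Z', check character.
# This only tests the shape of the token; it does not validate the check character.
GSTIN_PATTERN = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")


def find_gstins(raw_text: str) -> list[str]:
    """Return GSTIN-shaped tokens in order of first appearance, without duplicates."""
    found: list[str] = []
    for match in GSTIN_PATTERN.finditer(raw_text.upper()):
        if match.group() not in found:
            found.append(match.group())
    return found


# Hypothetical raw text; line breaks are kept, as raw mode preserves them.
sample = (
    "ACME TRADERS\n"
    "GSTIN: 27AAPFU0939F1ZV\n"
    "Bill To GSTIN 29AAGCB7383J1Z4\n"
    "Total 1,180.00"
)
print(find_gstins(sample))
```

If an expected GSTIN is missing from the result, that is a hint the text extraction itself needs attention before structured parsing is attempted.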
Structured Mode
When to Use
- Simple invoices: Basic line items without complex discounts
- Legacy integration: Existing systems using the compact schema
- After-tax discounts: Invoices where discounts apply after GST calculation
API Endpoint
Schema
The legacy structured mode returns a compact schema with three main sections:
Reconciliation Logic
The structured mode tries multiple assumptions to match the printed total:
Discount Mode Assumptions
- Before Tax: Voucher discount reduces taxable amount before GST calculation
- After Tax: Voucher discount applies to the final amount after GST
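The two discount assumptions can be compared with a small worked calculation. This is a minimal sketch, not the service's actual reconciliation code: the function names, the single GST rate, and the rounding to two decimals are all assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def total_for_mode(taxable: Decimal, gst_rate: Decimal, discount: Decimal, mode: str) -> Decimal:
    """Compute the invoice total under one discount assumption."""
    if mode == "before_tax":
        # Discount reduces the taxable amount, then GST is applied.
        total = (taxable - discount) * (1 + gst_rate)
    elif mode == "after_tax":
        # GST is applied first, then the discount is subtracted from the final amount.
        total = taxable * (1 + gst_rate) - discount
    else:
        raise ValueError(f"unknown mode: {mode}")
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def best_discount_mode(taxable: Decimal, gst_rate: Decimal, discount: Decimal,
                       printed_total: Decimal) -> tuple[str, Decimal]:
    """Pick the assumption whose computed total is closest to the printed total."""
    candidates = {
        mode: total_for_mode(taxable, gst_rate, discount, mode)
        for mode in ("before_tax", "after_tax")
    }
    mode = min(candidates, key=lambda m: abs(candidates[m] - printed_total))
    return mode, candidates[mode]


# Taxable 1000 at 18% GST with a 100 voucher:
#   before tax: (1000 - 100) * 1.18 = 1062.00
#   after tax:  1000 * 1.18 - 100   = 1080.00
print(best_discount_mode(Decimal("1000"), Decimal("0.18"), Decimal("100"), Decimal("1062.00")))
```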
Charges Interpretation
Additional charges (Freight, Packing, etc.) can be:
- Inclusive: `amount` includes GST → derive taxable and tax amounts
- Exclusive: `amount` is pre-tax → add GST on top
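The arithmetic behind the inclusive and exclusive interpretations can be sketched as follows. `split_charge` is a hypothetical helper, and rounding to two decimals with half-up is an assumption rather than documented behavior.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_charge(amount: Decimal, gst_rate: Decimal, inclusive: bool) -> tuple[Decimal, Decimal]:
    """Return (taxable, tax) for one additional charge line."""
    if inclusive:
        # The amount already contains GST: back out the taxable part.
        taxable = (amount / (1 + gst_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
        tax = amount - taxable
    else:
        # The amount is pre-tax: GST is added on top.
        taxable = amount
        tax = (amount * gst_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return taxable, tax


# A freight charge of 118 at 18% GST under each interpretation.
print(split_charge(Decimal("118"), Decimal("0.18"), inclusive=True))   # taxable 100.00, tax 18.00
print(split_charge(Decimal("118"), Decimal("0.18"), inclusive=False))  # taxable 118, tax 21.24
```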
Response Example
Structured V4 Mode
When to Use
- India GST compliance: Full HSN/SAC support with CGST/SGST/IGST splits
- Complex discounts: Multiple cascading discounts, flat discounts, header discounts
- Multi-page PDFs: Process invoices spanning multiple pages with tax tables
- High accuracy: Multi-pass reconciliation with printed anchor matching
API Endpoint
Key Features
- Multi-Pass Reconciliation: Tests multiple price mode interpretations (WITH_TAX, WITHOUT_TAX) and picks the best match
- HSN Tax Table Anchoring: Scales item buckets to match printed per-rate taxable values when an HSN table exists
- Smart Discount Allocation: Allocates header discounts across GST rate buckets to match the printed GST total
- CGST/SGST/IGST Split: Automatically determines the intra-state vs. inter-state tax split based on GSTIN state codes
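The state-code rule behind the CGST/SGST/IGST split can be illustrated with a short sketch. It relies on the fact that the first two characters of a GSTIN are the state code; `split_gst` is a hypothetical helper, and it deliberately ignores special cases such as a place of supply that differs from the recipient's registration, which the real engine may treat differently.

```python
from decimal import Decimal, ROUND_HALF_UP

ZERO = Decimal("0.00")


def split_gst(total_tax: Decimal, supplier_gstin: str, recipient_gstin: str) -> dict[str, Decimal]:
    """Split a GST amount into CGST/SGST (same state) or IGST (different states)."""
    # The first two characters of a GSTIN are the state code.
    if supplier_gstin[:2] == recipient_gstin[:2]:
        cgst = (total_tax / 2).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        # SGST takes the remainder so the two halves always sum to total_tax.
        return {"cgst": cgst, "sgst": total_tax - cgst, "igst": ZERO}
    return {"cgst": ZERO, "sgst": ZERO, "igst": total_tax}


# Same state code (27): tax is shared between CGST and SGST.
print(split_gst(Decimal("180.00"), "27AAPFU0939F1ZV", "27AAGCB7383J1Z4"))
# Different state codes (27 vs 29): the whole amount is IGST.
print(split_gst(Decimal("180.00"), "27AAPFU0939F1ZV", "29AAGCB7383J1Z4"))
```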
Schema
The V4 schema provides a comprehensive invoice representation:
Request Format
- Basic
- With Annotations
- Custom PDF Engine
Response Example
Mode Comparison
| Feature | Raw Text | Structured | Structured V4 |
|---|---|---|---|
| Text Extraction | ✅ | ✅ | ✅ |
| JSON Schema | ❌ | ✅ (Compact) | ✅ (Normalized) |
| Reconciliation | ❌ | ✅ (Basic) | ✅ (Multi-pass) |
| India GST Support | ❌ | ⚠️ (Partial) | ✅ (Full) |
| HSN/SAC Codes | ❌ | ✅ | ✅ + Table anchoring |
| CGST/SGST/IGST Split | ❌ | ❌ | ✅ (Automatic) |
| Multiple Discounts | ❌ | ❌ | ✅ (Sequential) |
| Header Discounts | ❌ | ✅ | ✅ + Smart allocation |
| Multi-page PDFs | ✅ | ⚠️ | ✅ (Optimized) |
| Confidence Scores | ❌ | ❌ | ✅ |
| Use Case | Debugging | Simple invoices | Production GST |
Best Practices
Start with Raw Text for Unknown Formats
When processing invoices from a new vendor or format:
- Call `/api/ocr` to verify text extraction quality
- Check if all important fields are visible in the raw text
- If text is missing, try different PDF engines (`pdf-text`, `mistral-ocr`, `native`)
- Once satisfied, switch to structured V4 for production
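The engine-by-engine check in the steps above can be sketched as a small loop. How a PDF engine is selected in the request is not shown in this section, so the sketch takes the extraction call as a parameter; `pick_engine`, `fake_extract`, and the sample text are hypothetical stand-ins, and only the engine names come from this page.

```python
from typing import Callable, Iterable, Optional


def pick_engine(extract: Callable[[str], str], engines: Iterable[str],
                required: Iterable[str]) -> Optional[str]:
    """Return the first engine whose raw text contains every required term."""
    terms = [term.lower() for term in required]
    for engine in engines:
        text = extract(engine).lower()
        if all(term in text for term in terms):
            return engine
    return None  # no engine surfaced all required fields


def fake_extract(engine: str) -> str:
    """Stand-in for a real raw-text request; replace with your own API call."""
    canned = {
        "pdf-text": "",  # e.g. a scanned PDF with no embedded text layer
        "mistral-ocr": "Invoice No 42\nGSTIN 27AAPFU0939F1ZV\nTotal 1180.00",
    }
    return canned.get(engine, "")


print(pick_engine(fake_extract, ["pdf-text", "mistral-ocr", "native"], ["GSTIN", "Total"]))
```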
Use Structured V4 for Production
For India GST invoices in production:
- Always use `/api/ocr-structured-v4`
- Provides the most accurate reconciliation
- Handles edge cases like HSN tax tables and multi-discount structures
- Returns confidence scores for validation
Leverage Annotations for PDFs
When processing large PDFs:
- First request parses the PDF (slow, costs more)
- OpenRouter returns `annotations` in the response
- Subsequent requests with the same PDF + annotations skip re-parsing
- Significant cost and latency savings for iterative processing
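One way to reuse annotations across requests is a small cache keyed by the PDF's content hash. This is a sketch under stated assumptions: the `AnnotationCache` class is hypothetical, and it assumes the response is a dictionary with a top-level `annotations` key, which may not match the actual response shape.

```python
import hashlib
from typing import Any, Optional


class AnnotationCache:
    """In-memory cache of annotations, keyed by the SHA-256 of the PDF bytes."""

    def __init__(self) -> None:
        self._store: dict[str, Any] = {}

    @staticmethod
    def key_for(pdf_bytes: bytes) -> str:
        return hashlib.sha256(pdf_bytes).hexdigest()

    def get(self, pdf_bytes: bytes) -> Optional[Any]:
        """Return cached annotations for this exact PDF, or None on a miss."""
        return self._store.get(self.key_for(pdf_bytes))

    def remember(self, pdf_bytes: bytes, response: dict) -> None:
        """Store annotations from a response (assumed top-level 'annotations' key)."""
        annotations = response.get("annotations")
        if annotations is not None:
            self._store[self.key_for(pdf_bytes)] = annotations


cache = AnnotationCache()
pdf = b"%PDF-1.7 example bytes"
print(cache.get(pdf))                                   # None: first request must parse
cache.remember(pdf, {"annotations": [{"type": "file"}]})
print(cache.get(pdf))                                   # cached value for the next request
```

Hashing the bytes rather than the filename means a re-uploaded or renamed copy of the same PDF still hits the cache, while an edited file does not.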
Monitor Reconciliation Errors
Check `reconciliation.error_absolute` in V4 responses:
- ≤ ₹0.05: Perfect match, high confidence
- ₹0.05 to ₹1.00: Acceptable, likely rounding differences
- > ₹1.00: Review required, possible extraction issues
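These thresholds translate directly into a triage function. The sketch below is illustrative: `classify_error` and its return labels are hypothetical, and it assumes the error value has already been read from the response as a decimal amount in rupees.

```python
from decimal import Decimal


def classify_error(error_absolute: Decimal) -> str:
    """Map a reconciliation error (in rupees) onto the three bands listed above."""
    error = abs(error_absolute)  # defensive: treat the value as a magnitude
    if error <= Decimal("0.05"):
        return "match"        # perfect match, high confidence
    if error <= Decimal("1.00"):
        return "acceptable"   # likely rounding differences
    return "review"           # possible extraction issues


for value in ("0.00", "0.05", "0.40", "1.00", "12.50"):
    print(value, classify_error(Decimal(value)))
```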
Use `reconciliation.alternates_considered` to debug why specific interpretations were chosen.
Related Topics
- Reconciliation Engine: Deep dive into V4 reconciliation logic
- PDF Support: Configure PDF parsing engines and optimize performance
