The /api/ocr-structured endpoint extracts invoice data into a structured JSON format compatible with MyBillBook. It includes automatic reconciliation to verify calculations and detect mismatches.

Endpoint

POST /api/ocr-structured

Request Body

imageBase64
string
required
Base64-encoded image data or a data URL. Formats:
  • Bare base64: iVBORw0KGgoAAAANS...
  • Data URL: data:image/png;base64,iVBORw0KGgo...
mimeType
string
default:"image/png"
MIME type of the image (e.g., image/png, image/jpeg, image/webp)
model
string
Override the default model. Examples: openai/gpt-4o, anthropic/claude-3.5-sonnet
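
As a rough illustration of the request body described above, the following sketch assembles the three documented fields and leaves out the optional ones when they are not supplied. The interface name `OcrStructuredRequest` and the helper `buildOcrRequest` are hypothetical and not part of the API; only the field names come from this page.

```typescript
// Shape of the documented request body. Only imageBase64 is required.
interface OcrStructuredRequest {
  imageBase64: string; // bare base64 or a data URL
  mimeType?: string;   // the server defaults to "image/png" when omitted
  model?: string;      // overrides the default model
}

// Hypothetical helper (not part of the API): builds the body and fails early
// on an empty image, which the endpoint would otherwise reject with a 400.
function buildOcrRequest(
  image: string,
  opts: { mimeType?: string; model?: string } = {}
): OcrStructuredRequest {
  const imageBase64 = image.trim();
  if (imageBase64 === "") {
    throw new Error("imageBase64 must not be empty");
  }
  const body: OcrStructuredRequest = { imageBase64 };
  if (opts.mimeType) body.mimeType = opts.mimeType;
  if (opts.model) body.model = opts.model;
  return body;
}
```

The result can be serialized with `JSON.stringify` and sent as the body of a POST with `Content-Type: application/json`.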

Response Schema

The endpoint returns a structured invoice document with three main sections:

Voucher Object

voucher
object
Invoice header information and totals

Items Array

items
array
Line items extracted from the invoice

Party Object

party
object
Customer/buyer information

Example Requests

curl -X POST https://your-domain.com/api/ocr-structured \
  -H "Content-Type: application/json" \
  -d '{
    "imageBase64": "data:image/png;base64,iVBORw0KGgo...",
    "model": "openai/gpt-4o-mini"
  }'

Example Response

Success (200)
{
  "voucher": {
    "invoice_number": "INV-2024-001",
    "invoice_date": "15-01-2024",
    "invoice_due_date": "30-01-2024",
    "invoice_discount": "0",
    "invoice_discount_mode": "",
    "round_off": "0.00",
    "total_invoice_amount": "12390.00",
    "additional_charges": [
      {
        "name": "Freight",
        "amount": "500.00",
        "tax_rate": "18",
        "amount_includes_tax": false
      }
    ],
    "reconciliation": {
      "status": "matched",
      "method": {
        "discount_mode": "before_tax",
        "charges_inclusive": false
      },
      "computed_grand_total": "12390.00",
      "difference": "0.00",
      "items_taxable_total": "10000.00",
      "items_tax_total": "1800.00",
      "charges_taxable_total": "500.00",
      "charges_tax_total": "90.00"
    }
  },
  "items": [
    {
      "name": "Premium Widget",
      "hsn_sac_code": "8517",
      "quantity": "10",
      "unit": "PCS",
      "price": "1000.00",
      "discount_rate": "0",
      "tax_rate": "18"
    }
  ],
  "party": {
    "party_name": "Acme Corporation",
    "party_gstin_number": "27AABCU9603R1ZM",
    "party_pan_number": "AABCU9603R",
    "party_address": "123 Business Street",
    "party_city": "Mumbai",
    "party_pincode": "400001"
  }
}
Error (400)
{
  "error": "Missing 'imageBase64' in request body"
}
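
The two response shapes above could be typed on the client roughly as follows. This is a sketch inferred only from the example payloads on this page: the interface names and the `isOcrError` guard are hypothetical, and the real schema may contain fields that the examples do not show.

```typescript
interface AdditionalCharge {
  name: string;
  amount: string;
  tax_rate: string;
  amount_includes_tax: boolean;
}

interface Reconciliation {
  status: "matched" | "unmatched";
  method: { discount_mode: string; charges_inclusive: boolean };
  computed_grand_total: string;
  difference: string;
  items_taxable_total: string;
  items_tax_total: string;
  charges_taxable_total: string;
  charges_tax_total: string;
}

interface Voucher {
  invoice_number: string;
  invoice_date: string;      // dd-mm-yyyy
  invoice_due_date: string;  // dd-mm-yyyy
  invoice_discount: string;
  invoice_discount_mode: string;
  round_off: string;
  total_invoice_amount: string;
  additional_charges: AdditionalCharge[];
  reconciliation: Reconciliation;
}

interface InvoiceItem {
  name: string;
  hsn_sac_code: string;
  quantity: string;
  unit: string;
  price: string;
  discount_rate: string;
  tax_rate: string;
}

interface Party {
  party_name: string;
  party_gstin_number: string;
  party_pan_number: string;
  party_address: string;
  party_city: string;
  party_pincode: string;
}

interface OcrStructuredResponse {
  voucher: Voucher;
  items: InvoiceItem[];
  party: Party;
}

interface OcrErrorResponse {
  error: string;
}

// Hypothetical type guard: treats any object with a string "error" field
// as the error shape shown in the 400 example.
function isOcrError(body: unknown): body is OcrErrorResponse {
  return (
    typeof body === "object" &&
    body !== null &&
    typeof (body as { error?: unknown }).error === "string"
  );
}
```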

Reconciliation Logic

The endpoint automatically runs reconciliation using lib/invoice.ts:
  1. Parse extracted JSON from model response
  2. Compute line item totals: taxable = quantity × price × (1 − discount_rate / 100); tax = taxable × tax_rate / 100; total = taxable + tax
  3. Apply voucher discount before or after tax based on invoice_discount_mode
  4. Add charges with their GST
  5. Apply round-off adjustment
  6. Compare computed total with total_invoice_amount
  7. Set status: "matched" if the absolute difference is ≤ 0.05, else "unmatched"
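
The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in lib/invoice.ts: it assumes `invoice_discount` is an absolute amount, that `invoice_discount_mode` is either "before_tax" or "after_tax", and that a before-tax discount reduces taxable value and tax proportionally. The names `reconcile`, `num`, and `round2` are hypothetical.

```typescript
interface Item { quantity: string; price: string; discount_rate: string; tax_rate: string; }
interface Charge { amount: string; tax_rate: string; amount_includes_tax: boolean; }
interface VoucherInput {
  invoice_discount: string;
  invoice_discount_mode: string; // assumed: "before_tax" | "after_tax" | ""
  round_off: string;
  total_invoice_amount: string;
  additional_charges: Charge[];
}
interface ReconcileResult {
  status: "matched" | "unmatched";
  computed_grand_total: string;
  difference: string;
}

// Empty strings mean "unknown" in the schema; treat them as zero here.
const num = (s: string): number => (s.trim() === "" ? 0 : Number(s));
const round2 = (n: number): number => Math.round(n * 100) / 100;

function reconcile(voucher: VoucherInput, items: Item[]): ReconcileResult {
  // Step 2: line item taxable value and tax.
  let taxable = 0;
  let tax = 0;
  for (const it of items) {
    const lineTaxable =
      num(it.quantity) * num(it.price) * (1 - num(it.discount_rate) / 100);
    taxable += lineTaxable;
    tax += (lineTaxable * num(it.tax_rate)) / 100;
  }

  // Step 3: voucher discount (assumed to be an absolute amount).
  const discount = num(voucher.invoice_discount);
  let itemsTotal: number;
  if (voucher.invoice_discount_mode === "after_tax") {
    itemsTotal = taxable + tax - discount;
  } else {
    // Before tax: shrink taxable value and tax by the same factor.
    const factor = taxable > 0 ? (taxable - discount) / taxable : 1;
    itemsTotal = (taxable + tax) * factor;
  }

  // Step 4: charges, adding GST only when the amount is tax-exclusive.
  let chargesTotal = 0;
  for (const c of voucher.additional_charges) {
    const amount = num(c.amount);
    chargesTotal += c.amount_includes_tax
      ? amount
      : amount * (1 + num(c.tax_rate) / 100);
  }

  // Steps 5 to 7: round-off, comparison, status.
  const computed = round2(itemsTotal + chargesTotal + num(voucher.round_off));
  const difference = round2(computed - num(voucher.total_invoice_amount));
  return {
    status: Math.abs(difference) <= 0.05 ? "matched" : "unmatched",
    computed_grand_total: computed.toFixed(2),
    difference: difference.toFixed(2),
  };
}
```

For ten units at 1000.00 with 18% tax plus a tax-exclusive 500.00 freight charge at 18%, the sketch computes 10000 + 1800 + 590 = 12390.00.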

Schema Constraints

The system prompt enforces these rules:
  • Numbers: Strings without commas (e.g., "1234.50")
  • Dates: dd-mm-yyyy format (the model converts dates printed in other formats)
  • Unknown values: Empty string ""
  • Tax-exclusive prices: item.price must be pre-GST; model converts if needed
  • Discount combination: Multiple discounts combined multiplicatively
  • Round-off: Typically ±1.00 unless explicitly printed
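
A client may want to check that a response actually follows these rules before importing it. The helpers below are a hypothetical sketch of such checks, plus the arithmetic behind multiplicative discount combination. None of these functions exist in the API, and the regular expressions are only one possible reading of the rules.

```typescript
// Numbers are strings without commas; an empty string means "unknown".
const isSchemaNumber = (s: string): boolean =>
  s === "" || /^-?\d+(\.\d+)?$/.test(s);

// Dates use dd-mm-yyyy; an empty string means "unknown".
const isSchemaDate = (s: string): boolean =>
  s === "" || /^\d{2}-\d{2}-\d{4}$/.test(s);

// Multiplicative combination: each discount applies to what remains after
// the previous one, so 10% then 5% leaves 0.90 × 0.95 = 0.855 of the price.
function combineDiscounts(ratesPercent: number[]): number {
  const remaining = ratesPercent.reduce((acc, r) => acc * (1 - r / 100), 1);
  return (1 - remaining) * 100;
}
```

Combining 10% and 5% this way gives an effective rate of about 14.5%, not the 15% that simple addition would suggest.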

Limitations

This endpoint only supports images. For PDF support, use /api/ocr-structured-v4.
  • Image input only (no PDF support)
  • Simpler schema than v4 (no header discounts, TCS, etc.)
  • Less sophisticated reconciliation than v4
  • Better suited for basic invoices

Migration to V4

For India GST invoices with complex scenarios, migrate to /api/ocr-structured-v4:
  • PDF support with multi-page processing
  • Header-level discounts (trade, special)
  • TCS (Tax Collected at Source)
  • HSN-wise tax tables
  • Advanced reconciliation with multiple strategies
  • Confidence scoring per field

Next Steps

V4 Schema

Upgrade to advanced India GST schema

Reconciliation Logic

Learn how calculation verification works
