OCR Receipt Scanning

Overview

Upload photos of receipts, invoices, or tickets, and Sistema Financiero will automatically extract:

Total amount
Merchant/store name
Suggested category
List of items purchased
Transaction date

The OCR feature uses Google Gemini 2.5 Flash, a vision-capable model called through OpenRouter, to extract text from images.

The system validates that uploaded images are actually receipts before processing, preventing accidental uploads of unrelated photos.

How It Works

Upload image

Take or upload a photo of your receipt. The image is stored in Supabase Storage.

Image validation

The model checks whether the image is a valid receipt, invoice, or purchase document.

If valid: proceeds to data extraction
If invalid: returns a helpful error message

OCR extraction

Gemini 2.5 Flash analyzes the image and extracts:

Total amount (monto)
Store name (comercio)
Suggested category (categoria_sugerida)
Items purchased (items)
Date (fecha)

Structured response

Data is returned in both human-readable and JSON formats for easy integration.

Example Extractions

Gas Station Receipt

Image uploaded: Receipt from Pemex

AI Response:

📸 TICKET ANALIZADO:

💰 Monto: $450.50
🏪 Comercio: Pemex
📁 Categoría sugerida: Transporte
📋 Items: Magna Premium 30L, Total
📅 Fecha: 2025-10-06

📝 Descripción: Llenado de combustible en Pemex

Structured JSON:

{
  "es_ticket": true,
  "monto": 450.50,
  "comercio": "Pemex",
  "categoria_sugerida": "Transporte",
  "items": ["Magna Premium 30L", "Total"],
  "fecha": "2025-10-06",
  "descripcion": "Llenado de combustible en Pemex"
}
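The formatted text and the JSON above carry the same fields, so one can be derived from the other. The following is a minimal sketch of such a formatter, not the project's actual code: the `TicketFields` type and the `formatTicket` function are hypothetical names, and the layout is copied from the example response.

```typescript
// Hypothetical type: the fields shown in the example JSON for a valid receipt.
interface TicketFields {
  monto: number;
  comercio: string;
  categoria_sugerida: string;
  items: string[];
  fecha: string; // ISO date string, e.g. "2025-10-06"
  descripcion: string;
}

// Builds the human-readable summary in the same layout as the example response.
function formatTicket(t: TicketFields): string {
  return [
    "📸 TICKET ANALIZADO:",
    "",
    `💰 Monto: $${t.monto.toFixed(2)}`, // always two decimals
    `🏪 Comercio: ${t.comercio}`,
    `📁 Categoría sugerida: ${t.categoria_sugerida}`,
    `📋 Items: ${t.items.join(", ")}`,
    `📅 Fecha: ${t.fecha}`,
    "",
    `📝 Descripción: ${t.descripcion}`,
  ].join("\n");
}
```

Keeping the formatter separate from the extraction step means the JSON stays the single source of truth and the text can be regenerated at any time.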

Grocery Store Receipt

Image uploaded: Receipt from Walmart

AI Response:

📸 TICKET ANALIZADO:

💰 Monto: $350.00
🏪 Comercio: Walmart
📁 Categoría sugerida: Alimentación
📋 Items: Leche, Pan, Huevos, Verduras
📅 Fecha: 2025-10-09

📝 Descripción: Compra de despensa en Walmart

Restaurant Receipt

Image uploaded: Receipt from a restaurant

AI Response:

📸 TICKET ANALIZADO:

💰 Monto: $280.00
🏪 Comercio: Restaurante La Casa
📁 Categoría sugerida: Alimentación
📋 Items: 2x Tacos, 1x Refresco, Propina
📅 Fecha: 2025-10-09

📝 Descripción: Comida en restaurante

Invalid Image Detection

If you upload an image that’s not a receipt (screenshot, random photo, etc.):

⚠️ IMAGEN NO RECONOCIDA COMO TICKET

Esta es una captura de pantalla de una conversación de texto, 
no un ticket o factura.

💡 Sugerencia: Sube una foto de un ticket, factura o recibo de 
compra para que pueda analizarlo.

Si quieres registrar algo manualmente, dime:
- ¿Es gasto o ingreso?
- Monto
- Comercio/Proveedor
- Categoría

The extraction prompt instructs the model to flag non-receipt images such as chat screenshots, random photos, or unrelated documents, which helps prevent false extractions.

Category Mapping

The AI automatically suggests the most appropriate category from your system’s valid categories:

Expense Categories

Alimentación (Food/Groceries)
Transporte (Transportation/Fuel)
Vivienda (Housing/Rent/Utilities)
Salud (Health/Medical)
Entretenimiento (Entertainment)
Educación (Education)
Otros Gastos (Other Expenses)

Income Categories

Salario (Salary)
Ventas (Sales)
Servicios (Services)
Inversiones (Investments)
Otros Ingresos (Other Income)
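Since the category comes from a language model, a consumer may want to confirm that the suggestion is actually one of the valid names before saving it. The sketch below shows one way to do that for expense categories. It is an illustration under assumptions: `EXPENSE_CATEGORIES`, `normalizeLabel`, and `toExpenseCategory` are hypothetical names, and falling back to "Otros Gastos" mirrors the fallback used in the JSON parsing section rather than a documented rule.

```typescript
// The expense categories listed above, as a readonly tuple.
const EXPENSE_CATEGORIES = [
  "Alimentación",
  "Transporte",
  "Vivienda",
  "Salud",
  "Entretenimiento",
  "Educación",
  "Otros Gastos",
] as const;

type ExpenseCategory = (typeof EXPENSE_CATEGORIES)[number];

// Compare without accents or case, so "alimentacion" still matches "Alimentación".
function normalizeLabel(text: string): string {
  return text
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accent marks
    .trim()
    .toLowerCase();
}

// Returns the matching valid category, or "Otros Gastos" when nothing matches.
function toExpenseCategory(suggested: string | undefined): ExpenseCategory {
  if (!suggested) return "Otros Gastos";
  const wanted = normalizeLabel(suggested);
  return EXPENSE_CATEGORIES.find((c) => normalizeLabel(c) === wanted) ?? "Otros Gastos";
}
```

For example, `toExpenseCategory("TRANSPORTE ")` resolves to `"Transporte"`, while an unknown label such as `"Mascotas"` resolves to `"Otros Gastos"`.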

Technical Implementation

API Endpoint

POST /api/upload-image

Request: multipart/form-data

{
  image: File // Image file from form upload
}
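To make the request shape concrete, here is a minimal client-side sketch. It assumes a runtime with the standard `fetch`, `FormData`, and `Blob` globals (a browser or Node 18+). `buildUploadRequest` and `uploadReceipt` are hypothetical helper names; only the endpoint path and the `image` field name come from this page.

```typescript
// Builds the multipart request for the endpoint described above.
function buildUploadRequest(
  image: Blob,
  fileName: string
): { url: string; init: RequestInit } {
  const form = new FormData();
  form.append("image", image, fileName); // field name "image", as in the request shape
  // Content-Type is left unset on purpose: fetch adds the multipart boundary itself.
  return { url: "/api/upload-image", init: { method: "POST", body: form } };
}

// Sends the image and returns the parsed JSON body. The fetch implementation is
// injectable so the function can be exercised without a network.
async function uploadReceipt(
  image: Blob,
  fileName: string,
  fetchImpl: typeof fetch = fetch
): Promise<unknown> {
  const { url, init } = buildUploadRequest(image, fileName);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`Upload failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

In a browser, the `image` argument would typically be the `File` taken from an `<input type="file">` element, with `file.name` passed as `fileName`.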

Response:

{
  success: true,
  url: string,        // Public URL of uploaded image
  analysis: string,   // Formatted text response
  data: {             // Structured JSON data
    es_ticket: boolean,
    monto?: number,
    comercio?: string,
    categoria_sugerida?: string,
    items?: string[],
    fecha?: string,
    descripcion?: string,
    razon?: string,      // If not a ticket
    sugerencia?: string  // If not a ticket
  }
}
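Because most fields in `data` are optional, callers need to check `es_ticket` and the presence of `monto` before treating the result as a usable receipt. The following sketch shows one way to narrow the type. The `OcrData` interface restates the response shape above, while `isUsableReceipt` and `describeResult` are hypothetical helpers introduced only for illustration.

```typescript
// Restates the `data` object from the response shape above.
interface OcrData {
  es_ticket: boolean;
  monto?: number;
  comercio?: string;
  categoria_sugerida?: string;
  items?: string[];
  fecha?: string;
  descripcion?: string;
  razon?: string;
  sugerencia?: string;
}

// A receipt we can act on: flagged as a ticket and carrying a finite total.
type UsableReceipt = OcrData & { es_ticket: true; monto: number };

function isUsableReceipt(data: OcrData): data is UsableReceipt {
  return (
    data.es_ticket === true &&
    typeof data.monto === "number" &&
    Number.isFinite(data.monto)
  );
}

// Produces a short status line for each of the three possible outcomes.
function describeResult(data: OcrData): string {
  if (!data.es_ticket) {
    return data.razon ?? "Image was not recognized as a receipt.";
  }
  if (!isUsableReceipt(data)) {
    return "Receipt detected, but no total could be read.";
  }
  return `${data.comercio ?? "Unknown merchant"}: ${data.monto.toFixed(2)}`;
}
```

The middle branch covers the case where the model recognizes a receipt but cannot read the total, which the optional `monto` field allows.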

Storage Flow

Upload to Supabase Storage:

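const timestamp = Date.now()
// Prefix with a timestamp so repeated uploads of the same file do not collide
const fileName = `ticket_${timestamp}_${image.name}`

// Capture the error so it can be checked afterwards (see Upload Errors)
const { error: uploadError } = await supabase.storage
  .from('facturas')
  .upload(fileName, imageBytes, {
    contentType: image.type,
  })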

Get public URL:

const { data } = supabase.storage
  .from('facturas')
  .getPublicUrl(fileName)

See /app/api/upload-image/route.ts:17-37
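File names chosen by users can contain spaces, accents, or slashes, which may be awkward or rejected in object keys. As a hedged illustration, the same `ticket_<timestamp>_<name>` pattern could be built with a sanitizing step. `buildStorageKey` is a hypothetical helper, not part of the route shown above, and the set of allowed characters is an assumption.

```typescript
// Hypothetical helper: builds the `ticket_<timestamp>_<name>` key with the
// original file name reduced to letters, digits, dot, underscore, and hyphen.
function buildStorageKey(
  originalName: string,
  timestamp: number = Date.now()
): string {
  const safeName = originalName
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // drop accents: "café" -> "cafe"
    .replace(/[^A-Za-z0-9._-]+/g, "_") // spaces, slashes, etc. become "_"
    .replace(/^_+|_+$/g, ""); // trim leading and trailing underscores
  // Fall back to a generic name when nothing usable is left.
  return `ticket_${timestamp}_${safeName || "image"}`;
}
```

Passing the timestamp as a parameter keeps the function deterministic, which makes it easy to test.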

Vision API Integration

The system sends the image to Gemini 2.5 Flash with a detailed extraction prompt:

const visionResponse = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
    'Content-Type': 'application/json',
    'HTTP-Referer': process.env.NEXT_PUBLIC_SITE_URL,
    'X-Title': 'Sistema Financiero - OCR',
  },
  body: JSON.stringify({
    model: 'google/gemini-2.5-flash',
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: '...' // Detailed extraction prompt
          },
          {
            type: 'image_url',
            image_url: { url: imageUrl }
          }
        ]
      }
    ],
    max_tokens: 400,
    temperature: 0.1,
    response_format: { type: 'json_object' }
  })
})

See /app/api/upload-image/route.ts:40-129
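Once the request succeeds, the model's answer has to be pulled out of the completion body. The sketch below assumes the OpenAI-compatible response shape, in which the text lives at `choices[0].message.content`. The `ChatCompletion` interface lists only the fields read here, and `extractAnalysisText` is a hypothetical helper name.

```typescript
// Subset of the OpenAI-compatible chat completion shape: only the fields read here.
interface ChatCompletion {
  choices?: Array<{ message?: { content?: string | null } }>;
}

// Returns the trimmed message text, or throws when the response carries none.
function extractAnalysisText(completion: ChatCompletion): string {
  const content = completion.choices?.[0]?.message?.content;
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("Vision API returned no message content");
  }
  return content.trim();
}
```

Failing loudly on an empty response keeps a missing answer from being mistaken for an unreadable receipt later in the pipeline.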

OCR Prompt Engineering

The system uses a carefully crafted prompt to ensure accurate extraction:

Analiza esta imagen y determina si es un ticket/factura válido.

PASO 1: VALIDAR SI ES UN TICKET
- ¿La imagen muestra un ticket, factura, recibo o comprobante de compra?
- ¿Tiene información de comercio, monto, items comprados?
- Si es screenshot de chat, foto aleatoria, o documento que NO sea ticket
  → marca "es_ticket": false

INSTRUCCIONES SI ES TICKET:
1. Extrae el MONTO TOTAL (solo número, sin símbolos)
2. Identifica el COMERCIO/ESTABLECIMIENTO
3. Sugiere UNA categoría de la lista válida (la más apropiada)
4. Lista los items principales si son visibles
5. Extrae fecha si está visible

RESPONDE SOLO CON JSON (sin markdown, sin explicaciones)

Full prompt in /app/api/upload-image/route.ts:56-114

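The response_format: { type: 'json_object' } parameter requests JSON-only output from the model, which makes parsing more reliable. Because malformed output is still possible, the route also includes a parsing fallback (see JSON Parsing Fallback).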

Integration with Chat

The OCR feature integrates with the AI Chat Assistant:

// In chat API
if (images.length > 0) {
  openRouterMessages[openRouterMessages.length - 1].content += 
    `\n\n[El usuario subió ${images.length} imagen(es). 
    Analiza las imágenes para extraer información del ticket.]`
}

See /app/api/chat/route.ts:69-74
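The snippet above mutates the last message in place and assumes that at least one message exists. A non-mutating variant with a guard for the empty case might look like the sketch below. `ChatMessage` and `withImageNote` are hypothetical names, and the note text is the one from the snippet, collapsed onto a single line.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Returns a new array whose last message carries the image note.
// The input is returned unchanged when there are no images or no messages.
function withImageNote(messages: ChatMessage[], imageCount: number): ChatMessage[] {
  if (imageCount <= 0 || messages.length === 0) return messages;
  const last = messages[messages.length - 1];
  const note =
    `\n\n[El usuario subió ${imageCount} imagen(es). ` +
    "Analiza las imágenes para extraer información del ticket.]";
  return [...messages.slice(0, -1), { ...last, content: last.content + note }];
}
```

Returning a copy leaves the original conversation history untouched, which is convenient if the same messages are also stored or logged.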

You can combine natural language with receipt images: upload a photo and say “registra este gasto” for seamless entry.

Error Handling

Upload Errors

if (uploadError) {
  throw new Error(`Upload error: ${uploadError.message}`)
}

Vision API Errors

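if (!visionResponse.ok) {
  // Read the response body so the log shows the provider's error details
  const errorText = await visionResponse.text()
  console.error('Vision API error:', errorText)
  throw new Error(`Vision API error: ${visionResponse.statusText}`)
}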

JSON Parsing Fallback

try {
  ocrData = JSON.parse(analysisText)
} catch (parseError) {
  // Fallback to plain text
  ocrData = {
    monto: null,
    comercio: 'Desconocido',
    categoria_sugerida: 'Otros Gastos',
    items: [],
    descripcion: analysisText,
  }
}

See /app/api/upload-image/route.ts:145-158
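Models occasionally wrap their JSON in a Markdown code fence even when told not to, and that alone is enough to make `JSON.parse` fail. As a sketch of how the fallback could be made more tolerant, the helper below strips a surrounding fence before parsing and returns the same fallback object as above when parsing still fails. `parseOcrJson` is a hypothetical function, not the route's actual code.

```typescript
// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3);
// Matches an opening fence (optionally tagged "json") or a closing fence.
const FENCE_PATTERN = new RegExp(`^${FENCE}(?:json)?\\s*|\\s*${FENCE}$`, "gi");

// Parses the model output, tolerating a surrounding Markdown code fence.
function parseOcrJson(analysisText: string): Record<string, unknown> {
  const cleaned = analysisText.trim().replace(FENCE_PATTERN, "");
  try {
    const parsed: unknown = JSON.parse(cleaned);
    // Accept only plain objects; arrays and primitives go to the fallback.
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Not valid JSON: use the fallback below.
  }
  // Same fallback values as the snippet above, keeping the raw text as description.
  return {
    monto: null,
    comercio: "Desconocido",
    categoria_sugerida: "Otros Gastos",
    items: [],
    descripcion: analysisText,
  };
}
```

Note that the fallback object has no `es_ticket` field, so callers that branch on that flag should treat a missing value as "unknown" rather than as a rejected image.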

Storage Bucket Configuration

Images are stored in the facturas bucket in Supabase Storage.

Ensure the facturas bucket exists and has proper public access policies configured in your Supabase project.

Model Selection

The system uses Gemini 2.5 Flash for optimal performance:

Fast processing (typically < 2 seconds)
Strong OCR capabilities
Multimodal support (text + images)
Cost-effective for high-volume usage

Best Practices

Good lighting

Take photos in well-lit conditions for better text recognition

Clear focus

Ensure the receipt is in focus and all text is readable

Full receipt

Capture the entire receipt including header and total

Flat surface

Place receipt on a flat surface to avoid distortion

Next Steps

AI Chat

Use natural language to register extracted transactions

Configuration

Configure your OpenRouter API key and model settings
