Sintesis can automatically extract structured data from uploaded PDF documents using an AI-backed OCR pipeline. Templates define which regions of a document contain which data, and the extraction engine uses those regions to prompt a vision model and map results back to tabla columns.
Architecture overview
User uploads PDF to obra folder
│
▼
PDF → rendered image
│
▼
Regions drawn on image
(annotated as base64 data URL)
│
▼
POST /api/ocr-playground
or /api/obras/[id]/tablas/import/ocr-multi
│
▼
generateObject() → gpt-4o-mini
(Vercel AI SDK, temperature 0.1)
│
▼
Structured JSON returned
per region
│
▼
Rows inserted into obra_tabla
OCR templates
Templates are stored in the ocr_templates table. Each template belongs to a tenant, defines a reference document, and contains an array of annotated regions.
Schema
CREATE TABLE ocr_templates (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
name TEXT NOT NULL,
description TEXT,
-- Reference document stored in Supabase Storage
template_bucket TEXT,
template_path TEXT,
template_file_name TEXT,
-- Rendered image dimensions for coordinate scaling
template_width INTEGER,
template_height INTEGER,
-- Extraction regions as a JSON array
regions JSONB NOT NULL DEFAULT '[]',
-- Column definitions derived from regions
columns JSONB NOT NULL DEFAULT '[]',
is_active BOOLEAN NOT NULL DEFAULT true,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
Template names must be unique per tenant among active templates. The uniqueness constraint uses a partial index so inactive (soft-deleted) templates can share names:
CREATE UNIQUE INDEX ocr_templates_name_unique
ON ocr_templates (tenant_id, name)
WHERE is_active = true;
Deleting a template is a soft delete: is_active is set to false. The record is preserved for historical reference in ocr_document_processing.
Region definition
Each entry in the regions array describes a rectangular bounding box on the reference document image:
type Region = {
id: string; // unique region identifier
x: number; // left edge, in rendered pixels
y: number; // top edge, in rendered pixels
width: number; // box width, in rendered pixels
height: number; // box height, in rendered pixels
label: string; // human-readable field name
description?: string;
color: string; // display colour for the UI overlay
type: "single" | "table";
pageNumber?: number; // 1-indexed page (omit for page 1)
tableColumns?: string[]; // column names for table regions
};
| Region type | Extraction result |
|---|---|
| single | One text value extracted from the bounding box |
| table | An array of row objects, one per visible row inside the box |
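The schema stores template_width and template_height "for coordinate scaling", and region coordinates are expressed in rendered pixels. A minimal sketch of how a region could be mapped onto an image rendered at a different size follows. The names `Box` and `scaleRegion` are hypothetical and not part of the documented API; the actual scaling code may differ.

```typescript
// Hypothetical helper: rescale a bounding box from template pixel space
// to the pixel space of a differently sized render of the same page.
type Box = { x: number; y: number; width: number; height: number };
type Size = { width: number; height: number };

function scaleRegion(box: Box, template: Size, target: Size): Box {
  if (template.width <= 0 || template.height <= 0) {
    throw new Error("Template dimensions must be positive");
  }
  const sx = target.width / template.width;   // horizontal scale factor
  const sy = target.height / template.height; // vertical scale factor
  return {
    x: box.x * sx,
    y: box.y * sy,
    width: box.width * sx,
    height: box.height * sy,
  };
}

// A template rendered at 1240x1754, displayed at half size.
const scaled = scaleRegion(
  { x: 80, y: 110, width: 250, height: 35 },
  { width: 1240, height: 1754 },
  { width: 620, height: 877 }
);
// scaled is { x: 40, y: 55, width: 125, height: 17.5 }
```

Scaling each axis independently keeps the box aligned even if the target render does not preserve the template's aspect ratio exactly.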
Column definitions
When a template is saved, the API derives a columns array from the regions automatically. Single-type regions produce one column with ocrScope: "parent". Table-type regions produce one column per tableColumns entry with ocrScope: "item":
type TemplateColumn = {
fieldKey: string; // snake_case key derived from label
label: string; // display label
dataType: string; // always "text" (extensible)
ocrScope?: "parent" | "item";
description?: string;
};
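The derivation rule above can be sketched as follows. This is an illustration under stated assumptions, not the API's actual code: `toFieldKey` and `deriveColumns` are hypothetical names, and the exact snake_case normalisation (accent stripping, lowercasing, collapsing non-alphanumeric runs) is assumed.

```typescript
type RegionLike = {
  label: string;
  type: "single" | "table";
  description?: string;
  tableColumns?: string[];
};

type TemplateColumn = {
  fieldKey: string;
  label: string;
  dataType: string;
  ocrScope?: "parent" | "item";
  description?: string;
};

// Assumed normalisation: strip accents, lowercase, join word runs with "_".
function toFieldKey(label: string): string {
  return label
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining diacritics
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

function deriveColumns(regions: RegionLike[]): TemplateColumn[] {
  const columns: TemplateColumn[] = [];
  for (const region of regions) {
    if (region.type === "single") {
      // One parent-scoped column per single-value region.
      const column: TemplateColumn = {
        fieldKey: toFieldKey(region.label),
        label: region.label,
        dataType: "text",
        ocrScope: "parent",
      };
      if (region.description !== undefined) column.description = region.description;
      columns.push(column);
    } else {
      // One item-scoped column per tableColumns entry.
      for (const columnName of region.tableColumns ?? []) {
        columns.push({
          fieldKey: toFieldKey(columnName),
          label: columnName,
          dataType: "text",
          ocrScope: "item",
        });
      }
    }
  }
  return columns;
}

const derived = deriveColumns([
  { label: "Número de contrato", type: "single" },
  { label: "Ítems de certificado", type: "table", tableColumns: ["Código", "Precio unitario"] },
]);
// derived field keys: numero_de_contrato, codigo, precio_unitario
```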
Document processing table
Every document that enters the OCR pipeline gets a record in ocr_document_processing:
CREATE TABLE ocr_document_processing (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tabla_id UUID NOT NULL REFERENCES obra_tablas(id) ON DELETE CASCADE,
obra_id UUID NOT NULL REFERENCES obras(id) ON DELETE CASCADE,
-- Source document
source_bucket TEXT NOT NULL,
source_path TEXT NOT NULL,
source_file_name TEXT NOT NULL,
-- Lifecycle
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'processing', 'completed', 'failed')),
error_message TEXT,
rows_extracted INTEGER DEFAULT 0,
-- Template used
template_id UUID REFERENCES ocr_templates(id) ON DELETE SET NULL,
-- Performance tracking
processed_at TIMESTAMPTZ,
processing_duration_ms INTEGER,
retry_count INTEGER NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
| Status | Meaning |
|---|---|
| pending | Document queued, not yet started |
| processing | Vision model call in progress |
| completed | Extraction succeeded; rows inserted into tabla |
| failed | Extraction failed; see error_message |
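The four statuses suggest a simple lifecycle. The sketch below encodes one plausible set of transitions as a lookup table. The transition map is an assumption: in particular, the failed → processing edge models a retry (the retry_count column hints at one), but the source does not say how retries are triggered. `ALLOWED_TRANSITIONS` and `canTransition` are hypothetical names.

```typescript
type OcrStatus = "pending" | "processing" | "completed" | "failed";

// Assumed transition map; not confirmed by the schema itself.
const ALLOWED_TRANSITIONS: Record<OcrStatus, OcrStatus[]> = {
  pending: ["processing"],
  processing: ["completed", "failed"],
  completed: [],          // terminal
  failed: ["processing"], // assumed retry path
};

function canTransition(from: OcrStatus, to: OcrStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A guard like this could run before each status update so that a completed record is never moved back to processing by accident.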
OCR Playground
The playground lets you test extraction against any annotated image without persisting results. It is available at POST /api/ocr-playground. The request body carries the annotated image and the regions to extract:
{
"annotatedImageDataUrl": "data:image/png;base64,...",
"regions": [
{
"id": "r1",
"x": 120,
"y": 45,
"width": 300,
"height": 40,
"label": "Número de contrato",
"color": "#f97316",
"type": "single"
},
{
"id": "r2",
"x": 80,
"y": 200,
"width": 500,
"height": 300,
"label": "Ítems de certificado",
"color": "#3b82f6",
"type": "table",
"tableColumns": ["Código", "Descripción", "Cantidad", "Precio unitario", "Total"]
}
]
}
annotatedImageDataUrl must be a base64 data URL that includes the bounding-box overlays already drawn on the image. The UI renders these overlays client-side before sending the request. The response contains one result per region:
{
"ok": true,
"results": [
{
"id": "r1",
"label": "Número de contrato",
"type": "single",
"text": "CONT-2024-00412",
"color": "#f97316"
},
{
"id": "r2",
"label": "Ítems de certificado",
"type": "table",
"rows": [
{ "Código": "01.01", "Descripción": "Hormigón H30", "Cantidad": "15.5", "Precio unitario": "12500", "Total": "193750" },
{ "Código": "01.02", "Descripción": "Armadura", "Cantidad": "420", "Precio unitario": "850", "Total": "357000" }
],
"color": "#3b82f6"
}
]
}
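On the client side, a response like the one above can be modelled as a discriminated union on type. The sketch below is illustrative only: the type names and `summarizeResults` are hypothetical, and the nullable text and cell values are an assumption based on the prompt asking the model to return null for empty or illegible cells.

```typescript
type SingleResult = {
  id: string;
  label: string;
  type: "single";
  text: string | null; // assumed nullable
  color: string;
};

type TableResult = {
  id: string;
  label: string;
  type: "table";
  rows: Array<Record<string, string | null>>; // assumed nullable cells
  color: string;
};

type PlaygroundResult = SingleResult | TableResult;

// Collect single values by label and count the rows found in table regions.
function summarizeResults(results: PlaygroundResult[]): {
  fields: Record<string, string | null>;
  rowCount: number;
} {
  const fields: Record<string, string | null> = {};
  let rowCount = 0;
  for (const result of results) {
    if (result.type === "single") {
      fields[result.label] = result.text;
    } else {
      rowCount += result.rows.length;
    }
  }
  return { fields, rowCount };
}

const summary = summarizeResults([
  { id: "r1", label: "Número de contrato", type: "single", text: "CONT-2024-00412", color: "#f97316" },
  {
    id: "r2",
    label: "Ítems de certificado",
    type: "table",
    rows: [{ "Código": "01.01" }, { "Código": "01.02" }],
    color: "#3b82f6",
  },
]);
// summary.fields["Número de contrato"] is "CONT-2024-00412"; summary.rowCount is 2
```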
AI model and prompting
Extraction uses GPT-4o mini via the Vercel AI SDK generateObject function:
const res = await generateObject({
model: openai("gpt-4o-mini"),
schema: extractionSchema, // Zod schema derived from regions
messages: [
{
role: "user",
content: [
{ type: "text", text: instructions },
{ type: "image", image: annotatedImageDataUrl },
],
},
],
temperature: 0.1,
});
The model receives numbered boxes drawn on the image. Single-value fields are labelled [N] and table regions are labelled [N]📊. The prompt instructs the model to:
- Extract text exactly as it appears, without interpretation.
- Return null for empty or illegible cells.
- For table regions, extract every visible row as a separate object.
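The instructions string passed to the model could be assembled from the regions roughly as follows. This is a sketch, not the production prompt: `buildInstructions` is a hypothetical name and the wording of each line is assumed. Only the [N] and [N]📊 labelling and the three rules come from the description above.

```typescript
type PromptRegion = {
  label: string;
  type: "single" | "table";
  tableColumns?: string[];
};

function buildInstructions(regions: PromptRegion[]): string {
  // Number regions from 1 so they match the boxes drawn on the image.
  const regionLines = regions.map((region, index) => {
    const n = index + 1;
    if (region.type === "table") {
      const columnList = (region.tableColumns ?? []).join(", ");
      return `[${n}]📊 ${region.label}: table with columns ${columnList}`;
    }
    return `[${n}] ${region.label}: single value`;
  });

  return [
    "Extract text exactly as it appears, without interpretation.",
    "Return null for empty or illegible cells.",
    "For table regions, extract every visible row as a separate object.",
    "",
    ...regionLines,
  ].join("\n");
}

const instructionsPreview = buildInstructions([
  { label: "Número de contrato", type: "single" },
  { label: "Ítems de certificado", type: "table", tableColumns: ["Código", "Total"] },
]);
```

Keeping the region numbering in the prompt identical to the numbering drawn on the annotated image is what lets the model tie each box to a field in the schema.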
Token cost estimation
// lib/ai-pricing.ts
export const AI_MODEL_COST_PER_1K_TOKENS: Record<string, number> = {
"gpt-4o-mini": 0.00015, // USD per 1K tokens
};
export function estimateUsdForTokens(
model: string | null,
tokens: number
): number | null {
const rate = AI_MODEL_COST_PER_1K_TOKENS[model ?? ""];
if (!rate) return null;
return (tokens / 1000) * rate;
}
GPT-4o mini is chosen over GPT-4o for cost efficiency. The generateObject schema constrains the response shape, which keeps the output token count low.
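As a worked example of the arithmetic, the sketch below applies the quoted rate to a batch of documents. The rate table is copied locally so the block is self-contained, and the token counts are hypothetical. Note that a single flat rate does not distinguish input from output tokens.

```typescript
// Local copy of the rate quoted above (USD per 1K tokens).
const RATE_PER_1K_TOKENS: Record<string, number> = {
  "gpt-4o-mini": 0.00015,
};

function estimateUsd(model: string, tokens: number): number | null {
  const rate = RATE_PER_1K_TOKENS[model];
  if (rate === undefined) return null; // unknown model: no estimate
  return (tokens / 1000) * rate;
}

// Hypothetical token usage for three processed documents.
const tokenCounts = [1800, 2400, 3100];
const totalTokens = tokenCounts.reduce((sum, t) => sum + t, 0); // 7300
const batchCostUsd = estimateUsd("gpt-4o-mini", totalTokens);
// (7300 / 1000) * 0.00015 ≈ 0.001095 USD for the batch
```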
Managing templates via the API
List active templates
Returns all active templates for the authenticated user’s tenant, ordered by name.
Response
{
"templates": [
{
"id": "<uuid>",
"name": "Certificado de avance",
"description": "Extrae número, período y tabla de ítems",
"template_file_name": "cert-modelo.pdf",
"regions": [...],
"columns": [...],
"is_active": true,
"created_at": "2025-09-12T14:00:00Z"
}
]
}
Create a template
POST /api/ocr-templates
Content-Type: application/json
{
"name": "Certificado de avance",
"description": "Extrae número, período y tabla de ítems",
"templateBucket": "ocr-templates",
"templatePath": "tenant-abc/cert-modelo.pdf",
"templateFileName": "cert-modelo.pdf",
"templateWidth": 1240,
"templateHeight": 1754,
"regions": [
{
"id": "r1",
"x": 80, "y": 110, "width": 250, "height": 35,
"label": "Número de certificado",
"color": "#f97316",
"type": "single",
"pageNumber": 1
}
]
}
Every region must include id, label, x, y, width, and height. Regions that fail validation are silently dropped. The request is rejected with 400 if no valid regions remain.
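The drop-then-reject behaviour can be sketched as a filter followed by an emptiness check. The specific checks (non-empty strings, finite numbers) and the names `isValidRegion` and `validateRegions` are assumptions for illustration. The source only states which fields are required.

```typescript
type ValidRegion = {
  id: string;
  label: string;
  x: number;
  y: number;
  width: number;
  height: number;
};

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && isFinite(value);
}

// Assumed rules: id and label are non-empty strings, geometry is finite numbers.
function isValidRegion(value: unknown): value is ValidRegion {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  const id = r["id"];
  const label = r["label"];
  return (
    typeof id === "string" && id.length > 0 &&
    typeof label === "string" && label.trim().length > 0 &&
    isFiniteNumber(r["x"]) && isFiniteNumber(r["y"]) &&
    isFiniteNumber(r["width"]) && isFiniteNumber(r["height"])
  );
}

type ValidationOutcome =
  | { status: 200; regions: ValidRegion[] }
  | { status: 400; error: string };

function validateRegions(input: unknown): ValidationOutcome {
  const candidates: unknown[] = Array.isArray(input) ? input : [];
  const regions = candidates.filter(isValidRegion); // invalid entries are dropped
  if (regions.length === 0) {
    return { status: 400, error: "No valid regions" };
  }
  return { status: 200, regions };
}
```

Because invalid regions are dropped silently, a client that cares about every region may want to compare the number of regions it sent with the number stored on the saved template.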
If a template with the same name already exists (and is active) for the tenant, the API returns 409 with code: "template_name_exists".
Delete (deactivate) a template
DELETE /api/ocr-templates
Content-Type: application/json
{ "id": "<template-uuid>" }
Sets is_active = false. The template is removed from all listing and assignment UI but its id is preserved in historical ocr_document_processing records.
Assigning templates to default tablas
Templates can be pre-assigned to obra default table configurations:
-- obra_default_tablas.ocr_template_id links to ocr_templates
ALTER TABLE obra_default_tablas
ADD COLUMN ocr_template_id UUID REFERENCES ocr_templates(id) ON DELETE SET NULL;
When a new obra is created from defaults, the associated template is carried over so document uploads against that tabla are automatically processed with the right template.
Row-level security
-- Tenant members can view and manage their own templates (SELECT policy shown)
CREATE POLICY "Users can view OCR templates for their tenant"
ON ocr_templates FOR SELECT
USING (
tenant_id IN (
SELECT tenant_id FROM memberships WHERE user_id = auth.uid()
)
);
Document processing records inherit access control through their parent obra_id, which is scoped to the tenant.