Overview
VIGIA automatically ingests cases from:
- Web Forms: Direct submission through the VIGIA web interface
- Email (IMAP/POP3/Gmail API): Automated polling and processing
- WhatsApp: Manual or API-based message intake
- OCR Processing: PDF/image document scanning and extraction
Web Form Intake
Direct Case Creation
Fill Required Fields
Complete the case intake form:
- Patient Information: DNI, initials, age, sex
- Product Information: Suspected product name and details
- Event Description: Adverse event narrative
- Reporter Contact: Email, phone, or WhatsApp
Generate Internal Code
The system automatically generates a unique internal code (e.g., LAB-2024-001) based on your configured prefix.
API Endpoint: GET /api/v1/icsr-caso-prueba/codigo-interno/next?prefix=LAB
Auto-filled Fields
When creating a case, VIGIA automatically:
- Assigns the current date as fecha_recepcion (reception date)
- Sets origen_notificacion to "web_form"
- Creates an audit log entry
- Initializes the follow-up thread
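The internal code and the auto-filled fields can be sketched as follows. This is a minimal illustration, not VIGIA's implementation: the function names are hypothetical, and the PREFIX-YEAR-SEQUENCE layout with three-digit padding is inferred only from the LAB-2024-001 example. The audit log entry and follow-up thread are not modeled.

```python
from datetime import date
from typing import Optional


def format_internal_code(prefix: str, year: int, sequence: int) -> str:
    """Build a code shaped like the documented example LAB-2024-001.

    Assumption: the layout is PREFIX-YEAR-SEQUENCE with 3-digit padding.
    """
    return f"{prefix}-{year}-{sequence:03d}"


def apply_intake_defaults(case: dict, today: Optional[date] = None) -> dict:
    """Return a copy of the case with the documented auto-filled fields set."""
    filled = dict(case)
    # fecha_recepcion defaults to the current date.
    filled["fecha_recepcion"] = (today or date.today()).isoformat()
    # Cases created through the web form are tagged with their origin.
    filled["origen_notificacion"] = "web_form"
    return filled
```

Returning a copy rather than mutating the input keeps the submitted form data available for comparison or logging.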
Email Intake
Configuration
Before using email intake, configure your mail server in Admin > Settings > Mail.
Email Processing Workflow
Email Polling
VIGIA polls your inbox every 60 seconds (configurable) for new messages.
Service: email_ingestion.py:ingest_emails()
Message Parsing
For each new email:
- Extracts sender information (name, email)
- Parses message body (text/html)
- Processes attachments (PDF, images, documents)
- Extracts Message-ID for threading
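The parsing steps above can be approximated with Python's standard email package. This is a sketch only: parse_incoming_email and the keys of the returned dictionary are hypothetical names, and the actual logic in email_ingestion.py may differ.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr


def parse_incoming_email(raw: bytes) -> dict:
    """Extract sender, body, attachment metadata, and Message-ID."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Sender information (display name and address).
    name, address = parseaddr(str(msg.get("From", "")))

    # Prefer the plain-text body, fall back to HTML.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    # Attachment metadata only; the payloads would go on to OCR.
    attachments = [
        {"filename": part.get_filename(), "content_type": part.get_content_type()}
        for part in msg.iter_attachments()
    ]

    # Message-ID is what allows replies to be threaded later.
    message_id = msg.get("Message-ID")
    return {
        "sender_name": name,
        "sender_email": address,
        "body": body,
        "attachments": attachments,
        "message_id": str(message_id).strip() if message_id is not None else None,
    }
```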
OCR Processing
PDF and image attachments are processed automatically. Text is extracted using Tesseract OCR with the Spanish and English language models.
AI-Powered Extraction
The combined email body + OCR text is sent to an LLM for structured extraction.
Service: ingesta_service.py:build_icsr_payload_from_text()
Extracted fields:
- Patient demographics (age, sex, weight)
- Product information (name, dose, route)
- Event description and dates
- Reporter details
- Severity and causality indicators
Case Creation
If extraction is successful, an ICSR is automatically created with:
- origen_notificacion = "email"
- reportante_contacto = sender email
- descripcion_evento = full narrative
- Attachments saved to storage/email_attachments/
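As an illustration of this mapping, the sketch below assembles the case fields from an already parsed email. The function name and the adjuntos key are hypothetical; only origen_notificacion, reportante_contacto, descripcion_evento, and the storage directory come from the description above.

```python
from pathlib import PurePosixPath
from typing import Iterable

ATTACHMENT_DIR = "storage/email_attachments"


def build_case_from_email(
    sender_email: str, narrative: str, attachment_names: Iterable[str]
) -> dict:
    """Map parsed email data onto the documented ICSR fields."""
    return {
        "origen_notificacion": "email",
        "reportante_contacto": sender_email,
        "descripcion_evento": narrative,
        # "adjuntos" is a hypothetical key. Only the final path component
        # is kept so a crafted filename cannot escape the storage directory.
        "adjuntos": [
            f"{ATTACHMENT_DIR}/{PurePosixPath(name).name}"
            for name in attachment_names
        ],
    }
```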
Gmail API Integration
For Gmail accounts, use the Gmail API connector for better reliability:
- Enable Gmail API in Google Cloud Console
- Download credentials JSON
- Configure in Admin > Ingest > Connectors > Gmail
- Set MAIL_PROVIDER=GMAIL in environment
Benefits:
- No IMAP/POP3 blocks
- Better rate limits
- Push notifications (optional)
- Thread preservation
WhatsApp Intake
WhatsApp messages can be processed manually or via API.
Manual Entry
API Integration (WhatsApp Business)
For automated WhatsApp intake:
- Set up WhatsApp Business API webhook
- Configure webhook URL: https://your-vigia.com/api/v1/ingest/whatsapp
- Handle incoming messages
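Handling incoming messages might look like the sketch below. It assumes the webhook receives notifications in the WhatsApp Cloud API layout (entry, changes, value, messages); how VIGIA's endpoint actually processes them is not documented here, and extract_text_messages and its output keys are hypothetical.

```python
from typing import List


def extract_text_messages(payload: dict) -> List[dict]:
    """Pull text messages out of a webhook notification payload."""
    messages = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            value = change.get("value", {})
            for message in value.get("messages", []):
                # Media messages (images, documents) would need a
                # separate download step and are skipped here.
                if message.get("type") != "text":
                    continue
                messages.append(
                    {
                        "reporter_contact": message.get("from"),
                        "text": message.get("text", {}).get("body", ""),
                        "message_id": message.get("id"),
                    }
                )
    return messages
```

Status-only notifications carry no messages list, so the function returns an empty list for them instead of raising.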
OCR Processing
Supported Document Types
- Adverse Event Forms: Pre-printed forms with patient/event data
- Medical Records: Hospital discharge summaries, lab results
- Regulatory Documents: Resolution documents, official reports
- Technical Sheets: Product information leaflets
OCR Workflow
Upload Document
Navigate to Cases > Upload Document.
Supported formats: PDF, PNG, JPG, TIFF (max 12MB)
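A client-side pre-check of these limits could be sketched as below. The names are hypothetical, the extension set contains only the formats listed above, and 12MB is assumed to mean 12 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Only the formats listed in the documentation; variants such as
# .jpeg or .tif are deliberately not assumed to be accepted.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".tiff"}
MAX_UPLOAD_BYTES = 12 * 1024 * 1024  # assumption: "12MB" means 12 MiB


def is_acceptable_upload(filename: str, size_bytes: int) -> bool:
    """Check extension and size before sending a document for OCR."""
    extension = Path(filename).suffix.lower()
    return extension in ALLOWED_EXTENSIONS and size_bytes <= MAX_UPLOAD_BYTES
```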
Text Extraction
Each page is processed with Tesseract OCR. Headers and footers are automatically removed using pattern detection.
Text Cleaning
The OCR output is cleaned:
- Dehyphenation of line breaks
- Removal of repeated headers/footers
- Unicode normalization
- Whitespace cleanup
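The cleaning steps can be illustrated with the standard library. This is a sketch under stated assumptions: the function names are hypothetical, and the header/footer step uses a simple frequency heuristic (lines repeated across pages) in place of VIGIA's pattern detection, whose rules are not described here.

```python
import re
import unicodedata
from collections import Counter
from typing import List


def drop_repeated_lines(pages: List[str], min_pages: int = 2) -> List[str]:
    """Remove lines that recur on several pages (likely headers/footers)."""
    counts: Counter = Counter()
    for page in pages:
        # Count each distinct line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    repeated = {line for line, n in counts.items() if n >= min_pages}
    return [
        "\n".join(line for line in page.splitlines() if line.strip() not in repeated)
        for page in pages
    ]


def clean_ocr_text(text: str) -> str:
    """Normalize Unicode, dehyphenate line breaks, and tidy whitespace."""
    text = unicodedata.normalize("NFC", text)      # compose accented characters
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)   # join words split across lines
    text = re.sub(r"[ \t]+", " ", text)            # collapse runs of spaces/tabs
    text = re.sub(r" ?\n ?", "\n", text)           # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)         # cap consecutive blank lines
    return text.strip()
```

The dehyphenation rule also joins genuine compounds that happen to break at a line end, so a dictionary check may be worth adding where that matters.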
Structure Extraction
For specific document types (e.g., resolution documents), VIGIA uses regex patterns to extract structured data.
Service: ocr_ingestor.py:ingest_resolucion()
Extracts:
- Resolution number and date
- Laboratory information
- Product registry details
- Approval status
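To show what regex-based extraction of the resolution number and date can look like, here is a small sketch. The patterns, the field names numero and fecha, and the sample layout are all hypothetical; the real patterns in ocr_ingestor.py are not shown in this guide and are likely more specific.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical patterns: a number introduced by "N°" and a DD/MM/YYYY date.
NUMBER_RE = re.compile(r"N[°º]\s*(\d[\w/-]*)")
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def extract_resolution_fields(text: str) -> dict:
    """Return the first resolution number and date found, or None for each."""
    number = NUMBER_RE.search(text)
    found_date = DATE_RE.search(text)

    parsed_date: Optional[str] = None
    if found_date:
        day, month, year = (int(group) for group in found_date.groups())
        try:
            parsed_date = date(year, month, day).isoformat()
        except ValueError:
            # OCR noise can produce impossible dates such as 45/13/2024.
            parsed_date = None

    return {
        "numero": number.group(1) if number else None,
        "fecha": parsed_date,
    }
```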
OCR Quality Tips
Improving OCR Accuracy
- Use 300 DPI minimum for scanned documents
- Ensure good lighting and contrast
- Avoid skewed or rotated images
- Use PDF format when possible (better quality)
- For handwritten text, manual entry is recommended
Multi-page Documents
VIGIA processes all pages automatically:
- Each page is OCR’d separately
- Text is concatenated with page markers
- Page images are stored for reference
- Navigate between pages in case detail view
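Concatenation with page markers can be pictured as follows. The marker format "[Page N]" and the function name are assumptions for illustration; VIGIA's actual marker is not specified here.

```python
from typing import List


def join_pages(page_texts: List[str]) -> str:
    """Concatenate per-page OCR text, prefixing each page with a marker."""
    return "\n\n".join(
        f"[Page {number}]\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    )
```

Keeping the markers in the combined text makes it possible to trace an extracted value back to the stored page image.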
Language Support
Current OCR models:
- Spanish (spa): Primary language
- English (eng): Technical/medical terms
- Combined: spa+eng for best results
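For reference, the combined language setting corresponds to Tesseract's -l option. The sketch below builds and runs the command-line call; it assumes the tesseract binary and the spa and eng language data are installed, and the function names are illustrative rather than part of VIGIA.

```python
import subprocess
from typing import List, Sequence


def tesseract_command(
    image_path: str, languages: Sequence[str] = ("spa", "eng")
) -> List[str]:
    """Build a Tesseract CLI call that prints recognized text to stdout."""
    # Multiple languages are joined with "+", e.g. "spa+eng".
    return ["tesseract", image_path, "stdout", "-l", "+".join(languages)]


def run_ocr(image_path: str) -> str:
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```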
Post-Intake Workflow
After a case is created through any channel:
- Automatic Notifications: Configured users receive email alerts
- Status Set: Initial status is “Pendiente” (Pending)
- Evaluation Queue: Case appears in evaluation dashboard
- Duplicate Detection: System checks for similar cases by DNI, product, and date range
- Product Assignment: If product is recognized, links to product catalog
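The duplicate check by DNI, product, and date range can be sketched as a simple comparison. The field names dni, producto, and fecha_evento and the 30-day default window are assumptions; the document does not state which date is compared or how wide the range is.

```python
from datetime import date, timedelta


def is_possible_duplicate(new_case: dict, existing: dict, window_days: int = 30) -> bool:
    """Flag two cases that share patient and product within a date window."""
    same_patient = new_case["dni"] == existing["dni"]
    # Compare product names ignoring case and surrounding whitespace.
    same_product = (
        new_case["producto"].strip().lower() == existing["producto"].strip().lower()
    )
    close_in_time = abs(new_case["fecha_evento"] - existing["fecha_evento"]) <= timedelta(
        days=window_days
    )
    return same_patient and same_product and close_in_time
```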
Case Validation
Before finalizing intake, VIGIA validates:
- Required fields: patient age/sex, product name, event description
- Date logic: event date ≤ reception date
- Contact format: valid email/phone/WhatsApp format
- Duplicate check: warns if similar case exists
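The first three checks can be expressed as a small validator. This is a sketch, not VIGIA's validation code: edad, sexo, producto, and fecha_evento are hypothetical field names, the email and phone patterns are deliberately loose, and the duplicate warning is left out.

```python
import re
from datetime import date
from typing import List

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?\d{7,15}$")  # also covers WhatsApp numbers


def validate_case(case: dict) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []

    # Required fields. An age of 0 is valid, so only None and "" count as missing.
    for field in ("edad", "sexo", "producto", "descripcion_evento"):
        if case.get(field) in (None, ""):
            errors.append(f"missing required field: {field}")

    # Date logic: the event cannot occur after the report was received.
    event, received = case.get("fecha_evento"), case.get("fecha_recepcion")
    if event and received and event > received:
        errors.append("event date is after reception date")

    # Contact format: accept an email address or a phone number.
    contact = str(case.get("reportante_contacto", ""))
    digits = re.sub(r"[\s()-]", "", contact)
    if not (EMAIL_RE.match(contact) or PHONE_RE.match(digits)):
        errors.append("contact is not a valid email or phone number")

    return errors
```

Collecting every error instead of stopping at the first lets the intake form show all problems in one pass.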
Troubleshooting
Email not being ingested
Check:
- Mail server credentials in Settings
- Inbox folder name (INBOX vs INBOX.VIGIA)
- Firewall allows IMAP/POP3 ports
- Logs: backend/logs/email_ingestion.log
Mail test endpoint: GET /api/v1/admin/mail/test
OCR extraction is poor
Common causes:
- Low resolution scan (< 150 DPI)
- Skewed or rotated image
- Poor contrast or lighting
- Handwritten text (not supported)
AI extraction missing fields
Reasons:
- Incomplete information in source text
- Ambiguous or unclear narrative
- Non-standard terminology