Overview
CoroNet uses a two-tier approach to license plate detection, combining OpenAI’s GPT-4o vision model with a Tesseract OCR fallback. This keeps detection working even when one system fails or returns no result.
Detection Architecture
The plate detection system operates in two stages:
- Primary Detection: GPT-4o vision model analyzes the uploaded image
- Fallback Detection: If GPT-4o fails, Tesseract OCR processes the image
This dual-layer approach achieves high accuracy across various image qualities, lighting conditions, and plate formats.
How It Works
Stage 1: GPT-4o Vision Analysis
When a vehicle image is uploaded, CoroNet first attempts detection using OpenAI’s GPT-4o model:
- Image is encoded as base64 for API transmission (app.py:47)
- Specific Spanish-language prompt instructs the model to extract only the plate text (app.py:58-60)
- Response is normalized: uppercased, cleaned of non-alphanumeric characters (except hyphens), and limited to 10 characters (app.py:74-75)
- Returns “NO_DETECTADA” on failure, triggering the fallback system (app.py:76)
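The steps above can be sketched as follows. This is a minimal illustration, not the actual app.py code: the function names (`normalize_plate`, `detect_plate_gpt4o`) and the exact Spanish prompt wording are assumptions; only the base64 encoding, the normalization rules, and the “NO_DETECTADA” sentinel come from the source.

```python
import base64

def normalize_plate(raw: str) -> str:
    """Uppercase, keep only alphanumerics and hyphens, cap at 10 characters."""
    text = "".join(c for c in raw.upper() if c.isalnum() or c == "-")[:10]
    return text if text else "NO_DETECTADA"

def detect_plate_gpt4o(image_path: str) -> str:
    """Illustrative sketch of the primary GPT-4o stage (names are assumptions)."""
    from openai import OpenAI  # deferred import; requires the openai package
    client = OpenAI()          # reads OPENAI_API_KEY from the environment
    # Encode the image as base64 for API transmission
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                # Spanish prompt asking the model to return only the plate text
                {"type": "text",
                 "text": "Devuelve únicamente el texto de la matrícula del vehículo."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    raw = response.choices[0].message.content or ""
    return normalize_plate(raw)
```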
Stage 2: Tesseract Fallback
If GPT-4o cannot detect a plate, the system automatically falls back to Tesseract OCR:
- Only activates when GPT-4o returns “NO_DETECTADA” (app.py:106)
- Uses PIL to load the image (app.py:107)
- Applies English language model for better alphanumeric recognition (app.py:108)
- Performs aggressive text cleaning: removes spaces, newlines, and normalizes to uppercase (app.py:109-110)
The fallback system ensures that even if the primary AI model fails (due to API issues, complex images, or edge cases), the system still attempts detection using traditional OCR.
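A sketch of this fallback stage, assuming Pillow and pytesseract are installed; the function names are illustrative, not taken from app.py:

```python
def clean_ocr_text(raw: str) -> str:
    """Aggressive cleaning: remove spaces and newlines, then uppercase."""
    text = raw.replace(" ", "").replace("\n", "").upper()
    return text if text else "NO_DETECTADA"

def detect_plate_tesseract(image_path: str) -> str:
    """Illustrative sketch of the Tesseract fallback (names are assumptions)."""
    from PIL import Image      # deferred imports; require Pillow and pytesseract
    import pytesseract
    img = Image.open(image_path)                        # load with PIL
    raw = pytesseract.image_to_string(img, lang="eng")  # English language model
    return clean_ocr_text(raw)
```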
Text Normalization Pipeline
Both detection methods apply consistent text normalization:
Normalization Steps
- Case Conversion: All text is converted to uppercase for consistency
- Whitespace Removal: Spaces and newlines are stripped
- Character Filtering: Only alphanumeric characters and hyphens are retained
- Length Limiting: Results are truncated to 10 characters maximum
- Empty Check: Returns “NO_DETECTADA” if no valid characters remain
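The five steps above can be combined into a single helper. A minimal sketch; the function name `normalize_plate` is an assumption, not the identifier used in app.py:

```python
import re

def normalize_plate(raw: str) -> str:
    """Apply the five normalization steps to raw detector output."""
    text = raw.upper()                          # 1. case conversion
    text = re.sub(r"\s+", "", text)             # 2. whitespace removal
    text = re.sub(r"[^A-Z0-9-]", "", text)      # 3. keep alphanumerics and hyphens
    text = text[:10]                            # 4. truncate to 10 characters
    return text if text else "NO_DETECTADA"     # 5. empty check
```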
Integration with Registration Flow
The detection process is integrated directly into the registration workflow.
Configuration Requirements
- Environment Variables
- Dependencies
The OpenAI API key must be configured in your .env file. The application loads this at startup (app.py:14-15).
Performance Characteristics
- GPT-4o Detection: Typically 1-3 seconds depending on image size and API latency
- Tesseract Fallback: Usually sub-second processing time
- Image Format: Saves as JPEG with timestamp-based naming (app.py:98)
- Storage: Images stored in the uploads/ directory for later reference (app.py:19)
Error Handling
The system gracefully handles various failure scenarios:
- API Failures: Catches OpenAI API exceptions and triggers the Tesseract fallback
- Missing Images: Validates file upload before processing (app.py:94-96)
- Unreadable Plates: Returns “NO_DETECTADA” when both systems fail
- Network Issues: Falls back to local Tesseract processing
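The failure handling above amounts to a small orchestration wrapper. A sketch with the two detectors passed in as callables so the control flow is visible on its own; the function name `detect_plate` is an assumption:

```python
from typing import Callable

def detect_plate(image_path: str,
                 primary: Callable[[str], str],
                 fallback: Callable[[str], str]) -> str:
    """Run the primary detector; fall back on exception or 'NO_DETECTADA'."""
    try:
        result = primary(image_path)           # e.g. the GPT-4o stage
    except Exception:
        # API or network failure: treat as a miss and fall through
        result = "NO_DETECTADA"
    if result == "NO_DETECTADA":
        result = fallback(image_path)          # e.g. local Tesseract OCR
    return result
```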
The timestamp-based filename format (matricula_YYYYMMDD_HHMMSS.jpg) prevents naming conflicts and provides chronological organization.