
Overview

CoroNet employs a two-tier approach to license plate detection, combining OpenAI’s GPT-4o vision family (the code calls the gpt-4o-mini model) with a Tesseract OCR fallback system. This redundancy keeps detection working even when one system fails.

Detection Architecture

The plate detection system operates in two stages:
  1. Primary Detection: GPT-4o vision model analyzes the uploaded image
  2. Fallback Detection: If GPT-4o fails, Tesseract OCR processes the image
This dual-layer approach achieves high accuracy across various image qualities, lighting conditions, and plate formats.
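The two-stage flow above can be sketched as a small dispatcher. This is a minimal illustration rather than CoroNet’s actual code; `primary` and `fallback` stand in for the GPT-4o and Tesseract calls described below, and the stub detectors are hypothetical:

```python
NO_DETECTADA = "NO_DETECTADA"

def detect_plate(image_path, primary, fallback):
    """Try the primary detector first; run the fallback only when it fails."""
    plate = primary(image_path)
    if plate == NO_DETECTADA:
        plate = fallback(image_path)
    return plate

# Stub detectors simulating a GPT-4o miss and a Tesseract hit
gpt_stub = lambda path: NO_DETECTADA
tesseract_stub = lambda path: "1234-ABC"

print(detect_plate("car.jpg", gpt_stub, tesseract_stub))  # 1234-ABC
```

Because the fallback only runs on the sentinel value, a successful primary detection never pays the cost of a second OCR pass.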

How It Works

Stage 1: GPT-4o Vision Analysis

When a vehicle image is uploaded, CoroNet first attempts detection using OpenAI’s GPT-4o family (the code uses the gpt-4o-mini model):
import base64

# `client` is the OpenAI client created at startup (see Configuration Requirements)
def extract_plate_from_image(image_path):
    try:
        # Convert image to base64 for API transmission
        with open(image_path, "rb") as f:
            img_base64 = base64.b64encode(f.read()).decode("utf-8")

        # Send to GPT-4o with specific instructions
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            # Prompt (Spanish): "Extract only the text of the
                            # license plate visible in this image (vehicle plate).
                            # Return only the plate, with no extra text, comments,
                            # or symbols."
                            "text": (
                                "Extrae únicamente el texto de la matrícula visible "
                                "en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
                                "sin texto adicional, comentarios ni símbolos extras."
                            ),
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/jpeg;base64,{img_base64}"
                            },
                        },
                    ],
                }
            ],
        )

        # Extract and clean the detected text
        text = response.choices[0].message.content.strip().upper()
        clean = "".join([c for c in text if c.isalnum() or c == "-"])[:10]
        return clean or "NO_DETECTADA"

    except Exception as e:
        # Log the error and return the sentinel so the Tesseract fallback runs
        print("Error OCR con OpenAI:", e)
        return "NO_DETECTADA"
Key aspects of GPT-4o detection:
  • Image is encoded as base64 for API transmission (app.py:47)
  • Specific Spanish-language prompt instructs the model to extract only the plate text (app.py:58-60)
  • Response is normalized: uppercased, cleaned of non-alphanumeric characters (except hyphens), and limited to 10 characters (app.py:74-75)
  • Returns “NO_DETECTADA” on failure, triggering the fallback system (app.py:76)
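The base64 data-URL step can be exercised in isolation with the standard library, without an API key. A small sketch (the `to_data_url` helper is illustrative, not part of app.py):

```python
import base64

def to_data_url(jpeg_bytes):
    """Encode raw JPEG bytes in the data URL format used for the image_url field."""
    b64 = base64.b64encode(jpeg_bytes).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"

# JPEG magic bytes as a stand-in for a real image file
url = to_data_url(b"\xff\xd8\xff\xe0")
print(url)  # data:image/jpeg;base64,/9j/4A==
```

Real JPEG files always begin with these magic bytes, which is why base64-encoded JPEGs share the recognizable `/9j/` prefix.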

Stage 2: Tesseract Fallback

If GPT-4o cannot detect a plate, the system automatically falls back to Tesseract OCR:
# Fallback with pytesseract
if matricula == "NO_DETECTADA":
    image = Image.open(path)
    ocr_text = pytesseract.image_to_string(image, lang="eng")
    ocr_text = ocr_text.strip().replace(" ", "").replace("\n", "").upper()
    matricula = "".join([c for c in ocr_text if c.isalnum() or c == "-"])[:10]
Fallback characteristics:
  • Only activates when GPT-4o returns “NO_DETECTADA” (app.py:106)
  • Uses PIL to load the image (app.py:107)
  • Applies English language model for better alphanumeric recognition (app.py:108)
  • Performs aggressive text cleaning: removes spaces, newlines, and normalizes to uppercase (app.py:109-110)
The fallback system ensures that even if the primary AI model fails (due to API issues, complex images, or edge cases), the system still attempts detection using traditional OCR.

Text Normalization Pipeline

Both detection methods apply consistent text normalization:
  1. Case Conversion: All text is converted to uppercase for consistency
  2. Whitespace Removal: Spaces and newlines are stripped
  3. Character Filtering: Only alphanumeric characters and hyphens are retained
  4. Length Limiting: Results are truncated to 10 characters maximum
  5. Empty Check: Returns “NO_DETECTADA” if no valid characters remain
This ensures that regardless of which detection method succeeds, the output format is consistent and database-ready.
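The five steps can be collected into a single helper. This sketch mirrors the cleaning logic shown in the snippets above; the function name `normalize_plate` is illustrative and does not appear in app.py:

```python
def normalize_plate(raw):
    """Apply the plate-text normalization steps to raw detector output."""
    # Steps 1-2: uppercase and strip whitespace/newlines
    text = raw.strip().replace(" ", "").replace("\n", "").upper()
    # Step 3: keep only alphanumeric characters and hyphens
    text = "".join(c for c in text if c.isalnum() or c == "-")
    # Step 4: truncate to 10 characters
    text = text[:10]
    # Step 5: fall back to the sentinel when nothing survives
    return text or "NO_DETECTADA"

print(normalize_plate(" 1234 abc\n"))  # 1234ABC
print(normalize_plate("???"))          # NO_DETECTADA
```

Running noisy OCR output such as `" 1234 abc\n"` through the helper yields the same canonical form either detection path would produce.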

Integration with Registration Flow

The detection process is integrated directly into the registration workflow:
@app.route("/guardar", methods=["POST"])
def guardar():
    # ... file upload handling ...
    
    # Save uploaded image
    filename = f"matricula_{datetime.now().strftime('%Y%m%d_%H%M%S')}.jpg"
    path = os.path.join(UPLOADS_DIR, filename)
    file.save(path)

    # Attempt GPT-4o detection
    matricula = extract_plate_from_image(path)

    # Fallback to Tesseract if needed
    if matricula == "NO_DETECTADA":
        image = Image.open(path)
        ocr_text = pytesseract.image_to_string(image, lang="eng")
        ocr_text = ocr_text.strip().replace(" ", "").replace("\n", "").upper()
        matricula = "".join([c for c in ocr_text if c.isalnum() or c == "-"])[:10]

    # Store in CSV with detected plate
    rows.append({
        "id": new_id,
        "fecha_hora": fecha,
        "matricula": matricula,
        # ... other fields ...
    })

Configuration Requirements

The OpenAI API key must be configured in your .env file:
OPENAI_API_KEY=sk-your-key-here
The application loads this at startup (app.py:14-15):
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
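A missing key would otherwise surface only on the first API call, as a fallback to Tesseract. One way to fail fast at startup instead (an optional sketch, not part of app.py; the `require_api_key` helper is hypothetical):

```python
import os

def require_api_key(env_var="OPENAI_API_KEY"):
    """Raise a clear error at startup if the OpenAI key is not configured."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; add it to your .env file before starting CoroNet"
        )
    return key
```

Calling this right after `load_dotenv()` turns a silent misconfiguration into an immediate, descriptive error.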

Performance Characteristics

  • GPT-4o Detection: Typically 1-3 seconds depending on image size and API latency
  • Tesseract Fallback: Usually sub-second processing time
  • Image Format: Saves as JPEG with timestamp-based naming (app.py:98)
  • Storage: Images stored in uploads/ directory for later reference (app.py:19)

Error Handling

The system gracefully handles various failure scenarios:
  • API Failures: Catches OpenAI API exceptions and triggers Tesseract fallback
  • Missing Images: Validates file upload before processing (app.py:94-96)
  • Unreadable Plates: Returns “NO_DETECTADA” when both systems fail
  • Network Issues: Falls back to local Tesseract processing
The timestamp-based filename format (matricula_YYYYMMDD_HHMMSS.jpg) prevents naming conflicts and provides chronological organization.
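The naming scheme can be reproduced with `datetime.strftime`. A sketch of the pattern used when saving uploads (the `plate_image_filename` wrapper is illustrative):

```python
import re
from datetime import datetime

def plate_image_filename(now=None):
    """Build a matricula_YYYYMMDD_HHMMSS.jpg filename from a timestamp."""
    now = now or datetime.now()
    return f"matricula_{now.strftime('%Y%m%d_%H%M%S')}.jpg"

name = plate_image_filename(datetime(2024, 3, 1, 14, 30, 5))
print(name)  # matricula_20240301_143005.jpg
assert re.fullmatch(r"matricula_\d{8}_\d{6}\.jpg", name)
```

Second-level resolution is enough to avoid collisions here because each registration request saves exactly one image.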
