
Overview

CoroNet employs a two-tier approach to license plate detection, combining OpenAI’s GPT-4o vision family (the code calls the gpt-4o-mini model) with a Tesseract OCR fallback system. This redundancy keeps detection working even when one system fails.

Detection Architecture

The plate detection system operates in two stages:
  1. Primary Detection: GPT-4o vision model analyzes the uploaded image
  2. Fallback Detection: If GPT-4o fails, Tesseract OCR processes the image
This dual-layer approach achieves high accuracy across various image qualities, lighting conditions, and plate formats.
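The two-stage flow above can be sketched as a small dispatcher. This is a minimal illustration rather than CoroNet’s actual code; `primary` and `fallback` stand in for the GPT-4o and Tesseract calls described below, and the stub detectors are hypothetical:

```python
NO_DETECTADA = "NO_DETECTADA"

def detect_plate(image_path, primary, fallback):
    """Try the primary detector first; run the fallback only when it fails."""
    plate = primary(image_path)
    if plate == NO_DETECTADA:
        plate = fallback(image_path)
    return plate

# Stub detectors simulating a GPT-4o miss and a Tesseract hit
gpt_stub = lambda path: NO_DETECTADA
tesseract_stub = lambda path: "1234-ABC"

print(detect_plate("car.jpg", gpt_stub, tesseract_stub))  # 1234-ABC
```

Because the fallback only runs on the sentinel value, a successful primary detection never pays the cost of a second OCR pass.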

How It Works

Stage 1: GPT-4o Vision Analysis

When a vehicle image is uploaded, CoroNet first attempts detection using OpenAI’s GPT-4o family (the code uses the gpt-4o-mini model):
import base64

# `client` is the OpenAI client created at startup (see Configuration Requirements)
def extract_plate_from_image(image_path):
    try:
        # Convert image to base64 for API transmission
        with open(image_path, "rb") as f:
            img_base64 = base64.b64encode(f.read()).decode("utf-8")

        # Send to GPT-4o with specific instructions
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            # Prompt (Spanish): "Extract only the text of the
                            # license plate visible in this image (vehicle plate).
                            # Return only the plate, with no extra text, comments,
                            # or symbols."
                            "text": (
                                "Extrae únicamente el texto de la matrícula visible "
                                "en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
                                "sin texto adicional, comentarios ni símbolos extras."
                            ),
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/jpeg;base64,{img_base64}"
                            },
                        },
                    ],
                }
            ],
        )

        # Extract and clean the detected text
        text = response.choices[0].message.content.strip().upper()
        clean = "".join([c for c in text if c.isalnum() or c == "-"])[:10]
        return clean or "NO_DETECTADA"

    except Exception as e:
        # Log the error and return the sentinel so the Tesseract fallback runs
        print("Error OCR con OpenAI:", e)
        return "NO_DETECTADA"
Key aspects of GPT-4o detection:
  • Image is encoded as base64 for API transmission (app.py:47)
  • Specific Spanish-language prompt instructs the model to extract only the plate text (app.py:58-60)
  • Response is normalized: uppercased, cleaned of non-alphanumeric characters (except hyphens), and limited to 10 characters (app.py:74-75)
  • Returns “NO_DETECTADA” on failure, triggering the fallback system (app.py:76)
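The base64 data-URL step can be exercised in isolation with the standard library, without an API key. A small sketch (the `to_data_url` helper is illustrative, not part of app.py):

```python
import base64

def to_data_url(jpeg_bytes):
    """Encode raw JPEG bytes in the data URL format used for the image_url field."""
    b64 = base64.b64encode(jpeg_bytes).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"

# JPEG magic bytes as a stand-in for a real image file
url = to_data_url(b"\xff\xd8\xff\xe0")
print(url)  # data:image/jpeg;base64,/9j/4A==
```

Real JPEG files always begin with these magic bytes, which is why base64-encoded JPEGs share the recognizable `/9j/` prefix.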

Stage 2: Tesseract Fallback

If GPT-4o cannot detect a plate, the system automatically falls back to Tesseract OCR:
# Fallback with pytesseract
if matricula == "NO_DETECTADA":
    image = Image.open(path)
    ocr_text = pytesseract.image_to_string(image, lang="eng")
    ocr_text = ocr_text.strip().replace(" ", "").replace("\n", "").upper()
    matricula = "".join([c for c in ocr_text if c.isalnum() or c == "-"])[:10]
Fallback characteristics:
  • Only activates when GPT-4o returns “NO_DETECTADA” (app.py:106)
  • Uses PIL to load the image (app.py:107)
  • Applies English language model for better alphanumeric recognition (app.py:108)
  • Performs aggressive text cleaning: removes spaces, newlines, and normalizes to uppercase (app.py:109-110)
The fallback system ensures that even if the primary AI model fails (due to API issues, complex images, or edge cases), the system still attempts detection using traditional OCR.

Text Normalization Pipeline

Both detection methods apply consistent text normalization:
  1. Case Conversion: All text is converted to uppercase for consistency
  2. Whitespace Removal: Spaces and newlines are stripped
  3. Character Filtering: Only alphanumeric characters and hyphens are retained
  4. Length Limiting: Results are truncated to 10 characters maximum
  5. Empty Check: Returns “NO_DETECTADA” if no valid characters remain
This ensures that regardless of which detection method succeeds, the output format is consistent and database-ready.
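The five steps can be collected into a single helper. This sketch mirrors the cleaning logic shown in the snippets above; the function name `normalize_plate` is illustrative and does not appear in app.py:

```python
def normalize_plate(raw):
    """Apply the plate-text normalization steps to raw detector output."""
    # Steps 1-2: uppercase and strip whitespace/newlines
    text = raw.strip().replace(" ", "").replace("\n", "").upper()
    # Step 3: keep only alphanumeric characters and hyphens
    text = "".join(c for c in text if c.isalnum() or c == "-")
    # Step 4: truncate to 10 characters
    text = text[:10]
    # Step 5: fall back to the sentinel when nothing survives
    return text or "NO_DETECTADA"

print(normalize_plate(" 1234 abc\n"))  # 1234ABC
print(normalize_plate("???"))          # NO_DETECTADA
```

Running noisy OCR output such as `" 1234 abc\n"` through the helper yields the same canonical form either detection path would produce.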

Integration with Registration Flow

The detection process is integrated directly into the registration workflow:
@app.route("/guardar", methods=["POST"])
def guardar():
    # ... file upload handling ...
    
    # Save uploaded image
    filename = f"matricula_{datetime.now().strftime('%Y%m%d_%H%M%S')}.jpg"
    path = os.path.join(UPLOADS_DIR, filename)
    file.save(path)

    # Attempt GPT-4o detection
    matricula = extract_plate_from_image(path)

    # Fallback to Tesseract if needed
    if matricula == "NO_DETECTADA":
        image = Image.open(path)
        ocr_text = pytesseract.image_to_string(image, lang="eng")
        ocr_text = ocr_text.strip().replace(" ", "").replace("\n", "").upper()
        matricula = "".join([c for c in ocr_text if c.isalnum() or c == "-"])[:10]

    # Store in CSV with detected plate
    rows.append({
        "id": new_id,
        "fecha_hora": fecha,
        "matricula": matricula,
        # ... other fields ...
    })

Configuration Requirements

The OpenAI API key must be configured in your .env file:
OPENAI_API_KEY=sk-your-key-here
The application loads this at startup (app.py:14-15):
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
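A missing key would otherwise surface only on the first API call, as a fallback to Tesseract. One way to fail fast at startup instead (an optional sketch, not part of app.py; the `require_api_key` helper is hypothetical):

```python
import os

def require_api_key(env_var="OPENAI_API_KEY"):
    """Raise a clear error at startup if the OpenAI key is not configured."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; add it to your .env file before starting CoroNet"
        )
    return key
```

Calling this right after `load_dotenv()` turns a silent misconfiguration into an immediate, descriptive error.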

Performance Characteristics

  • GPT-4o Detection: Typically 1-3 seconds depending on image size and API latency
  • Tesseract Fallback: Usually sub-second processing time
  • Image Format: Saves as JPEG with timestamp-based naming (app.py:98)
  • Storage: Images stored in uploads/ directory for later reference (app.py:19)

Error Handling

The system gracefully handles various failure scenarios:
  • API Failures: Catches OpenAI API exceptions and triggers Tesseract fallback
  • Missing Images: Validates file upload before processing (app.py:94-96)
  • Unreadable Plates: Returns “NO_DETECTADA” when both systems fail
  • Network Issues: Falls back to local Tesseract processing
The timestamp-based filename format (matricula_YYYYMMDD_HHMMSS.jpg) prevents naming conflicts and provides chronological organization.
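The naming scheme can be reproduced with `datetime.strftime`. A sketch of the pattern used when saving uploads (the `plate_image_filename` wrapper is illustrative):

```python
import re
from datetime import datetime

def plate_image_filename(now=None):
    """Build a matricula_YYYYMMDD_HHMMSS.jpg filename from a timestamp."""
    now = now or datetime.now()
    return f"matricula_{now.strftime('%Y%m%d_%H%M%S')}.jpg"

name = plate_image_filename(datetime(2024, 3, 1, 14, 30, 5))
print(name)  # matricula_20240301_143005.jpg
assert re.fullmatch(r"matricula_\d{8}_\d{6}\.jpg", name)
```

Second-level resolution is enough to avoid collisions here because each registration request saves exactly one image.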
