
Overview

CoroNet uses OpenAI’s GPT-4o-mini vision model as the primary OCR engine for license plate extraction. This provides superior accuracy compared to traditional OCR methods, especially for images with varying angles, lighting conditions, or partial obstructions.
The OpenAI vision model is implemented in app.py:44-80 and uses base64 encoding to send images to the API for text extraction.

Why OpenAI GPT-4o-mini?

GPT-4o-mini offers several advantages for this application:
  • High accuracy: Advanced vision capabilities understand context and handle distortions
  • Robust recognition: Works with challenging lighting, angles, and image quality
  • Natural language processing: Can extract license plates from complex scenes
  • Fast processing: Mini variant balances speed and accuracy
  • Cost-effective: Lower pricing than full GPT-4o model

Getting Your API Key

Step 1: Create an OpenAI account

Visit OpenAI Platform and sign up for an account:
  1. Go to https://platform.openai.com/signup
  2. Sign up with email or Google/Microsoft account
  3. Verify your email address
  4. Complete account setup
Step 2: Add billing information

The OpenAI API requires a paid account. Free trial credits may be available for new accounts.
  1. Navigate to Billing
  2. Click “Add payment method”
  3. Enter your credit card information
  4. Set up billing limits (recommended: start with $10-20)
Step 3: Generate API key

  1. Go to API Keys
  2. Click “Create new secret key”
  3. Give your key a name (e.g., “CoroNet Development”)
  4. Important: Copy the key immediately - you won’t be able to see it again
  5. Store the key securely
Step 4: Add key to .env file

Add your API key to the .env file in your project root:
.env
OPENAI_API_KEY=sk-proj-AbCdEfGhIjKlMnOpQrStUvWxYz1234567890
Never commit your API key to version control. The .gitignore file excludes .env files by default.
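Once the key is in .env, a small sanity check at startup catches a missing or malformed key before the first API call. This is a sketch with a hypothetical helper name; it assumes the key has already been loaded into the environment (e.g. by python-dotenv, which the project already uses):

```python
import os

def get_openai_key() -> str:
    """Fetch the API key from the environment and fail fast if it looks wrong."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key.startswith("sk-"):
        # OpenAI keys start with "sk-"; anything else is almost certainly a misconfiguration
        raise RuntimeError("OPENAI_API_KEY missing or malformed; check your .env file")
    return key
```

Calling this once at application startup surfaces configuration mistakes immediately instead of as an AuthenticationError mid-request.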

Model Configuration

CoroNet uses the gpt-4o-mini model for vision tasks. This model is specified in app.py:50:
app.py
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Extrae únicamente el texto de la matrícula visible "
                        "en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
                        "sin texto adicional, comentarios ni símbolos extras."
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{img_base64}"
                    },
                },
            ],
        }
    ],
)

Model Options

You can modify the model based on your needs:
# Best balance of speed and cost; recommended for most use cases
model="gpt-4o-mini"

# Higher accuracy on difficult images, at a higher price
model="gpt-4o"
To change the model, edit app.py:50 and replace "gpt-4o-mini" with your preferred model.

How the Vision API Works

The implementation in CoroNet follows this workflow:
Step 1: Image encoding

The uploaded image is read and converted to base64:
with open(image_path, "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode("utf-8")
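These two lines can be wrapped into a small helper (hypothetical name `image_to_data_url`) that also builds the `data:` URL string the vision API expects:

```python
import base64

def image_to_data_url(image_path: str, mime: str = "image/jpeg") -> str:
    """Encode an image file as the data: URL the vision API accepts."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{b64}"
```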
Step 2: API request

The base64 image is sent to OpenAI with a prompt in Spanish:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extrae únicamente el texto de la matrícula..."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"}}
        ]
    }]
)
Step 3: Response processing

The API response is cleaned and normalized:
text = response.choices[0].message.content.strip().upper()
clean = "".join([c for c in text if c.isalnum() or c == "-"])[:10]
return clean or "NO_DETECTADA"
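The cleanup step can be isolated into a pure function (hypothetical name `normalize_plate`) that mirrors the same rules: uppercase, keep only alphanumerics and hyphens, cap at 10 characters, and fall back to the sentinel value:

```python
def normalize_plate(raw: str, max_len: int = 10) -> str:
    """Uppercase the model output, keep only letters, digits and hyphens, cap the length."""
    text = raw.strip().upper()
    clean = "".join(c for c in text if c.isalnum() or c == "-")[:max_len]
    return clean or "NO_DETECTADA"
```

Keeping this logic in a standalone function makes it easy to unit-test without hitting the API.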
Step 4: Fallback mechanism

If OpenAI returns “NO_DETECTADA”, Tesseract OCR is used as a fallback (app.py:106-110).
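The fallback chain can be sketched as a small dispatcher; `primary` and `fallback` stand in for the OpenAI and Tesseract calls (names hypothetical):

```python
def extract_plate(image_path, primary, fallback):
    """Run the primary OCR engine; if it reports no plate, try the fallback."""
    plate = primary(image_path)
    if plate == "NO_DETECTADA":
        plate = fallback(image_path)
    return plate
```

Passing the engines as callables keeps the dispatch logic testable and makes it trivial to add a third engine later.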

Customizing the Prompt

The current prompt is in Spanish and instructs the model to extract only the license plate text:
"Extrae únicamente el texto de la matrícula visible "
"en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
"sin texto adicional, comentarios ni símbolos extras."

English Prompt Alternative

You can modify the prompt to English:
app.py
{
    "type": "text",
    "text": (
        "Extract only the license plate text visible in this image. "
        "Return only the plate number without any additional text, "
        "comments, or extra symbols."
    ),
}

Enhanced Prompt for Better Accuracy

For improved results, try this enhanced prompt:
{
    "type": "text",
    "text": (
        "Analyze this image and extract the vehicle license plate number. "
        "Requirements:\n"
        "- Return ONLY the alphanumeric characters from the plate\n"
        "- Include hyphens if present\n"
        "- Remove any spaces\n"
        "- Convert to uppercase\n"
        "- If no plate is visible, respond with 'NO_DETECTADA'\n"
        "Example format: ABC1234 or ABC-1234"
    ),
}

API Usage and Costs

Pricing (as of 2026)

GPT-4o-mini pricing:
  • Input: $0.150 per 1M tokens
  • Output: $0.600 per 1M tokens
Vision requests with images cost more due to image tokens. A typical license plate image (1024x768) uses approximately 765 tokens.

Cost Estimation

For a typical license plate extraction:
  • Image tokens: ~765 tokens
  • Prompt tokens: ~50 tokens
  • Response tokens: ~10 tokens
  • Total per request: ~825 tokens
  • Cost per request: ~$0.001 (less than a penny)
Example monthly costs:
  • 100 scans/month: ~$0.10
  • 1,000 scans/month: ~$1.00
  • 10,000 scans/month: ~$10.00

Setting Usage Limits

Protect your account from unexpected charges:
Step 1: Set hard limits

  1. Go to Billing Settings
  2. Set “Hard limit” (e.g., $20/month)
  3. Set “Soft limit” for notifications (e.g., $15/month)
Step 2: Enable email notifications

Receive alerts when approaching limits:
  • 75% of soft limit reached
  • 90% of soft limit reached
  • Hard limit reached (API access disabled)
Step 3: Monitor usage

Check your usage regularly on the OpenAI usage dashboard (https://platform.openai.com/usage).

Error Handling

The implementation includes error handling in app.py:78-80:
except Exception as e:
    print(" Error OCR con OpenAI:", e)
    return "NO_DETECTADA"
Common errors and solutions:

Authentication Error

openai.AuthenticationError: Incorrect API key provided
Solution: Verify your API key in .env file:
cat .env | grep OPENAI_API_KEY

Rate Limit Error

openai.RateLimitError: Rate limit exceeded
Solution: Implement rate limiting or upgrade your plan:
import time
from openai import RateLimitError

try:
    response = client.chat.completions.create(...)
except RateLimitError:
    time.sleep(5)  # Wait and retry
    response = client.chat.completions.create(...)
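A fixed five-second wait with a single retry is fragile under sustained load; exponential backoff with jitter is the usual pattern. A generic sketch (the exception class is passed in so the helper stays library-agnostic; the name `with_retries` is hypothetical):

```python
import random
import time

def with_retries(call, retry_on, max_attempts=4, base_delay=1.0):
    """Call `call()`, retrying on `retry_on` with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # 1s, 2s, 4s, ... plus a little jitter to avoid synchronized retries
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```

Usage: `response = with_retries(lambda: client.chat.completions.create(...), RateLimitError)`.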

Insufficient Quota

openai.RateLimitError: You exceeded your current quota, please check your plan and billing details
The current openai Python SDK reports quota exhaustion as a RateLimitError with an insufficient_quota error code, not a separate exception class.
Solution: Add funds to your OpenAI account billing.

Testing the Configuration

Verify your OpenAI setup with this test script:
import os
import base64
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

def test_openai_config():
    # Check API key
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        print("✗ OPENAI_API_KEY not found in .env file")
        return
    
    print(f"✓ API key found: {api_key[:10]}...")
    
    # Initialize client
    try:
        client = OpenAI(api_key=api_key)
        print("✓ OpenAI client initialized")
    except Exception as e:
        print(f"✗ Error initializing client: {e}")
        return
    
    # Test simple completion
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Say 'API works'"}]
        )
        print(f"✓ API test successful: {response.choices[0].message.content}")
    except Exception as e:
        print(f"✗ API test failed: {e}")

if __name__ == "__main__":
    test_openai_config()
Expected output:
✓ API key found: sk-proj-Ab...
✓ OpenAI client initialized
✓ API test successful: API works

Best Practices

Security

  • Never commit API keys to git
  • Rotate keys periodically
  • Use separate keys for dev/prod
  • Monitor usage for anomalies

Cost Optimization

  • Set hard billing limits
  • Use gpt-4o-mini (not gpt-4o)
  • Implement request caching
  • Monitor API usage regularly

Error Handling

  • Catch API exceptions
  • Implement retry logic
  • Use Tesseract fallback
  • Log errors for debugging

Performance

  • Compress images before upload
  • Use async requests if possible
  • Cache common results
  • Implement request queuing

Advanced Configuration

Image Preprocessing

Improve accuracy by preprocessing images before sending to the API:
from PIL import Image, ImageEnhance
import base64
import io

def preprocess_image(image_path):
    # Open image
    img = Image.open(image_path)
    
    # Resize if too large (reduce tokens)
    max_size = (1024, 1024)
    img.thumbnail(max_size, Image.Resampling.LANCZOS)
    
    # Enhance contrast
    enhancer = ImageEnhance.Contrast(img)
    img = enhancer.enhance(1.5)
    
    # Convert to JPEG in memory
    buffer = io.BytesIO()
    img.save(buffer, format='JPEG', quality=85)
    buffer.seek(0)
    
    return base64.b64encode(buffer.read()).decode('utf-8')

Batch Processing

Process multiple images efficiently:
from concurrent.futures import ThreadPoolExecutor

def batch_extract_plates(image_paths, max_workers=5):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = executor.map(extract_plate_from_image, image_paths)
    return list(results)

# Usage
image_files = ['img1.jpg', 'img2.jpg', 'img3.jpg']
plates = batch_extract_plates(image_files)

Next Steps

Environment Variables

Complete guide to all configuration options

Tesseract Setup

Install fallback OCR engine
