
Overview

CoroNet uses OpenAI’s GPT-4o-mini vision model as the primary OCR engine for license plate extraction. This provides superior accuracy compared to traditional OCR methods, especially for images with varying angles, lighting conditions, or partial obstructions.
The OpenAI vision model is implemented in app.py:44-80 and uses base64 encoding to send images to the API for text extraction.

Why OpenAI GPT-4o-mini?

GPT-4o-mini offers several advantages for this application:
  • High accuracy: Advanced vision capabilities understand context and handle distortions
  • Robust recognition: Works with challenging lighting, angles, and image quality
  • Natural language processing: Can extract license plates from complex scenes
  • Fast processing: Mini variant balances speed and accuracy
  • Cost-effective: Lower pricing than full GPT-4o model

Getting Your API Key

Step 1: Create an OpenAI account

Visit OpenAI Platform and sign up for an account:
  1. Go to https://platform.openai.com/signup
  2. Sign up with email or Google/Microsoft account
  3. Verify your email address
  4. Complete account setup
Step 2: Add billing information

The OpenAI API requires a paid account. Free trial credits may be available for new accounts.
  1. Navigate to Billing
  2. Click “Add payment method”
  3. Enter your credit card information
  4. Set up billing limits (recommended: start with $10-20)
Step 3: Generate API key

  1. Go to API Keys
  2. Click “Create new secret key”
  3. Give your key a name (e.g., “CoroNet Development”)
  4. Important: Copy the key immediately - you won’t be able to see it again
  5. Store the key securely
Step 4: Add key to .env file

Add your API key to the .env file in your project root:
.env
OPENAI_API_KEY=sk-proj-AbCdEfGhIjKlMnOpQrStUvWxYz1234567890
Never commit your API key to version control. The .gitignore file excludes .env files by default.
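Once the key is in .env, a small sanity check at startup catches a missing or malformed key before the first API call. This is a sketch with a hypothetical helper name; it assumes the key has already been loaded into the environment (e.g. by python-dotenv, which the project already uses):

```python
import os

def get_openai_key() -> str:
    """Fetch the API key from the environment and fail fast if it looks wrong."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key.startswith("sk-"):
        # OpenAI keys start with "sk-"; anything else is almost certainly a misconfiguration
        raise RuntimeError("OPENAI_API_KEY missing or malformed; check your .env file")
    return key
```

Calling this once at application startup surfaces configuration mistakes immediately instead of as an AuthenticationError mid-request.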

Model Configuration

CoroNet uses the gpt-4o-mini model for vision tasks. This model is specified in app.py:50:
app.py
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Extrae únicamente el texto de la matrícula visible "
                        "en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
                        "sin texto adicional, comentarios ni símbolos extras."
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{img_base64}"
                    },
                },
            ],
        }
    ],
)

Model Options

You can modify the model based on your needs:
# Best balance of speed and cost; recommended for most use cases
model="gpt-4o-mini"

# Higher accuracy on difficult images, at a higher price
model="gpt-4o"
To change the model, edit app.py:50 and replace "gpt-4o-mini" with your preferred model.

How the Vision API Works

The implementation in CoroNet follows this workflow:
Step 1: Image encoding

The uploaded image is read and converted to base64:
with open(image_path, "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode("utf-8")
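These two lines can be wrapped into a small helper (hypothetical name `image_to_data_url`) that also builds the `data:` URL string the vision API expects:

```python
import base64

def image_to_data_url(image_path: str, mime: str = "image/jpeg") -> str:
    """Encode an image file as the data: URL the vision API accepts."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{b64}"
```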
Step 2: API request

The base64 image is sent to OpenAI with a prompt in Spanish:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extrae únicamente el texto de la matrícula..."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"}}
        ]
    }]
)
Step 3: Response processing

The API response is cleaned and normalized:
text = response.choices[0].message.content.strip().upper()
clean = "".join([c for c in text if c.isalnum() or c == "-"])[:10]
return clean or "NO_DETECTADA"
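The cleanup step can be isolated into a pure function (hypothetical name `normalize_plate`) that mirrors the same rules: uppercase, keep only alphanumerics and hyphens, cap at 10 characters, and fall back to the sentinel value:

```python
def normalize_plate(raw: str, max_len: int = 10) -> str:
    """Uppercase the model output, keep only letters, digits and hyphens, cap the length."""
    text = raw.strip().upper()
    clean = "".join(c for c in text if c.isalnum() or c == "-")[:max_len]
    return clean or "NO_DETECTADA"
```

Keeping this logic in a standalone function makes it easy to unit-test without hitting the API.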
Step 4: Fallback mechanism

If OpenAI returns “NO_DETECTADA”, Tesseract OCR is used as a fallback (app.py:106-110).
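The fallback chain can be sketched as a small dispatcher; `primary` and `fallback` stand in for the OpenAI and Tesseract calls (names hypothetical):

```python
def extract_plate(image_path, primary, fallback):
    """Run the primary OCR engine; if it reports no plate, try the fallback."""
    plate = primary(image_path)
    if plate == "NO_DETECTADA":
        plate = fallback(image_path)
    return plate
```

Passing the engines as callables keeps the dispatch logic testable and makes it trivial to add a third engine later.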

Customizing the Prompt

The current prompt is in Spanish and instructs the model to extract only the license plate text:
"Extrae únicamente el texto de la matrícula visible "
"en esta imagen (placa de vehículo). Devuelve solo la matrícula, "
"sin texto adicional, comentarios ni símbolos extras."

English Prompt Alternative

You can modify the prompt to English:
app.py
{
    "type": "text",
    "text": (
        "Extract only the license plate text visible in this image. "
        "Return only the plate number without any additional text, "
        "comments, or extra symbols."
    ),
}

Enhanced Prompt for Better Accuracy

For improved results, try this enhanced prompt:
{
    "type": "text",
    "text": (
        "Analyze this image and extract the vehicle license plate number. "
        "Requirements:\n"
        "- Return ONLY the alphanumeric characters from the plate\n"
        "- Include hyphens if present\n"
        "- Remove any spaces\n"
        "- Convert to uppercase\n"
        "- If no plate is visible, respond with 'NO_DETECTADA'\n"
        "Example format: ABC1234 or ABC-1234"
    ),
}

API Usage and Costs

Pricing (as of 2026)

GPT-4o-mini pricing:
  • Input: $0.150 per 1M tokens
  • Output: $0.600 per 1M tokens
Vision requests with images cost more due to image tokens. A typical license plate image (1024x768) uses approximately 765 tokens.

Cost Estimation

For a typical license plate extraction:
  • Image tokens: ~765 tokens
  • Prompt tokens: ~50 tokens
  • Response tokens: ~10 tokens
  • Total per request: ~825 tokens
  • Cost per request: ~$0.001 (less than a penny)
Example monthly costs:
  • 100 scans/month: ~$0.10
  • 1,000 scans/month: ~$1.00
  • 10,000 scans/month: ~$10.00

Setting Usage Limits

Protect your account from unexpected charges:
Step 1: Set hard limits

  1. Go to Billing Settings
  2. Set “Hard limit” (e.g., $20/month)
  3. Set “Soft limit” for notifications (e.g., $15/month)
Step 2: Enable email notifications

Receive alerts when approaching limits:
  • 75% of soft limit reached
  • 90% of soft limit reached
  • Hard limit reached (API access disabled)
Step 3: Monitor usage

Check your usage regularly on the OpenAI usage dashboard (https://platform.openai.com/usage).

Error Handling

The implementation includes error handling in app.py:78-80:
except Exception as e:
    print(" Error OCR con OpenAI:", e)
    return "NO_DETECTADA"
Common errors and solutions:

Authentication Error

openai.AuthenticationError: Incorrect API key provided
Solution: Verify your API key in .env file:
cat .env | grep OPENAI_API_KEY

Rate Limit Error

openai.RateLimitError: Rate limit exceeded
Solution: Implement rate limiting or upgrade your plan:
import time
from openai import RateLimitError

try:
    response = client.chat.completions.create(...)
except RateLimitError:
    time.sleep(5)  # Wait and retry
    response = client.chat.completions.create(...)
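A fixed five-second wait with a single retry is fragile under sustained load; exponential backoff with jitter is the usual pattern. A generic sketch (the exception class is passed in so the helper stays library-agnostic; the name `with_retries` is hypothetical):

```python
import random
import time

def with_retries(call, retry_on, max_attempts=4, base_delay=1.0):
    """Call `call()`, retrying on `retry_on` with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # 1s, 2s, 4s, ... plus a little jitter to avoid synchronized retries
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```

Usage: `response = with_retries(lambda: client.chat.completions.create(...), RateLimitError)`.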

Insufficient Quota

openai.RateLimitError: You exceeded your current quota, please check your plan and billing details
The current openai Python SDK reports quota exhaustion as a RateLimitError with an insufficient_quota error code, not a separate exception class.
Solution: Add funds to your OpenAI account billing.

Testing the Configuration

Verify your OpenAI setup with this test script:
import os
import base64
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

def test_openai_config():
    # Check API key
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        print("✗ OPENAI_API_KEY not found in .env file")
        return
    
    print(f"✓ API key found: {api_key[:10]}...")
    
    # Initialize client
    try:
        client = OpenAI(api_key=api_key)
        print("✓ OpenAI client initialized")
    except Exception as e:
        print(f"✗ Error initializing client: {e}")
        return
    
    # Test simple completion
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Say 'API works'"}]
        )
        print(f"✓ API test successful: {response.choices[0].message.content}")
    except Exception as e:
        print(f"✗ API test failed: {e}")

if __name__ == "__main__":
    test_openai_config()
Expected output:
✓ API key found: sk-proj-Ab...
✓ OpenAI client initialized
✓ API test successful: API works

Best Practices

Security

  • Never commit API keys to git
  • Rotate keys periodically
  • Use separate keys for dev/prod
  • Monitor usage for anomalies

Cost Optimization

  • Set hard billing limits
  • Use gpt-4o-mini (not gpt-4o)
  • Implement request caching
  • Monitor API usage regularly

Error Handling

  • Catch API exceptions
  • Implement retry logic
  • Use Tesseract fallback
  • Log errors for debugging

Performance

  • Compress images before upload
  • Use async requests if possible
  • Cache common results
  • Implement request queuing

Advanced Configuration

Image Preprocessing

Improve accuracy by preprocessing images before sending to the API:
from PIL import Image, ImageEnhance
import base64
import io

def preprocess_image(image_path):
    # Open image
    img = Image.open(image_path)
    
    # Resize if too large (reduce tokens)
    max_size = (1024, 1024)
    img.thumbnail(max_size, Image.Resampling.LANCZOS)
    
    # Enhance contrast
    enhancer = ImageEnhance.Contrast(img)
    img = enhancer.enhance(1.5)
    
    # Convert to JPEG in memory
    buffer = io.BytesIO()
    img.save(buffer, format='JPEG', quality=85)
    buffer.seek(0)
    
    return base64.b64encode(buffer.read()).decode('utf-8')

Batch Processing

Process multiple images efficiently:
from concurrent.futures import ThreadPoolExecutor

def batch_extract_plates(image_paths, max_workers=5):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = executor.map(extract_plate_from_image, image_paths)
    return list(results)

# Usage
image_files = ['img1.jpg', 'img2.jpg', 'img3.jpg']
plates = batch_extract_plates(image_files)

Next Steps

Environment Variables

Complete guide to all configuration options

Tesseract Setup

Install fallback OCR engine
