Overview

The LLM Gateway API uses standard HTTP status codes to indicate the success or failure of requests. Error responses include a detail field with a human-readable message explaining the issue.

Error Response Schema

All error responses follow this structure:
detail (string, required): A human-readable message describing the error.
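
A minimal sketch of reading this schema on the client side. The `ApiError` class and its `from_body` helper are hypothetical names, not part of the API, and falling back to the raw body when the response is not the expected JSON object is an assumption about what a client might want.

```python
import json
from dataclasses import dataclass


@dataclass
class ApiError:
    """Client-side view of an error response (hypothetical helper, not part of the API)."""
    status_code: int
    detail: str

    @classmethod
    def from_body(cls, status_code: int, body: str) -> "ApiError":
        # The schema promises a string "detail" field; fall back to the raw
        # body if the response is not the expected JSON object (assumption).
        try:
            data = json.loads(body)
        except json.JSONDecodeError:
            return cls(status_code, body or "Unknown error")
        if isinstance(data, dict) and isinstance(data.get("detail"), str):
            return cls(status_code, data["detail"])
        return cls(status_code, body or "Unknown error")


err = ApiError.from_body(401, '{"detail": "Invalid or missing API Key"}')
print(err.status_code, err.detail)
```

Keeping the status code next to the message lets calling code branch on the code while logging the message.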

HTTP Status Codes

401 Unauthorized

Returned when the API key is invalid, missing, or not authorized to access the resource.
{
  "detail": "Invalid or missing API Key"
}
Always include your API key in the X-API-Key header of your requests. Keep your API keys secure and never expose them in client-side code.
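
One way to keep the key out of source code is to read it from the environment when building the request headers. This is a sketch only: the `build_headers` function and the `LLM_GATEWAY_API_KEY` variable name are assumptions for illustration, not names defined by the API.

```python
import os


def build_headers(env_var: str = "LLM_GATEWAY_API_KEY") -> dict:
    """Build request headers from a key stored in the environment.

    The variable name is an assumption for illustration; use whatever
    your deployment defines.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail before sending a request that would only come back as 401
        raise RuntimeError(f"Set the {env_var} environment variable")
    return {"X-API-Key": api_key}
```

Calling `build_headers()` once at startup surfaces a missing key immediately rather than on the first request.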

429 Too Many Requests

Returned when you exceed the rate limit for your API key or IP address.
{
  "detail": "Too many requests. Please wait before trying again."
}
When you receive a 429 response, retry with exponential backoff: wait briefly before the first retry, then double the wait time after each subsequent failure.
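
The delay calculation can be sketched as a small helper. The cap and the random jitter are additions that this page does not describe; they are a common way to keep many clients from retrying in lockstep. The function name and the default `base` and `cap` values are illustrative assumptions.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    The delay doubles each attempt (base * 2**attempt), is capped at `cap`,
    and is then drawn uniformly from [0, capped delay] ("full jitter").
    """
    capped = min(cap, base * (2 ** attempt))
    return random.uniform(0, capped)


# Upper bounds for the first six attempts: 1, 2, 4, 8, 16, 30 seconds
for attempt in range(6):
    print(f"attempt {attempt}: wait up to {min(30.0, 2 ** attempt):.0f}s")
```

In a retry loop, `time.sleep(backoff_delay(attempt))` would take the place of a fixed `2 ** attempt` wait.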

500 Internal Server Error

Returned when an unexpected error occurs on the server side.
{
  "detail": "Internal server error occurred while processing your request"
}
If you consistently receive 500 errors, check the API status page or contact support. These errors are typically temporary and resolve without any action on your part.

Error Handling Best Practices

Retry Logic

Implement proper retry logic with exponential backoff:
import requests
import time
from typing import Optional

def chat_with_retry(
    messages: list,
    api_key: str,
    max_retries: int = 3
) -> Optional[dict]:
    """POST a chat request, retrying on 429, 500, and network errors."""
    url = "https://api.example.com/v1/chat"
    headers = {"X-API-Key": api_key}
    payload = {"messages": messages}

    for attempt in range(max_retries):
        wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
        try:
            # A timeout keeps a stalled connection from blocking forever
            response = requests.post(url, headers=headers, json=payload, timeout=30)

            if response.status_code == 200:
                return response.json()

            elif response.status_code == 401:
                # Retrying cannot fix a bad key, so fail fast
                print("Authentication failed. Check your API key.")
                return None

            elif response.status_code in (429, 500):
                reason = "Rate limited" if response.status_code == 429 else "Server error"
                print(f"{reason}. Retrying in {wait_time}s...")

            else:
                print(f"Unexpected error: {response.status_code}")
                return None

        except requests.RequestException as e:
            print(f"Request failed: {e}")

        # Back off before the next attempt, but not after the last one
        if attempt < max_retries - 1:
            time.sleep(wait_time)

    print("Max retries exceeded")
    return None

Error Response Validation

Always validate error responses before processing:
Python
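import logging
import requests

logger = logging.getLogger(__name__)

# Application-defined exception types (not provided by the API or by requests)
class AuthenticationError(Exception): pass
class RateLimitError(Exception): pass
class ServerError(Exception): pass

# url, headers, and payload are built as in the retry example above
response = requests.post(url, headers=headers, json=payload, timeout=30)

if not response.ok:
    # The body may not be a JSON object (for example, an error page from a proxy)
    try:
        error_message = response.json().get("detail", "Unknown error")
    except (ValueError, AttributeError):
        error_message = response.text or "Unknown error"

    # Log error for monitoring
    logger.error("API Error %s: %s", response.status_code, error_message)

    # Handle specific error types
    if response.status_code == 401:
        raise AuthenticationError(error_message)
    elif response.status_code == 429:
        raise RateLimitError(error_message)
    elif response.status_code >= 500:
        raise ServerError(error_message)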

Error Response From Source

Based on the source code in chat.py, here are the specific error conditions:

Authentication Check (Lines 28-32)

Python
if api_key not in valid_keys:
    raise HTTPException(
        status_code=401,
        detail="Invalid or missing API Key"
    )

Rate Limit Check (Lines 35-40)

Python
if not rate_limiter.allow(key):
    RATE_LIMIT_BLOCKED.inc()
    raise HTTPException(
        status_code=429, 
        detail="Too many requests. Please wait before trying again."
    )
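
The excerpt does not show how `rate_limiter.allow(key)` is implemented. As one plausible sketch, assuming a per-key token bucket (a common choice, not confirmed by the source), a limiter with the same `allow` interface could look like the following. The class name, capacity, refill rate, and injectable clock are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-key token bucket (illustrative; not the gateway's actual limiter)."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # key -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill for the elapsed time, never exceeding capacity
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1:
            self._buckets[key] = (tokens - 1, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

Under this model a client that has been blocked regains one request every `1 / refill_per_sec` seconds, which is why waiting before retrying a 429 is effective.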

Summary

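| Status Code | Error Type   | Retry? | Common Fix                                    |
|-------------|--------------|--------|-----------------------------------------------|
| 401         | Unauthorized | No     | Check API key in X-API-Key header             |
| 429         | Rate Limited | Yes    | Implement exponential backoff                 |
| 500         | Server Error | Yes    | Wait and retry, contact support if persistent |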
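
The retry column of this summary can be encoded as a small lookup, sketched below. The names `RETRYABLE_STATUS_CODES` and `should_retry` are hypothetical, and treating every status code not listed here as non-retryable is an assumption.

```python
# Retry policy taken from the summary: 429 and 500 are retryable, 401 is not.
RETRYABLE_STATUS_CODES = frozenset({429, 500})


def should_retry(status_code: int) -> bool:
    """Return True if a request that got this status is worth retrying."""
    # Codes not covered by the summary are treated as non-retryable (assumption)
    return status_code in RETRYABLE_STATUS_CODES


for code in (401, 429, 500):
    print(code, "retry" if should_retry(code) else "do not retry")
```

Keeping the policy in one place means the retry loop and any monitoring code agree on which failures are transient.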
