Overview
The LLM Gateway API uses standard HTTP status codes to indicate the success or failure of requests. Error responses include a detail field with a human-readable message explaining the issue.
Error Response Schema
All error responses follow this structure:
detail (string): A human-readable message describing the error.
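As a minimal sketch of reading this structure on the client side, the helper below pulls the detail message out of an already-decoded response body. The function name `extract_detail` and the fallback text are illustrative assumptions, not part of the gateway API.

```python
from typing import Any


def extract_detail(body: Any, default: str = "Unknown error") -> str:
    """Return the `detail` message from a decoded error body, or a default."""
    # The body may not be a JSON object (a proxy could return a list or plain text).
    if isinstance(body, dict):
        detail = body.get("detail")
        if isinstance(detail, str) and detail:
            return detail
    return default


print(extract_detail({"detail": "Invalid or missing API Key"}))
```

Falling back to a default keeps callers from crashing when an intermediary returns a body that does not match the schema.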
HTTP Status Codes
401 Unauthorized
Returned when the API key is invalid, missing, or not authorized to access the resource.
{
  "detail": "Invalid or missing API Key"
}
Common causes:
- Missing X-API-Key header in the request
- API key is invalid or has been revoked
- API key does not have permission for the requested resource
- Typo in the API key value
Always include your API key in the X-API-Key header of your requests. Keep your API keys secure and never expose them in client-side code.
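One way to keep the key out of source code is to read it from the environment when building request headers. This is a sketch only: the environment variable name `LLM_GATEWAY_API_KEY` and the helper `build_headers` are hypothetical names chosen for illustration.

```python
import os


def build_headers(env_var: str = "LLM_GATEWAY_API_KEY") -> dict:
    """Build request headers with the API key read from the environment.

    The variable name is an assumption; use whatever your deployment defines.
    """
    # Strip whitespace so a stray newline or space does not cause a 401.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"X-API-Key": api_key}


# Usage: headers = build_headers()
```

Failing early on a missing key surfaces the problem locally instead of as a 401 from the gateway.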
429 Too Many Requests
Returned when you exceed the rate limit for your API key or IP address.
{
  "detail": "Too many requests. Please wait before trying again."
}
Rate Limiting Details
The LLM Gateway implements a token bucket rate limiting algorithm:
- Capacity: Maximum number of requests in the bucket
- Refill Rate: Rate at which request tokens are added back
- Client Identification: Based on API key or IP address
Rate limits are enforced per client (API key or IP) and are backed by Redis for distributed rate limiting.
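To illustrate the token bucket idea described above, here is a minimal in-memory, single-process sketch. It is not the gateway's implementation, which is Redis-backed and not shown in this document. The class name, the `now` clock parameter, and the cost of one token per request are assumptions.

```python
import time


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled over time."""

    def __init__(self, capacity: float, refill_rate: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start with a full bucket
        self._now = now                 # injectable clock, useful for testing
        self.last_refill = now()

    def allow(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        current = self._now()
        elapsed = current - self.last_refill
        self.last_refill = current
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client, keyed by API key or IP address.
bucket = TokenBucket(capacity=5, refill_rate=1.0)
print(bucket.allow())
```

A client can send short bursts up to the bucket capacity, after which requests are admitted only as fast as tokens refill.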
When you receive a 429 error, implement exponential backoff in your retry logic. Wait a few seconds before retrying, and increase the wait time with each subsequent failure.
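A common refinement of exponential backoff is to cap the delay and add random jitter so that many clients do not retry at the same moment. The sketch below shows one such variant (often called "full jitter"). The function name and the `base` and `cap` defaults are illustrative assumptions, not values prescribed by the gateway.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random.random) -> float:
    """Return a delay in seconds drawn from [0, min(cap, base * 2**attempt))."""
    # The ceiling doubles with each attempt until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # rng() returns a float in [0.0, 1.0), which spreads retries over the window.
    return rng() * ceiling


for attempt in range(4):
    print(f"attempt {attempt}: sleep up to {min(30.0, 2 ** attempt)}s, "
          f"chose {backoff_delay(attempt):.2f}s")
```

Pass the returned value to `time.sleep` between attempts.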
500 Internal Server Error
Returned when an unexpected error occurs on the server side.
{
  "detail": "Internal server error occurred while processing your request"
}
Common causes:
- Downstream LLM provider is unavailable
- Database or Redis connection failure
- Unexpected exception in request processing
- Service configuration error
- Resource exhaustion (memory, connections)
If you consistently receive 500 errors, check the API status page or contact support. These errors are typically temporary and resolved automatically.
Error Handling Best Practices
Retry Logic
Implement proper retry logic with exponential backoff:
import time
from typing import Optional

import requests


def chat_with_retry(
    messages: list,
    api_key: str,
    max_retries: int = 3,
) -> Optional[dict]:
    url = "https://api.example.com/v1/chat"
    headers = {"X-API-Key": api_key}
    payload = {"messages": messages}
    for attempt in range(max_retries):
        wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
        try:
            # A timeout keeps a stalled connection from hanging the loop
            response = requests.post(url, headers=headers, json=payload, timeout=30)
            if response.status_code == 200:
                return response.json()
            elif response.status_code == 429:
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            elif response.status_code == 401:
                # Retrying cannot fix a bad key, so stop immediately
                print("Authentication failed. Check your API key.")
                return None
            elif response.status_code == 500:
                print(f"Server error. Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                print(f"Unexpected error: {response.status_code}")
                return None
        except requests.RequestException as e:
            # Network-level failure (DNS, connection reset, timeout)
            print(f"Request failed: {e}")
            time.sleep(wait_time)
    print("Max retries exceeded")
    return None
Error Response Validation
Always validate error responses before processing:
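import logging
import requests

logger = logging.getLogger(__name__)

# Placeholder application-defined exceptions; rename to fit your codebase
class AuthenticationError(Exception): pass
class RateLimitError(Exception): pass
class ServerError(Exception): pass

# url, headers, and payload are defined as in the retry example above
response = requests.post(url, headers=headers, json=payload, timeout=30)
if not response.ok:
    try:
        error_message = response.json().get("detail", "Unknown error")
    except (ValueError, AttributeError):
        # Body was not a JSON object (for example, a proxy error page)
        error_message = "Unknown error"
    # Log error for monitoring
    logger.error(f"API Error {response.status_code}: {error_message}")
    # Handle specific error types
    if response.status_code == 401:
        raise AuthenticationError(error_message)
    elif response.status_code == 429:
        raise RateLimitError(error_message)
    elif response.status_code >= 500:
        raise ServerError(error_message)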
Error Response From Source
Based on the source code in chat.py, here are the specific error conditions:
Authentication Check (Lines 28-32)
if api_key not in valid_keys:
    raise HTTPException(
        status_code=401,
        detail="Invalid or missing API Key"
    )
Rate Limit Check (Lines 35-40)
if not rate_limiter.allow(key):
    RATE_LIMIT_BLOCKED.inc()
    raise HTTPException(
        status_code=429,
        detail="Too many requests. Please wait before trying again."
    )
Summary
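| Status Code | Error Type   | Retry? | Common Fix                                     |
|-------------|--------------|--------|------------------------------------------------|
| 401         | Unauthorized | No     | Check API key in X-API-Key header              |
| 429         | Rate Limited | Yes    | Implement exponential backoff                  |
| 500         | Server Error | Yes    | Wait and retry; contact support if persistent  |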
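The retry guidance in this summary can be encoded as a small lookup so that client code makes the decision in one place. This is a sketch: the names `RETRYABLE` and `should_retry` are hypothetical, and treating unlisted status codes as non-retryable is an assumption rather than documented gateway behavior.

```python
# Retry guidance for the three documented status codes.
RETRYABLE = {
    401: False,  # Unauthorized: fix the API key instead of retrying
    429: True,   # Rate limited: retry with exponential backoff
    500: True,   # Server error: wait and retry
}


def should_retry(status_code: int) -> bool:
    """Return True if the documented guidance says the request may be retried."""
    # Assumption: anything not listed in the summary is not retried.
    return RETRYABLE.get(status_code, False)


for code in (401, 429, 500):
    print(code, should_retry(code))
```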