Overview
LLM Gateway Core’s extensible architecture makes it straightforward to add support for any LLM provider, whether you’re integrating:
- A commercial API (OpenAI, Anthropic, Cohere)
- A self-hosted model server
- An internal ML platform
- A custom inference engine
This guide walks you through the complete process.
Architecture Review
Every provider must:
- Inherit from the LLMProvider abstract base class
- Implement the async chat() method
- Define the name property
- Return a standardized ChatResponse
- Register with the Router
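The base class itself is not shown in this guide; a minimal sketch of what app/providers/base.py might look like, assuming only the contract listed above:

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Abstract base class every provider inherits from (sketch of the assumed contract)."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Unique identifier the Router uses to look this provider up."""
        ...

    @abstractmethod
    async def chat(self, request):
        """Take a ChatRequest and return a standardized ChatResponse."""
        ...
```

Because both members are abstract, forgetting to implement either one makes the subclass impossible to instantiate, which surfaces mistakes at startup rather than at request time.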
Step-by-Step Implementation
Create Provider Class
Create a new file in app/providers/ for your provider:

```python
from app.providers.base import LLMProvider
from app.api.v1.schemas import ChatRequest, ChatResponse, Usage
import uuid


class CustomProvider(LLMProvider):
    @property
    def name(self) -> str:
        return "custom"

    async def chat(self, request: ChatRequest) -> ChatResponse:
        # Implementation goes here
        pass
```
Implement chat() Method
Add your provider’s logic to handle chat requests:

```python
async def chat(self, request: ChatRequest) -> ChatResponse:
    # 1. Extract request parameters
    messages = request.messages
    model = request.model or "default-model"
    temperature = request.temperature
    max_tokens = request.max_tokens

    # 2. Call your provider's API
    # (implementation depends on your provider)

    # 3. Return standardized response
    return ChatResponse(
        id=str(uuid.uuid4()),
        provider=self.name,
        content="Generated response text",
        usage=Usage(
            prompt_tokens=10,
            completion_tokens=20,
            total_tokens=30,
        ),
    )
```
Add Configuration
Add any required settings to app/core/config.py:

```python
class Settings(BaseSettings):
    # Existing settings...

    # Your provider's configuration
    CUSTOM_API_KEY: str = ""
    CUSTOM_API_BASE_URL: str = "https://api.example.com"
    CUSTOM_DEFAULT_MODEL: str = "model-v1"
```
Register Provider
Add your provider to the router in app/core/router.py:

```python
from app.providers.custom import CustomProvider


class Router:
    def __init__(self):
        self.providers = {
            "gemini": GeminiProvider(),
            "ollama": OllamaProvider(),
            "custom": CustomProvider(),  # Add your provider
        }

    def route(self, request) -> List[LLMProvider]:
        target = request.model or request.model_hint
        # Add routing logic for your provider
        if target == "custom":
            return [self.providers["custom"]]
        # Existing routing logic...
```
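Note that route() returns a list of providers, which allows ordered fallback. A hypothetical caller (none of these classes are part of the gateway; they exist only to make the sketch runnable) might use that list like this:

```python
import asyncio


class FlakyProvider:
    """Stand-in provider that always fails."""
    name = "flaky"

    async def chat(self, request):
        raise RuntimeError("upstream unavailable")


class BackupProvider:
    """Stand-in provider that succeeds."""
    name = "backup"

    async def chat(self, request):
        return {"provider": self.name, "content": "ok"}


async def chat_with_fallback(providers, request):
    """Try each provider from route() in order; return the first success."""
    last_error = None
    for provider in providers:
        try:
            return await provider.chat(request)
        except Exception as exc:
            last_error = exc
    raise last_error


result = asyncio.run(chat_with_fallback([FlakyProvider(), BackupProvider()], {}))
print(result["provider"])  # backup
```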
Complete Examples
Example 1: OpenAI Provider
Here’s a complete implementation for OpenAI’s API:
```python
import httpx
from app.providers.base import LLMProvider
from app.api.v1.schemas import ChatRequest, ChatResponse, Usage
from app.core.config import settings
import uuid


class OpenAIProvider(LLMProvider):
    @property
    def name(self) -> str:
        return "openai"

    async def chat(self, request: ChatRequest) -> ChatResponse:
        """Execute chat completion via OpenAI API."""
        url = "https://api.openai.com/v1/chat/completions"

        # Convert messages to OpenAI format
        openai_messages = [
            {"role": msg.role, "content": msg.content}
            for msg in request.messages
        ]

        # Build request payload
        payload = {
            "model": request.model or "gpt-4",
            "messages": openai_messages,
            "temperature": request.temperature,
            "max_tokens": request.max_tokens,
            "stream": False,
        }

        # Make API request
        headers = {
            "Authorization": f"Bearer {settings.OPENAI_API_KEY}",
            "Content-Type": "application/json",
        }

        async with httpx.AsyncClient(timeout=settings.PROVIDER_TIMEOUT_SECONDS) as client:
            try:
                response = await client.post(url, json=payload, headers=headers)
                response.raise_for_status()
                data = response.json()

                # Extract response
                choice = data["choices"][0]
                usage = data["usage"]

                return ChatResponse(
                    id=data.get("id", str(uuid.uuid4())),
                    provider=self.name,
                    content=choice["message"]["content"],
                    usage=Usage(
                        prompt_tokens=usage["prompt_tokens"],
                        completion_tokens=usage["completion_tokens"],
                        total_tokens=usage["total_tokens"],
                    ),
                )
            except httpx.HTTPStatusError as e:
                print(f"[OpenAI HTTP Error] {e.response.status_code}: {e.response.text}")
                raise
            except Exception as e:
                print(f"[OpenAI Error] {e}")
                raise
```
Example 2: Anthropic Claude Provider
```python
# app/providers/anthropic.py
import httpx
from app.providers.base import LLMProvider
from app.api.v1.schemas import ChatRequest, ChatResponse, Usage, Message
from app.core.config import settings
import uuid


class AnthropicProvider(LLMProvider):
    @property
    def name(self) -> str:
        return "anthropic"

    async def chat(self, request: ChatRequest) -> ChatResponse:
        """Execute chat completion via Anthropic API."""
        url = "https://api.anthropic.com/v1/messages"

        # Anthropic requires system messages separately
        system_messages = [msg.content for msg in request.messages if msg.role == "system"]
        conversation_messages = [
            {"role": msg.role, "content": msg.content}
            for msg in request.messages
            if msg.role != "system"
        ]

        payload = {
            "model": request.model or "claude-3-5-sonnet-20241022",
            "max_tokens": request.max_tokens or 1024,
            "messages": conversation_messages,
        }

        # Add system message if present
        if system_messages:
            payload["system"] = " ".join(system_messages)

        headers = {
            "x-api-key": settings.ANTHROPIC_API_KEY,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        }

        async with httpx.AsyncClient(timeout=settings.PROVIDER_TIMEOUT_SECONDS) as client:
            try:
                response = await client.post(url, json=payload, headers=headers)
                response.raise_for_status()
                data = response.json()

                return ChatResponse(
                    id=data.get("id", str(uuid.uuid4())),
                    provider=self.name,
                    content=data["content"][0]["text"],
                    usage=Usage(
                        prompt_tokens=data["usage"]["input_tokens"],
                        completion_tokens=data["usage"]["output_tokens"],
                        total_tokens=data["usage"]["input_tokens"] + data["usage"]["output_tokens"],
                    ),
                )
            except Exception as e:
                print(f"[Anthropic Error] {e}")
                raise
```
Example 3: Mock Provider for Testing
```python
import asyncio
from app.providers.base import LLMProvider
from app.api.v1.schemas import ChatRequest, ChatResponse, Usage
import uuid


class MockProvider(LLMProvider):
    """Mock provider for testing without external API calls."""

    @property
    def name(self) -> str:
        return "mock"

    async def chat(self, request: ChatRequest) -> ChatResponse:
        # Simulate API latency
        await asyncio.sleep(0.1)

        # Generate a mock response
        last_message = request.messages[-1].content
        mock_content = f"Mock response to: {last_message[:50]}..."

        return ChatResponse(
            id=str(uuid.uuid4()),
            provider=self.name,
            content=mock_content,
            usage=Usage(
                prompt_tokens=len(last_message.split()),
                completion_tokens=len(mock_content.split()),
                total_tokens=len(last_message.split()) + len(mock_content.split()),
            ),
        )
```
Different providers expect different message formats, so the message-conversion step in chat() is usually provider-specific.
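The common conversion patterns can be sketched side by side (Message here is a stand-in dataclass for the gateway's schema):

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    content: str


messages = [
    Message("system", "You are a helpful assistant."),
    Message("user", "Hello!"),
]

# OpenAI-style: system messages stay inline in the messages list
openai_messages = [{"role": m.role, "content": m.content} for m in messages]

# Anthropic-style: system messages move to a separate top-level field
system_prompt = " ".join(m.content for m in messages if m.role == "system")
anthropic_messages = [
    {"role": m.role, "content": m.content} for m in messages if m.role != "system"
]

# Plain-prompt style: some local model servers expect one concatenated string
prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
```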
Error Handling Best Practices
Error handling typically progresses through three patterns: basic try/except wrapping, detailed per-error-type handling (as in the OpenAI example’s separate httpx.HTTPStatusError branch), and retry logic. The basic pattern:
```python
async def chat(self, request: ChatRequest) -> ChatResponse:
    try:
        # API call
        response = await self.call_api(request)
        return self.parse_response(response)
    except Exception as e:
        print(f"[{self.name.upper()} ERROR] {e}")
        raise
```
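Retry logic wraps the same call with exponential backoff. A sketch (chat_with_retry is illustrative, and the retry counts and delays are examples, not gateway defaults):

```python
import asyncio


async def chat_with_retry(chat_fn, request, max_retries=3, base_delay=0.5):
    """Call chat_fn, retrying with exponential backoff on failure."""
    for attempt in range(max_retries):
        try:
            return await chat_fn(request)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the original error
            # back off 0.5s, 1s, 2s, ... between attempts
            await asyncio.sleep(base_delay * (2 ** attempt))
```

In production you would typically retry only on transient failures (timeouts, HTTP 429/5xx) rather than on every exception.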
Testing Your Provider
Unit Tests
Create tests for your provider:

```python
# tests/test_providers/test_custom.py
import pytest
from app.providers.custom import CustomProvider
from app.api.v1.schemas import ChatRequest, Message


@pytest.mark.asyncio
async def test_custom_provider_chat():
    provider = CustomProvider()
    request = ChatRequest(
        messages=[
            Message(role="user", content="Hello, world!")
        ],
        model="default",
    )

    response = await provider.chat(request)

    assert response.provider == "custom"
    assert response.content is not None
    assert response.usage.total_tokens > 0


@pytest.mark.asyncio
async def test_custom_provider_error_handling():
    provider = CustomProvider()

    # Test with invalid request
    request = ChatRequest(messages=[])

    with pytest.raises(Exception):
        await provider.chat(request)
```
Integration Tests
```python
# tests/integration/test_custom_provider.py
import pytest
from httpx import AsyncClient
from app.main import app


@pytest.mark.asyncio
async def test_custom_provider_via_api():
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.post(
            "/v1/chat",
            headers={"Authorization": "Bearer sk-gateway-123"},
            json={
                "model": "custom",
                "messages": [
                    {"role": "user", "content": "Test message"}
                ],
            },
        )

        assert response.status_code == 200
        data = response.json()
        assert data["provider"] == "custom"
```
Configuration Checklist
Environment Variables
Add all required settings to .env:

```
CUSTOM_API_KEY=your-api-key
CUSTOM_API_BASE_URL=https://api.example.com
```

Settings Class
Update app/core/config.py with typed settings (the names must match the environment variables above):

```python
class Settings(BaseSettings):
    CUSTOM_API_KEY: str = ""
    CUSTOM_API_BASE_URL: str = "https://api.example.com"
```
Provider Registration
Add to Router in app/core/router.py
Routing Logic
Define when your provider should be used
Documentation
Document your provider’s configuration and usage
Always validate that required configuration is present before making API calls. Fail fast with clear error messages.
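For example, a check like the following could run in the provider's __init__ so misconfiguration surfaces at startup rather than on the first request (a sketch; validate_provider_config is a hypothetical helper, and the setting names follow the CustomProvider examples above):

```python
def validate_provider_config(api_key: str, base_url: str) -> None:
    """Fail fast with a clear message instead of a cryptic HTTP error later."""
    if not api_key:
        raise ValueError("CUSTOM_API_KEY is not set; add it to your .env file")
    if not base_url.startswith(("http://", "https://")):
        raise ValueError(f"CUSTOM_API_BASE_URL looks invalid: {base_url!r}")
```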
Advanced Features
Streaming Support
If your provider supports streaming:
```python
async def chat_stream(self, request: ChatRequest):
    """Stream chat responses token by token."""
    # Implementation depends on your provider's streaming API
    async for chunk in self.stream_api_call(request):
        yield chunk
```
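An end-to-end sketch of the same shape, with a stand-in for the provider's streaming endpoint (fake_stream_api_call is invented here purely to make the example runnable):

```python
import asyncio


async def fake_stream_api_call(request):
    """Stand-in for a real provider's streaming endpoint; yields tokens."""
    for token in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # pretend network latency
        yield token


async def chat_stream(request):
    """Re-yield chunks as they arrive; a FastAPI route could forward these as SSE."""
    async for chunk in fake_stream_api_call(request):
        yield chunk


async def main():
    return "".join([chunk async for chunk in chat_stream({})])


print(asyncio.run(main()))  # Hello, world!
```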
Model Introspection
Provide a method to list available models:
```python
async def list_models(self) -> List[str]:
    """Return a list of available models."""
    return ["model-v1", "model-v2", "model-v3"]
```
Custom Parameters
Extend ChatRequest if you need custom parameters:
```python
class ChatRequest(BaseModel):
    # Standard fields...

    # Custom provider-specific parameters
    custom_param: Optional[str] = None
    top_k: Optional[int] = None
```
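A provider's chat() can then forward such extra fields only when the caller actually set them (build_payload is a hypothetical helper, not part of the gateway):

```python
def build_payload(model: str, top_k=None) -> dict:
    """Include provider-specific parameters only when the caller supplied them."""
    payload = {"model": model}
    if top_k is not None:
        payload["top_k"] = top_k
    return payload
```

Skipping unset optional fields keeps the outgoing request valid for providers that reject unknown or null parameters.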
Troubleshooting
Provider not being called
- Check that the provider is registered in Router.providers
- Verify the routing logic includes your provider
- Test with an explicit model name in the request
Import errors
- Ensure all dependencies are in requirements.txt
- Run pip install -r requirements.txt
- Check Python import paths
Authentication failures
- Verify environment variables are loaded
- Check the API key format and validity
- Review the provider’s authentication documentation
Unexpected response format
- Log the raw API response for debugging
- Validate that the response structure matches expectations
- Handle missing or null fields gracefully
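The last three points can be combined into a defensive parsing helper (a sketch; extract_content is illustrative, and the response shape follows the OpenAI example above):

```python
import json


def extract_content(data: dict) -> str:
    """Defensively pull completion text out of a raw provider response."""
    # Log the raw payload so structure mismatches are easy to spot
    print(json.dumps(data, indent=2))
    choices = data.get("choices") or []
    if not choices:
        raise ValueError(f"no choices in response; top-level keys: {list(data)}")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("response choice has no message content")
    return content
```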
Next Steps
- Provider Overview: review the provider architecture
- Router Configuration: configure intelligent routing
- Testing Guide: write tests for your provider
- Deployment: deploy your custom provider