Why Migrate to LLM Gateway?
Migrating from direct provider APIs to LLM Gateway provides:
Unified Interface: One API for 15+ providers instead of managing multiple SDKs.
Automatic Failover: Built-in redundancy across providers ensures high availability.
Cost Optimization: Intelligent routing selects the best provider based on cost and performance.
Usage Analytics: Centralized logging and analytics across all providers.
Response Caching: Built-in caching reduces costs and improves response times.
Easy Switching: Switch between providers instantly without code changes.
Migration Overview
Migration typically involves:
Sign up for LLM Gateway
Get an API key from your project
Update endpoints in your code
Optional: Add your provider API keys for direct billing
Test thoroughly before going to production
Migration is usually a 5-10 minute process with minimal code changes.
OpenAI SDK Migration
Before (Direct OpenAI)
from openai import OpenAI

client = OpenAI(
    api_key="sk-...your-openai-key..."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
After (LLM Gateway)
from openai import OpenAI

client = OpenAI(
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",  # Or any other model
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
Changes:
Replace api_key with your LLM Gateway API key
Add base_url="https://api.llmgateway.io/v1"
Optionally change model to use other providers
Using Your OpenAI API Key
To use your own OpenAI API key (no gateway markup):
Add Provider Key
Go to Organization Settings → Provider Keys and add your OpenAI API key.
Set Project Mode
Configure your project to use hybrid or api-keys mode.
Make Requests
Requests will automatically use your key when available.
Anthropic SDK Migration
Before (Direct Anthropic)
import anthropic

client = anthropic.Anthropic(
    api_key="sk-ant-...your-anthropic-key..."
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(message.content[0].text)
After (LLM Gateway - OpenAI SDK)
LLM Gateway uses the OpenAI-compatible format:
from openai import OpenAI

client = OpenAI(
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(response.choices[0].message.content)
Changes:
Switch from Anthropic SDK to OpenAI SDK
Use LLM Gateway credentials and base URL
Response format changes from Anthropic to OpenAI structure
Using Your Anthropic API Key
Add Provider Key
Add your Anthropic API key in Organization Settings → Provider Keys.
Configure Mode
Set project mode to hybrid or api-keys.
Make Requests
Claude requests will use your key automatically.
Google Gemini Migration
Before (Direct Google)
import google.generativeai as genai

genai.configure(api_key="AIza...your-google-key...")

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Write a story about a magic backpack.")

print(response.text)
After (LLM Gateway)
from openai import OpenAI

client = OpenAI(
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

response = client.chat.completions.create(
    model="gemini-2.0-flash-exp",
    messages=[
        {"role": "user", "content": "Write a story about a magic backpack."}
    ]
)

print(response.choices[0].message.content)
Changes:
Switch from Google SDK to OpenAI SDK
Use chat format instead of generate_content
Access response via .choices[0].message.content
JavaScript/TypeScript Migration
OpenAI (Node.js)
Before:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
After:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.LLM_GATEWAY_API_KEY,
  baseURL: "https://api.llmgateway.io/v1",
});

const completion = await openai.chat.completions.create({
  model: "gpt-4o", // Or any other model
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
Anthropic (Node.js)
Before:
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(message.content[0].text);
After:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.LLM_GATEWAY_API_KEY,
  baseURL: "https://api.llmgateway.io/v1",
});

const completion = await openai.chat.completions.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
Streaming Responses
Streaming works the same way with LLM Gateway:
Python
from openai import OpenAI

client = OpenAI(
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
JavaScript
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
Vision and Multimodal
Send images alongside text:
from openai import OpenAI
import base64

client = OpenAI(
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

# Read and encode image
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # Vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data}"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
Function Calling
Tools work identically to OpenAI:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)

# Handle tool calls
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        if tool_call.function.name == "get_weather":
            # Execute your function
            result = get_weather(tool_call.function.arguments)
            # Continue conversation with result
Environment Variables
Update your environment variables:
Before:
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
After:
LLM_GATEWAY_API_KEY=llmg_...
# Optional: Keep provider keys for direct billing
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
Use a .env file and load it with python-dotenv (Python) or dotenv (Node.js).
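If you want to see what loading a .env file amounts to, here is a minimal stdlib-only stand-in for python-dotenv's load_dotenv (in real projects, use the library itself; the file contents below are placeholders):

```python
import os

def load_env_file(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv():
    read KEY=VALUE lines into os.environ (existing vars win)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo: write a sample .env and load it
with open(".env", "w") as f:
    f.write("# gateway credentials\nLLM_GATEWAY_API_KEY=llmg_example\n")

load_env_file()
print(os.environ["LLM_GATEWAY_API_KEY"])
```

python-dotenv handles quoting, interpolation, and export prefixes on top of this; the sketch covers only the common KEY=VALUE case.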
Framework Integration
LangChain (Python)
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    api_key="llmg_...your-gateway-key...",
    base_url="https://api.llmgateway.io/v1"
)

response = llm.invoke("What is LangChain?")
print(response.content)
LangChain (TypeScript)
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  openAIApiKey: process.env.LLM_GATEWAY_API_KEY,
  configuration: {
    baseURL: "https://api.llmgateway.io/v1",
  },
});

const response = await llm.invoke("What is LangChain?");
console.log(response.content);
LlamaIndex
from llama_index.llms.openai import OpenAI as LlamaOpenAI

llm = LlamaOpenAI(
    model="gpt-4o",
    api_key="llmg_...your-gateway-key...",
    api_base="https://api.llmgateway.io/v1"
)

response = llm.complete("What is LlamaIndex?")
print(response.text)
Vercel AI SDK
import { createOpenAI } from "@ai-sdk/openai";
import { streamText } from "ai";

// The model factory doesn't accept credentials directly;
// configure a provider instance with createOpenAI instead.
const openai = createOpenAI({
  apiKey: process.env.LLM_GATEWAY_API_KEY,
  baseURL: "https://api.llmgateway.io/v1",
});

const result = streamText({
  model: openai("gpt-4o"),
  prompt: "Write a poem about programming",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
Testing Your Migration
Set Up Test Environment
Create a test project in LLM Gateway and generate an API key.
Update Configuration
Change base_url and api_key in your test environment.
Run Test Suite
Execute your existing tests to verify compatibility.
Compare Responses
Check that responses match expected format and quality.
Monitor Performance
Measure latency and throughput to ensure acceptable performance.
Test Error Handling
Verify your error handling works with gateway error formats.
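One cheap check worth adding to your test suite is a shape assertion: regardless of the upstream provider, the gateway should return the OpenAI chat-completion structure. A minimal sketch (the sample response below is an illustrative placeholder, not a recorded gateway response):

```python
def check_openai_shape(resp):
    """Return True if resp has the OpenAI chat-completion structure:
    choices[0].message.content present and a non-empty string."""
    try:
        content = resp["choices"][0]["message"]["content"]
        return isinstance(content, str) and len(content) > 0
    except (KeyError, IndexError, TypeError):
        return False

# Illustrative response dicts (shape only, values are placeholders)
sample = {
    "model": "gpt-4o",
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
}
print(check_openai_shape(sample))           # well-formed
print(check_openai_shape({"choices": []}))  # malformed
```

Run the same check against responses from each model you migrate, so a provider-specific format difference fails loudly in CI instead of in production.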
Common Migration Issues
Response Format Differences
Issue: Model names vary between providers.
Solution: Use LLM Gateway's unified model IDs:
"gpt-4o" for OpenAI GPT-4o
"claude-3-5-sonnet-20241022" for Claude 3.5
"gemini-2.0-flash-exp" for Gemini 2.0
See list-models API for all IDs.
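If your code still passes provider-prefixed names around, a small translation table keeps the change in one place. Only the three unified IDs above come from this guide; the provider-prefixed keys are hypothetical examples of whatever naming your codebase currently uses:

```python
# Map your app's existing model names to LLM Gateway's unified IDs.
# The left-hand keys are hypothetical; verify the right-hand IDs
# against the list-models API.
UNIFIED_MODEL_IDS = {
    "openai/gpt-4o": "gpt-4o",
    "anthropic/claude-3.5-sonnet": "claude-3-5-sonnet-20241022",
    "google/gemini-2.0-flash": "gemini-2.0-flash-exp",
}

def to_gateway_model(name):
    """Resolve an internal model name to the gateway's unified ID,
    passing through names that are already unified IDs."""
    return UNIFIED_MODEL_IDS.get(name, name)

print(to_gateway_model("anthropic/claude-3.5-sonnet"))  # → claude-3-5-sonnet-20241022
```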
Issue: 401 Unauthorized errors.
Solution:
Verify API key starts with llmg_
Check key is active in dashboard
Ensure key has not exceeded usage limit
Confirm base URL is correct
Issue: Requests timing out.
Solution:
Increase client timeout (default: 600s)
Use streaming for long responses
Check provider status page
Try a different model/provider
Issue: 429 Too Many Requests errors.
Solution:
Implement exponential backoff
Respect Retry-After header
Contact support to increase limits
Distribute load across multiple keys
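The backoff-plus-Retry-After advice can be sketched in a few lines. This is SDK-agnostic: RateLimitError below is a stand-in for your client's 429 exception (e.g. openai.RateLimitError), and the flaky demo function simulates a server that rate-limits twice before succeeding:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for your SDK's 429 error type."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds, from the Retry-After header

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate limits, honoring the server's Retry-After
    hint when present, otherwise exponential backoff with jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, 0.05))

# Demo: rate-limited twice, then succeeds on the third attempt
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"

print(with_backoff(flaky))  # → ok
```

Wrap your completion calls the same way, catching the real SDK exception instead of the stand-in class.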
Rollback Plan
If you need to revert to direct provider APIs:
Keep Old Configuration
Maintain separate config for direct provider access during migration.
Use Feature Flags
Toggle between gateway and direct APIs with a feature flag.
Monitor Metrics
Track error rates, latency, and costs in both configurations.
Gradual Rollout
Route a percentage of traffic through gateway, increase gradually.
Example with feature flag:
import os
from openai import OpenAI

USE_GATEWAY = os.getenv("USE_LLM_GATEWAY", "false") == "true"

if USE_GATEWAY:
    client = OpenAI(
        api_key=os.getenv("LLM_GATEWAY_API_KEY"),
        base_url="https://api.llmgateway.io/v1"
    )
else:
    client = OpenAI(
        api_key=os.getenv("OPENAI_API_KEY")
    )

# Rest of your code remains the same
Cost Comparison
Pricing Models
Gateway Credits
Pay-as-you-go with gateway markup:
Simple per-token pricing
No provider account needed
Instant access to all models
~10-20% markup over provider pricing
Your Provider Keys
Direct provider billing with no markup:
Use your existing provider accounts
Provider's pricing (no gateway fee)
Requires Pro plan ($29/mo)
Configure keys in organization settings
Hybrid Mode
Best of both worlds:
Use your keys when configured
Fall back to credits when needed
Uninterrupted service
Recommended for production
Sample Cost Analysis
Scenario : 10M input tokens + 1M output tokens with GPT-4o
| Method | Cost | Notes |
| --- | --- | --- |
| Direct OpenAI | $27.50 | $2.50/M input + $10/M output |
| Gateway Credits | $30.25 | ~10% markup |
| Your Keys (Pro) | $56.50 | $27.50 + $29/mo Pro plan |
Breakeven point: ~100M tokens/month for the Pro plan
For high-volume usage (>100M tokens/month), using your provider keys with Pro plan is most cost-effective.
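The breakeven figure can be sanity-checked from the numbers above: credits cost an extra ~10% of your direct provider spend, while provider keys cost a flat $29/mo, so they break even where the markup equals the subscription. A quick calculation (using the table's blended GPT-4o figures; the result lands in the same ballpark as the ~100M token estimate):

```python
# Breakeven between Gateway Credits (~10% markup, no subscription)
# and Your Provider Keys (no markup, $29/mo Pro plan).
MARKUP = 0.10
PRO_PLAN_MONTHLY = 29.0

# Break even where markup * direct_spend == subscription fee
breakeven_spend = PRO_PLAN_MONTHLY / MARKUP
print(f"Breakeven at ${breakeven_spend:.0f}/month of direct provider spend")

# Convert to tokens using the table's blended rate:
# $27.50 for 11M tokens (10M input + 1M output)
tokens_per_dollar = 11_000_000 / 27.50
breakeven_tokens_m = breakeven_spend * tokens_per_dollar / 1e6
print(f"~{breakeven_tokens_m:.0f}M tokens/month")
```

Your own breakeven shifts with the input/output mix and model pricing, so rerun this with your actual traffic profile.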
Post-Migration Checklist
Monitor Performance
Response times are acceptable
Error rates are low (less than 1%)
No new timeout issues
Cache hit rate is improving
Getting Help
If you encounter issues during migration, review the common issues above or reach out to support.
Next Steps
Projects: Learn about project management and API keys.
Provider Keys: Configure your provider API keys for direct billing.
Quickstart: Complete quickstart guide for new users.
API Reference: Explore the complete API documentation.