Overview
OpenRouter provides access to many LLM providers through a single API. LiteLLM seamlessly integrates with OpenRouter, supporting advanced features like provider routing, cost tracking, and prompt caching.
Quick Start
Set API Key
```shell
export OPENROUTER_API_KEY="sk-or-..."
```
Make Your First Call
```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
Popular Models
Anthropic Claude

```python
from litellm import completion

# Claude 3.5 Sonnet
response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Explain AI"}]
)

# Claude 3 Opus
response = completion(
    model="openrouter/anthropic/claude-3-opus",
    messages=[{"role": "user", "content": "Complex task"}]
)
```

OpenAI

```python
from litellm import completion

# GPT-4o
response = completion(
    model="openrouter/openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# O1
response = completion(
    model="openrouter/openai/o1",
    messages=[{"role": "user", "content": "Solve this..."}]
)
```

Google Gemini

```python
from litellm import completion

# Gemini Pro
response = completion(
    model="openrouter/google/gemini-pro",
    messages=[{"role": "user", "content": "Analyze..."}]
)

# Gemini Flash
response = completion(
    model="openrouter/google/gemini-flash-1.5",
    messages=[{"role": "user", "content": "Quick task"}]
)
```

Meta Llama

```python
from litellm import completion

# Llama 3.3 70B
response = completion(
    model="openrouter/meta-llama/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
Authentication
Environment Variable

```shell
export OPENROUTER_API_KEY="sk-or-..."
```

```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Direct Parameter

```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    api_key="sk-or-..."
)
```
Streaming
```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Reasoning Models
OpenRouter supports reasoning models with thinking/reasoning content.
```python
from litellm import completion

response = completion(
    model="openrouter/openai/o1",
    messages=[{"role": "user", "content": "Solve this complex problem..."}],
    reasoning_effort="high"  # for supported models
)

# Guard the attribute access: not every model returns reasoning content
reasoning = getattr(response.choices[0].message, "reasoning_content", None)
if reasoning:
    print("Reasoning:", reasoning)
print("Answer:", response.choices[0].message.content)
```
Provider Routing
Control which providers OpenRouter uses.
```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    # Restrict the request to specific models/providers
    models=["anthropic/claude-3.5-sonnet"],
    # Fall back to the next entry in `models` if the primary fails
    route="fallback"
)
```
Cost Tracking
LiteLLM automatically extracts cost information from OpenRouter.
```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Cost is automatically tracked and surfaced via hidden params
if hasattr(response, "_hidden_params"):
    cost = response._hidden_params.get("additional_headers", {}).get(
        "llm_provider-x-litellm-response-cost"
    )
    if cost:
        print(f"Request cost: ${cost}")
```
Prompt Caching
OpenRouter supports prompt caching for Claude and Gemini models.
Claude Models

```python
from litellm import completion

# Cache the long system prompt
response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "system",
            "content": "Long system prompt...",
            "cache_control": {"type": "ephemeral"}
        },
        {"role": "user", "content": "Question?"}
    ]
)
```

Cache control is automatically moved to content blocks for OpenRouter compatibility.

Gemini Models

```python
from litellm import completion

response = completion(
    model="openrouter/google/gemini-pro",
    messages=[
        {
            "role": "user",
            "content": "Long context...",
            "cache_control": {"type": "ephemeral"}
        },
        {"role": "user", "content": "Follow-up question"}
    ]
)
```
Embeddings
```python
from litellm import embedding

response = embedding(
    model="openrouter/openai/text-embedding-3-small",
    input=["Text to embed", "Another text"]
)
embeddings = [data.embedding for data in response.data]
```
Image Generation
```python
from litellm import image_generation

response = image_generation(
    model="openrouter/openai/dall-e-3",
    prompt="A beautiful sunset over mountains",
    n=1,
    size="1024x1024"
)
image_url = response.data[0].url
```
Configuration
```python
from litellm import completion

response = completion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.5,
    # OpenRouter-specific
    transforms=["middle-out"],  # prompt compression
    models=["anthropic/claude-3.5-sonnet"],  # provider preference
    route="fallback"  # routing strategy
)
```
Supported Parameters
| Parameter | Type | Description |
|---|---|---|
| `temperature` | float | Randomness (0-2) |
| `max_tokens` | int | Max output tokens |
| `max_completion_tokens` | int | Alternative to `max_tokens` |
| `top_p` | float | Nucleus sampling |
| `frequency_penalty` | float | Reduce repetition |
| `presence_penalty` | float | Encourage diversity |
| `stop` | list | Stop sequences |
| `n` | int | Number of completions |
| `reasoning_effort` | str | Reasoning level |
| `transforms` | list | Text transformations |
| `models` | list | Provider preferences |
| `route` | str | Routing strategy |
Error Handling
```python
from litellm import completion
from litellm.exceptions import APIError, RateLimitError

try:
    response = completion(
        model="openrouter/anthropic/claude-3.5-sonnet",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except RateLimitError as e:
    print(f"Rate limit: {e}")
except APIError as e:
    print(f"Error: {e.status_code} - {e.message}")
    # Check OpenRouter dashboard for credits
```
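A common pattern on top of this is retrying rate-limited calls with exponential backoff. The helper below is a generic sketch: in real use you would pass `RateLimitError` (imported above) as `retry_on` and wrap the `completion` call; here a plain callable and built-in exception keep the example self-contained.

```python
import time

def retry_with_backoff(fn, retries=3, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(), retrying up to `retries` times with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage sketch (hypothetical wiring, assuming the imports above):
# result = retry_with_backoff(
#     lambda: completion(
#         model="openrouter/anthropic/claude-3.5-sonnet",
#         messages=[{"role": "user", "content": "Hello!"}],
#     ),
#     retry_on=(RateLimitError,),
# )
```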
LiteLLM Proxy
```yaml
model_list:
  - model_name: claude-3.5-sonnet
    litellm_params:
      model: openrouter/anthropic/claude-3.5-sonnet
      api_key: os.environ/OPENROUTER_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openrouter/openai/gpt-4o
      api_key: os.environ/OPENROUTER_API_KEY
```
```python
import openai

client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
Best Practices
- Monitor costs via the OpenRouter dashboard
- Use cheaper models for simple tasks
- Enable prompt caching for repeated contexts
- LiteLLM automatically includes usage tracking
- Use the `models` parameter to control providers
- Set `route="fallback"` for reliability
- Different providers may have different capabilities
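"Use cheaper models for simple tasks" can be as simple as a routing function in front of your `completion` calls. The sketch below picks a model by prompt length; the threshold and model choices are arbitrary illustrations, not recommendations.

```python
# Illustrative cost-aware routing: short prompts go to a cheap model,
# longer ones to a stronger (more expensive) one.
CHEAP_MODEL = "openrouter/google/gemini-flash-1.5"
STRONG_MODEL = "openrouter/anthropic/claude-3.5-sonnet"

def pick_model(prompt: str, max_cheap_chars: int = 500) -> str:
    """Route by prompt length as a crude proxy for task complexity."""
    return CHEAP_MODEL if len(prompt) <= max_cheap_chars else STRONG_MODEL

print(pick_model("Quick task"))  # short prompt -> cheap model
```

In practice you might route on more meaningful signals (task type, required context window, tool use) rather than raw length.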
Supported Models
OpenRouter provides access to 100+ models. Visit openrouter.ai/models for the complete list.
Popular categories:
- Anthropic Claude (all versions)
- OpenAI GPT (all versions)
- Google Gemini
- Meta Llama
- Mistral AI
- Cohere
- And many more
Model availability and pricing vary. Check OpenRouter’s website for current offerings.