What is the LiteLLM Proxy?
The LiteLLM Proxy is a centralized AI Gateway that provides:
Authentication & Authorization - Virtual keys for secure access control
Cost Tracking - Per-user, per-project spend monitoring
Rate Limiting - Control usage with TPM/RPM limits
Load Balancing - Distribute requests across multiple deployments
Caching - Reduce costs with intelligent response caching
Admin Dashboard - Web UI for management and monitoring
Quick Installation
Install LiteLLM with Proxy
Install LiteLLM with proxy dependencies:

```bash
pip install 'litellm[proxy]'
```
Start the Proxy
Start the proxy with a single model (OpenAI gpt-4o):

```bash
# Set your OpenAI API key
export OPENAI_API_KEY="your-openai-key"

# Start the proxy
litellm --model gpt-4o
```

The proxy will start on http://0.0.0.0:4000.
Test Your Gateway
Make your first request using the OpenAI SDK:

```python
import openai

client = openai.OpenAI(
    api_key="anything",  # Can be anything when no auth is configured
    base_url="http://0.0.0.0:4000",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)
print(response.choices[0].message.content)
```
Production Deployment: For production, use the configuration file approach below with authentication enabled.
Configuration File Setup
For production deployments, use a configuration file to define your models and settings.
Create Config File
Create a config.yaml file:

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

  - model_name: claude-3
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY

  - model_name: azure-gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
      api_version: "2025-02-01-preview"

# General settings
general_settings:
  master_key: "sk-1234"  # Change this!
  database_url: "postgresql://user:password@host:5432/dbname"  # Optional
```
Set Environment Variables
Set your API keys:

```bash
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-endpoint.openai.azure.com/"
```
Start with Config
Start the proxy with your config file:

```bash
litellm --config config.yaml --port 4000
```
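Once the proxy is running, a quick way to confirm it picked up all three models is to list them via the OpenAI-compatible `/v1/models` endpoint. A minimal sketch using only the standard library; the URL and master key match the examples above, and the helper name is illustrative:

```python
import json
import urllib.request

# Proxy address and master key from the config.yaml example above.
PROXY_URL = "http://0.0.0.0:4000"

def build_models_request(master_key):
    """Build an authenticated GET request for the /v1/models endpoint."""
    return urllib.request.Request(
        f"{PROXY_URL}/v1/models",
        headers={"Authorization": f"Bearer {master_key}"},
    )

# Usage (requires a running proxy):
# with urllib.request.urlopen(build_models_request("sk-1234")) as resp:
#     for m in json.load(resp)["data"]:
#         print(m["id"])  # expect gpt-4, claude-3, azure-gpt-4
```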
Create Virtual Keys
Virtual keys provide secure access control with per-key budgets and rate limits.
```bash
curl -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-4", "claude-3"],
    "max_budget": 10.0,
    "budget_duration": "30d",
    "tpm_limit": 100000,
    "rpm_limit": 100,
    "metadata": {"user": "[email protected]"}
  }'
```
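The same call can be scripted from Python. A minimal sketch using only the standard library, with the proxy URL and master key from the examples above (the helper names are illustrative, not a LiteLLM client API):

```python
import json
import urllib.request

PROXY_URL = "http://0.0.0.0:4000"
MASTER_KEY = "sk-1234"  # The master key from config.yaml

def build_key_request(models, max_budget, budget_duration, tpm_limit, rpm_limit):
    """Assemble the JSON body for POST /key/generate."""
    return {
        "models": models,
        "max_budget": max_budget,
        "budget_duration": budget_duration,
        "tpm_limit": tpm_limit,
        "rpm_limit": rpm_limit,
    }

def generate_key(body):
    """POST the body to /key/generate and return the parsed response."""
    req = urllib.request.Request(
        f"{PROXY_URL}/key/generate",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {MASTER_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running proxy):
# key_info = generate_key(
#     build_key_request(["gpt-4", "claude-3"], 10.0, "30d", 100000, 100)
# )
# print(key_info["key"])
```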
Use Virtual Keys
Use the generated virtual key to make requests:
```python
import openai

client = openai.OpenAI(
    api_key="sk-your-generated-virtual-key",
    base_url="http://0.0.0.0:4000",
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Load Balancing
Distribute requests across multiple deployments of the same model:
```yaml
model_list:
  # Multiple deployments of the same model
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_1
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_2
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

router_settings:
  routing_strategy: "least-busy"  # or "simple-shuffle", "latency-based-routing"
  num_retries: 2
```
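To watch the balancing in action, you can inspect which deployment served each response. Recent LiteLLM versions report the chosen deployment in a response header (the `x-litellm-model-id` name below is an assumption; check your version's docs), and the OpenAI SDK exposes headers through its raw-response interface. A small tallying sketch:

```python
from collections import Counter

def tally_deployments(model_ids):
    """Count how many requests each deployment id handled."""
    return Counter(model_ids)

# Usage with the OpenAI SDK's raw-response interface (requires a running proxy):
# ids = []
# for _ in range(10):
#     raw = client.chat.completions.with_raw_response.create(
#         model="gpt-4", messages=[{"role": "user", "content": "ping"}]
#     )
#     ids.append(raw.headers.get("x-litellm-model-id"))  # assumed header name
# print(tally_deployments(ids))
```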
Fallbacks
Automatically fallback to alternative models on failure:
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-fallback
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY

router_settings:
  fallbacks:
    - gpt-4: ["claude-fallback"]
  num_retries: 2
```
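Conceptually, the router's fallback logic is an ordered retry loop: try the primary model, then each fallback in turn. The sketch below only illustrates that behavior; it is not LiteLLM's actual implementation, and the names are illustrative:

```python
def call_with_fallbacks(call, model, fallbacks):
    """Try call(model); on failure, try each fallback model in order."""
    last_err = None
    for m in [model, *fallbacks]:
        try:
            return call(m)
        except Exception as err:  # a real router narrows this to retryable errors
            last_err = err
    raise last_err

# Usage sketch (requires a running proxy):
# call_with_fallbacks(
#     lambda m: client.chat.completions.create(
#         model=m, messages=[{"role": "user", "content": "Hello!"}]
#     ),
#     "gpt-4",
#     ["claude-fallback"],
# )
```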
Caching
Enable caching to reduce costs and improve response times:
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  cache: true
  cache_params:
    type: "redis"
    host: "localhost"
    port: 6379
    ttl: 600  # Cache for 10 minutes
```
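A cache hit is easy to verify by timing the same request twice; within the TTL, the second call should return much faster. A small timing helper (the `client` in the usage sketch is the OpenAI client configured earlier):

```python
import time

def time_call(fn):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Usage (requires a running proxy with caching enabled):
# ask = lambda: client.chat.completions.create(
#     model="gpt-4", messages=[{"role": "user", "content": "What is LiteLLM?"}]
# )
# _, cold = time_call(ask)
# _, warm = time_call(ask)  # served from Redis within the 600s TTL
# print(f"cold={cold:.2f}s warm={warm:.2f}s")
```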
Admin Dashboard
Access the web-based admin dashboard to:
Create and manage virtual keys
Monitor usage and costs
View request logs
Configure models and settings
Access Dashboard
Open your browser and navigate to http://0.0.0.0:4000/ui (the /ui path on your proxy's address).
Login
Login with your master key:
Master Key: sk-1234 (or the value from your config)
Explore Features
Keys : Create and manage virtual keys
Models : View and configure available models
Usage : Monitor costs and request metrics
Logs : View detailed request logs
Docker Deployment
Deploy using Docker for production:
```yaml
# docker-compose.yml
version: '3.8'

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    ports:
      - "4000:4000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - DATABASE_URL=${DATABASE_URL}
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml", "--port", "4000"]

  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=litellm
      - POSTGRES_USER=litellm
      - POSTGRES_PASSWORD=your-secure-password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```
Observability
Integrate with observability platforms:
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["langfuse", "prometheus", "datadog"]

  # Langfuse configuration
  langfuse_public_key: os.environ/LANGFUSE_PUBLIC_KEY
  langfuse_secret_key: os.environ/LANGFUSE_SECRET_KEY
  langfuse_host: "https://cloud.langfuse.com"

  # Prometheus configuration
  prometheus: true

  # Datadog configuration
  datadog_api_key: os.environ/DATADOG_API_KEY
  datadog_site: "datadoghq.com"
```
API Endpoints
The proxy exposes OpenAI-compatible endpoints:
Chat Completions - POST /chat/completions
Completions - POST /completions
Embeddings - POST /embeddings
Images - POST /images/generations
Audio - POST /audio/transcriptions
These routes are also available under a /v1 prefix (e.g. POST /v1/chat/completions).
Environment Variables
Common environment variables:
```bash
# API Keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export AZURE_API_KEY="your-key"

# Database (Optional)
export DATABASE_URL="postgresql://user:password@host:5432/dbname"

# Redis (Optional)
export REDIS_HOST="localhost"
export REDIS_PORT="6379"

# Proxy Settings
export LITELLM_MASTER_KEY="sk-1234"
export LITELLM_PORT="4000"
```
What’s Next?
Authentication Set up SSO, LDAP, or custom authentication
Guardrails Add content moderation and safety guardrails
Teams & Projects Organize users into teams with separate budgets
Enterprise Features Explore enterprise features like SSO and SLAs
Security Best Practices
Always change the default master_key
Use environment variables for sensitive data
Enable HTTPS in production
Use a PostgreSQL database for persistence
Regularly rotate API keys