What is the LiteLLM Proxy?

The LiteLLM Proxy is a centralized AI Gateway that provides:
  • Authentication & Authorization - Virtual keys for secure access control
  • Cost Tracking - Per-user, per-project spend monitoring
  • Rate Limiting - Control usage with TPM/RPM limits
  • Load Balancing - Distribute requests across multiple deployments
  • Caching - Reduce costs with intelligent response caching
  • Admin Dashboard - Web UI for management and monitoring

Quick Installation

Step 1: Install LiteLLM with Proxy

Install LiteLLM with proxy dependencies:
pip install 'litellm[proxy]'
Step 2: Start the Proxy

Start the proxy with a single model (OpenAI GPT-4o):
# Set your OpenAI API key
export OPENAI_API_KEY="your-openai-key"

# Start the proxy
litellm --model gpt-4o
The proxy will start on http://0.0.0.0:4000
Step 3: Test Your Gateway

Make your first request using the OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="anything",  # Can be anything when no auth is configured
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}]
)

print(response.choices[0].message.content)
Production Deployment: For production, use the configuration file approach below with authentication enabled.

Configuration File Setup

For production deployments, use a configuration file to define your models and settings.
Step 1: Create Config File

Create a config.yaml file:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  
  - model_name: claude-3
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
  
  - model_name: azure-gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
      api_version: "2025-02-01-preview"

# General settings
general_settings:
  master_key: "sk-1234"  # Change this!
  database_url: "postgresql://user:password@host:5432/dbname"  # Optional
Step 2: Set Environment Variables

Set your API keys:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-endpoint.openai.azure.com/"
Step 3: Start with Config

Start the proxy with your config file:
litellm --config config.yaml --port 4000

Create Virtual Keys

Virtual keys provide secure access control with per-key budgets and rate limits.
curl -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-4", "claude-3"],
    "max_budget": 10.0,
    "budget_duration": "30d",
    "tpm_limit": 100000,
    "rpm_limit": 100,
    "metadata": {"user": "[email protected]"}
  }'
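The same call can be scripted from Python. This sketch uses only the standard library and builds the request against the endpoint shown above without sending it (uncomment the `urlopen` line to actually issue it):

```python
import json
import urllib.request

def build_key_generate_request(base_url: str, master_key: str, **key_params) -> urllib.request.Request:
    """Build (but do not send) a POST /key/generate request for the proxy."""
    return urllib.request.Request(
        url=f"{base_url}/key/generate",
        data=json.dumps(key_params).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_key_generate_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    models=["gpt-4", "claude-3"],
    max_budget=10.0,
    budget_duration="30d",
)
# urllib.request.urlopen(req) would return the generated virtual key as JSON.
```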

Use Virtual Keys

Use the generated virtual key to make requests:
import openai

client = openai.OpenAI(
    api_key="sk-your-generated-virtual-key",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

Load Balancing

Distribute requests across multiple deployments of the same model:
config.yaml
model_list:
  # Multiple deployments of the same model
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_1
  
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_2
  
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

router_settings:
  routing_strategy: "least-busy"  # or "simple-shuffle", "latency-based-routing"
  num_retries: 2
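The routing strategies are implemented inside LiteLLM's router; as a rough mental model (a toy sketch, not the actual implementation), "least-busy" picks the deployment with the fewest in-flight requests, while "simple-shuffle" picks one at random:

```python
import random

def pick_least_busy(in_flight: dict) -> str:
    """Toy 'least-busy' selection: choose the deployment with the
    fewest in-flight requests (ties broken arbitrarily)."""
    return min(in_flight, key=in_flight.get)

def pick_shuffle(deployments: list) -> str:
    """Toy 'simple-shuffle' selection: uniform random choice."""
    return random.choice(deployments)

# Three deployments all answering to the alias "gpt-4":
load = {"openai-key-1": 3, "openai-key-2": 0, "azure": 5}
print(pick_least_busy(load))  # openai-key-2
```

Either way, clients always request the shared alias ("gpt-4"); the router decides which underlying deployment serves it.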

Fallbacks

Automatically fall back to alternative models on failure:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  
  - model_name: claude-fallback
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY

router_settings:
  fallbacks:
    - gpt-4: ["claude-fallback"]
  num_retries: 2
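Conceptually (again a sketch, not LiteLLM's internal code), the behavior configured above amounts to: try the primary model, retry it `num_retries` times, then walk the fallback list:

```python
def complete_with_fallbacks(call, primary, fallbacks, num_retries=2):
    """Try `primary` up to num_retries + 1 times, then each fallback once.
    `call(model)` is any function that performs the request."""
    for model in [primary] * (num_retries + 1) + list(fallbacks):
        try:
            return call(model)
        except Exception as err:
            last_err = err
    raise last_err

# Simulate a primary that always fails and a fallback that succeeds:
def fake_call(model):
    if model == "gpt-4":
        raise RuntimeError("provider outage")
    return f"answer from {model}"

print(complete_with_fallbacks(fake_call, "gpt-4", ["claude-fallback"]))
# answer from claude-fallback
```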

Caching

Enable caching to reduce costs and improve response times:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

general_settings:
  cache: true
  cache_params:
    type: "redis"
    host: "localhost"
    port: 6379
    ttl: 600  # Cache for 10 minutes
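A response cache keys on the request content: roughly speaking (this is a simplified sketch, not LiteLLM's exact key scheme), identical model + messages hash to the same Redis key, so a repeated request within the TTL is served from cache instead of hitting the provider:

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Derive a deterministic cache key from the request content."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "litellm:" + hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("gpt-4", [{"role": "user", "content": "Hello!"}])
k2 = cache_key("gpt-4", [{"role": "user", "content": "Hello!"}])
k3 = cache_key("gpt-4", [{"role": "user", "content": "Hi!"}])
assert k1 == k2       # identical request -> cache hit
assert k1 != k3       # different content -> cache miss
```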

Admin Dashboard

Access the web-based admin dashboard to:
  • Create and manage virtual keys
  • Monitor usage and costs
  • View request logs
  • Configure models and settings
Step 1: Access Dashboard

Open your browser and navigate to:
http://0.0.0.0:4000/ui
Step 2: Login

Login with your master key:
  • Master Key: sk-1234 (or the value from your config)
Step 3: Explore Features

  • Keys: Create and manage virtual keys
  • Models: View and configure available models
  • Usage: Monitor costs and request metrics
  • Logs: View detailed request logs

Docker Deployment

Deploy using Docker for production:
# docker-compose.yml
version: '3.8'

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    ports:
      - "4000:4000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - DATABASE_URL=${DATABASE_URL}
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml", "--port", "4000"]
  
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=litellm
      - POSTGRES_USER=litellm
      - POSTGRES_PASSWORD=your-secure-password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Observability

Integrate with observability platforms:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["langfuse", "prometheus", "datadog"]
  
  # Langfuse configuration
  langfuse_public_key: os.environ/LANGFUSE_PUBLIC_KEY
  langfuse_secret_key: os.environ/LANGFUSE_SECRET_KEY
  langfuse_host: "https://cloud.langfuse.com"
  
  # Prometheus configuration
  prometheus: true
  
  # Datadog configuration
  datadog_api_key: os.environ/DATADOG_API_KEY
  datadog_site: "datadoghq.com"

API Endpoints

The proxy exposes OpenAI-compatible endpoints:

  • Chat Completions: POST /chat/completions
  • Completions: POST /completions
  • Embeddings: POST /embeddings
  • Images: POST /images/generations
  • Audio: POST /audio/transcriptions
  • Models: GET /models

Environment Variables

Common environment variables:
# API Keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export AZURE_API_KEY="your-key"

# Database (Optional)
export DATABASE_URL="postgresql://user:password@host:5432/dbname"

# Redis (Optional)
export REDIS_HOST="localhost"
export REDIS_PORT="6379"

# Proxy Settings
export LITELLM_MASTER_KEY="sk-1234"
export LITELLM_PORT="4000"

What’s Next?

  • Authentication: Set up SSO, LDAP, or custom authentication
  • Guardrails: Add content moderation and safety guardrails
  • Teams & Projects: Organize users into teams with separate budgets
  • Enterprise Features: Explore enterprise features like SSO and SLAs

Security Best Practices
  • Always change the default master_key
  • Use environment variables for sensitive data
  • Enable HTTPS in production
  • Use a PostgreSQL database for persistence
  • Regularly rotate API keys
Need Help? Join our Discord community or check out the full documentation.