What is the LiteLLM Proxy?

The LiteLLM Proxy is a centralized AI Gateway that provides:
  • Authentication & Authorization - Virtual keys for secure access control
  • Cost Tracking - Per-user, per-project spend monitoring
  • Rate Limiting - Control usage with TPM/RPM limits
  • Load Balancing - Distribute requests across multiple deployments
  • Caching - Reduce costs with intelligent response caching
  • Admin Dashboard - Web UI for management and monitoring

Quick Installation

Step 1: Install LiteLLM with Proxy

Install LiteLLM with proxy dependencies:
pip install 'litellm[proxy]'
Step 2: Start the Proxy

Start the proxy with a single model (OpenAI GPT-4o):
# Set your OpenAI API key
export OPENAI_API_KEY="your-openai-key"

# Start the proxy
litellm --model gpt-4o
The proxy will start on http://0.0.0.0:4000
Step 3: Test Your Gateway

Make your first request using the OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="anything",  # Can be anything when no auth is configured
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}]
)

print(response.choices[0].message.content)
Production Deployment: For production, use the configuration file approach below with authentication enabled.

Configuration File Setup

For production deployments, use a configuration file to define your models and settings.
Step 1: Create Config File

Create a config.yaml file:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  
  - model_name: claude-3
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
  
  - model_name: azure-gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
      api_version: "2025-02-01-preview"

# General settings
general_settings:
  master_key: "sk-1234"  # Change this!
  database_url: "postgresql://user:password@host:5432/dbname"  # Optional
Step 2: Set Environment Variables

Set your API keys:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-endpoint.openai.azure.com/"
Step 3: Start with Config

Start the proxy with your config file:
litellm --config config.yaml --port 4000

Create Virtual Keys

Virtual keys provide secure access control with per-key budgets and rate limits.
curl -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-4", "claude-3"],
    "max_budget": 10.0,
    "budget_duration": "30d",
    "tpm_limit": 100000,
    "rpm_limit": 100,
    "metadata": {"user": "[email protected]"}
  }'
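The same call can be scripted from Python. This sketch uses only the standard library and builds the request against the endpoint shown above without sending it (uncomment the `urlopen` line to actually issue it):

```python
import json
import urllib.request

def build_key_generate_request(base_url: str, master_key: str, **key_params) -> urllib.request.Request:
    """Build (but do not send) a POST /key/generate request for the proxy."""
    return urllib.request.Request(
        url=f"{base_url}/key/generate",
        data=json.dumps(key_params).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_key_generate_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    models=["gpt-4", "claude-3"],
    max_budget=10.0,
    budget_duration="30d",
)
# urllib.request.urlopen(req) would return the generated virtual key as JSON.
```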

Use Virtual Keys

Use the generated virtual key to make requests:
import openai

client = openai.OpenAI(
    api_key="sk-your-generated-virtual-key",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

Load Balancing

Distribute requests across multiple deployments of the same model:
config.yaml
model_list:
  # Multiple deployments of the same model
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_1
  
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_2
  
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4o
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

router_settings:
  routing_strategy: "least-busy"  # or "simple-shuffle", "latency-based-routing"
  num_retries: 2
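The routing strategies are implemented inside LiteLLM's router; as a rough mental model (a toy sketch, not the actual implementation), "least-busy" picks the deployment with the fewest in-flight requests, while "simple-shuffle" picks one at random:

```python
import random

def pick_least_busy(in_flight: dict) -> str:
    """Toy 'least-busy' selection: choose the deployment with the
    fewest in-flight requests (ties broken arbitrarily)."""
    return min(in_flight, key=in_flight.get)

def pick_shuffle(deployments: list) -> str:
    """Toy 'simple-shuffle' selection: uniform random choice."""
    return random.choice(deployments)

# Three deployments all answering to the alias "gpt-4":
load = {"openai-key-1": 3, "openai-key-2": 0, "azure": 5}
print(pick_least_busy(load))  # openai-key-2
```

Either way, clients always request the shared alias ("gpt-4"); the router decides which underlying deployment serves it.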

Fallbacks

Automatically fall back to alternative models on failure:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  
  - model_name: claude-fallback
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY

router_settings:
  fallbacks:
    - gpt-4: ["claude-fallback"]
  num_retries: 2
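Conceptually (again a sketch, not LiteLLM's internal code), the behavior configured above amounts to: try the primary model, retry it `num_retries` times, then walk the fallback list:

```python
def complete_with_fallbacks(call, primary, fallbacks, num_retries=2):
    """Try `primary` up to num_retries + 1 times, then each fallback once.
    `call(model)` is any function that performs the request."""
    for model in [primary] * (num_retries + 1) + list(fallbacks):
        try:
            return call(model)
        except Exception as err:
            last_err = err
    raise last_err

# Simulate a primary that always fails and a fallback that succeeds:
def fake_call(model):
    if model == "gpt-4":
        raise RuntimeError("provider outage")
    return f"answer from {model}"

print(complete_with_fallbacks(fake_call, "gpt-4", ["claude-fallback"]))
# answer from claude-fallback
```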

Caching

Enable caching to reduce costs and improve response times:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

general_settings:
  cache: true
  cache_params:
    type: "redis"
    host: "localhost"
    port: 6379
    ttl: 600  # Cache for 10 minutes
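A response cache keys on the request content: roughly speaking (this is a simplified sketch, not LiteLLM's exact key scheme), identical model + messages hash to the same Redis key, so a repeated request within the TTL is served from cache instead of hitting the provider:

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Derive a deterministic cache key from the request content."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "litellm:" + hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("gpt-4", [{"role": "user", "content": "Hello!"}])
k2 = cache_key("gpt-4", [{"role": "user", "content": "Hello!"}])
k3 = cache_key("gpt-4", [{"role": "user", "content": "Hi!"}])
assert k1 == k2       # identical request -> cache hit
assert k1 != k3       # different content -> cache miss
```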

Admin Dashboard

Access the web-based admin dashboard to:
  • Create and manage virtual keys
  • Monitor usage and costs
  • View request logs
  • Configure models and settings
Step 1: Access Dashboard

Open your browser and navigate to:
http://0.0.0.0:4000/ui
Step 2: Login

Login with your master key:
  • Master Key: sk-1234 (or the value from your config)
Step 3: Explore Features

  • Keys: Create and manage virtual keys
  • Models: View and configure available models
  • Usage: Monitor costs and request metrics
  • Logs: View detailed request logs

Docker Deployment

Deploy using Docker for production:
# docker-compose.yml
version: '3.8'

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    ports:
      - "4000:4000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - DATABASE_URL=${DATABASE_URL}
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml", "--port", "4000"]
  
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=litellm
      - POSTGRES_USER=litellm
      - POSTGRES_PASSWORD=your-secure-password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Observability

Integrate with observability platforms:
config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["langfuse", "prometheus", "datadog"]
  
  # Langfuse configuration
  langfuse_public_key: os.environ/LANGFUSE_PUBLIC_KEY
  langfuse_secret_key: os.environ/LANGFUSE_SECRET_KEY
  langfuse_host: "https://cloud.langfuse.com"
  
  # Prometheus configuration
  prometheus: true
  
  # Datadog configuration
  datadog_api_key: os.environ/DATADOG_API_KEY
  datadog_site: "datadoghq.com"

API Endpoints

The proxy exposes OpenAI-compatible endpoints:

  • Chat Completions: POST /chat/completions
  • Completions: POST /completions
  • Embeddings: POST /embeddings
  • Images: POST /images/generations
  • Audio: POST /audio/transcriptions
  • Models: GET /models

Environment Variables

Common environment variables:
# API Keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export AZURE_API_KEY="your-key"

# Database (Optional)
export DATABASE_URL="postgresql://user:password@host:5432/dbname"

# Redis (Optional)
export REDIS_HOST="localhost"
export REDIS_PORT="6379"

# Proxy Settings
export LITELLM_MASTER_KEY="sk-1234"
export LITELLM_PORT="4000"

What’s Next?

  • Authentication: Set up SSO, LDAP, or custom authentication
  • Guardrails: Add content moderation and safety guardrails
  • Teams & Projects: Organize users into teams with separate budgets
  • Enterprise Features: Explore enterprise features like SSO and SLAs

Security Best Practices
  • Always change the default master_key
  • Use environment variables for sensitive data
  • Enable HTTPS in production
  • Use a PostgreSQL database for persistence
  • Regularly rotate API keys
Need Help? Join our Discord community or check out the full documentation.