## Overview

The LiteLLM proxy uses a YAML configuration file to define models, routing, authentication, and other settings.
## File Location

Default: `config.yaml`

Start the proxy with a config file:

```bash
litellm --config config.yaml
```
## Complete Schema

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_base: https://your-endpoint.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
      api_version: "2024-02-01"
    model_info:
      mode: chat
      supports_function_calling: true
      supports_vision: true

litellm_settings:
  success_callback: ["langfuse", "lunary"]
  failure_callback: ["sentry"]
  set_verbose: true
  drop_params: true
  max_parallel_requests: 100
  request_timeout: 600
  num_retries: 3
  fallbacks:
    - gpt-4: ["gpt-3.5-turbo", "claude-2"]
  context_window_fallbacks:
    - gpt-3.5-turbo: ["gpt-3.5-turbo-16k"]

general_settings:
  master_key: sk-1234
  database_url: postgresql://...
  store_model_in_db: true
  allowed_routes: ["chat/completions", "embeddings"]
  key_management_settings:
    default_key_duration: 30d
    max_key_duration: 365d

router_settings:
  routing_strategy: latency-based-routing
  routing_strategy_args:
    ttl: 60
  model_group_alias:
    gpt-4: production-gpt-4
  redis_host: localhost
  redis_port: 6379
  redis_password: os.environ/REDIS_PASSWORD
  num_retries: 3
  timeout: 30
  allowed_fails: 3
  cooldown_time: 60
```
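As a quick sanity check, a parsed config can be inspected programmatically. The sketch below assumes the YAML has already been parsed into a Python dict (e.g. with PyYAML's `yaml.safe_load`); `check_config` is a hypothetical helper for illustration, not part of LiteLLM, which performs its own validation on startup.

```python
# Minimal structural check on a parsed proxy config. The dict below is
# the parsed form of a small config; in practice you would obtain it
# via yaml.safe_load(open("config.yaml")).
config = {
    "model_list": [
        {
            "model_name": "gpt-4",
            "litellm_params": {
                "model": "azure/gpt-4",
                "api_key": "os.environ/AZURE_API_KEY",
            },
        }
    ],
    "router_settings": {"routing_strategy": "latency-based-routing"},
}

def check_config(config):
    """Every deployment needs a model_name and a litellm_params.model."""
    for deployment in config.get("model_list", []):
        assert "model_name" in deployment, "missing model_name"
        assert "model" in deployment.get("litellm_params", {}), "missing model"
    return True

print(check_config(config))  # True
```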
## Configuration Sections

### model_list

Defines the model deployments served by the proxy.
- `model_name`: User-facing model name. Multiple deployments can share the same `model_name` for load balancing.
- `litellm_params`: Parameters passed to `litellm.completion()`.
  - `model`: Provider-specific model identifier. Examples: `azure/gpt-4`, `bedrock/anthropic.claude-v2`, `vertex_ai/gemini-pro`
  - `api_key`: API key. Use `os.environ/VAR_NAME` to load from the environment.
  - `api_version`: API version (provider-specific).
  - `timeout`: Request timeout in seconds.
  - `tpm`: Tokens-per-minute limit for this deployment.
  - `rpm`: Requests-per-minute limit for this deployment.
- `model_info`: Metadata about the model.
  - `mode`: Model mode: `"chat"`, `"completion"`, `"embedding"`, or `"image_generation"`.
  - `input_cost_per_token`: Cost per input token in USD.
  - `output_cost_per_token`: Cost per output token in USD.
  - `max_tokens`: Maximum tokens supported.
  - `supports_function_calling`: Whether the model supports function calling.
  - `supports_vision`: Whether the model supports vision/image inputs.
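The per-token cost fields translate into a per-request cost by simple multiplication. A sketch, using made-up illustration prices rather than any real provider's rates:

```python
# How per-token pricing from model_info turns into a request cost.
# These prices are hypothetical illustration values, not real rates.
input_cost_per_token = 0.00003   # USD per prompt token (made up)
output_cost_per_token = 0.00006  # USD per completion token (made up)

def request_cost(prompt_tokens, completion_tokens):
    """Total USD cost of one request under the prices above."""
    return (prompt_tokens * input_cost_per_token
            + completion_tokens * output_cost_per_token)

print(round(request_cost(1000, 500), 4))  # 0.06
```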
#### Example

```yaml
model_list:
  # OpenAI GPT-4
  - model_name: gpt-4
    litellm_params:
      model: gpt-4
      api_key: os.environ/OPENAI_API_KEY
      tpm: 100000
      rpm: 1000

  # Azure GPT-4
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_base: https://your-endpoint.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
      api_version: "2024-02-01"
      tpm: 200000
      rpm: 2000

  # Claude 2
  - model_name: claude-2
    litellm_params:
      model: claude-2
      api_key: os.environ/ANTHROPIC_API_KEY
      tpm: 100000
      rpm: 1000

  # Bedrock Claude
  - model_name: claude-bedrock
    litellm_params:
      model: bedrock/anthropic.claude-v2
      aws_region_name: us-east-1

  # Embedding model
  - model_name: text-embedding-3-small
    litellm_params:
      model: text-embedding-3-small
      api_key: os.environ/OPENAI_API_KEY
```
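In the example above, the OpenAI and Azure deployments both use `model_name: gpt-4`, so the router spreads traffic across them. The sketch below mimics the spirit of rpm-weighted random selection; it is an illustration of the idea, not LiteLLM's actual router code.

```python
# Two deployments sharing one model_name, picked at random with
# probability proportional to their rpm limits. Illustrative only.
import random

deployments = [
    {"model": "gpt-4", "rpm": 1000},        # OpenAI deployment
    {"model": "azure/gpt-4", "rpm": 2000},  # Azure deployment
]

def pick_deployment(deployments):
    """Pick a deployment at random, weighted by its rpm limit."""
    weights = [d["rpm"] for d in deployments]
    return random.choices(deployments, weights=weights, k=1)[0]

chosen = pick_deployment(deployments)
print(chosen["model"])
```

Over many requests, the Azure deployment (rpm 2000) would be chosen about twice as often as the OpenAI one (rpm 1000).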
### litellm_settings

Global LiteLLM configuration.

- `success_callback`: Callbacks to run on successful requests. Supported: `langfuse`, `lunary`, `helicone`, `supabase`, `datadog`, `prometheus`, custom.
- `failure_callback`: Callbacks to run on failed requests. Supported: `sentry`, `slack`, `webhook`, custom.
- `drop_params`: Drop unsupported parameters instead of erroring.
- `max_parallel_requests`: Maximum number of parallel requests.
- `request_timeout`: Default request timeout in seconds.
- `num_retries`: Number of retries on failure.
- `fallbacks`: Fallback model configurations.

  ```yaml
  fallbacks:
    - gpt-4: ["gpt-3.5-turbo", "claude-2"]
    - claude-2: ["gpt-3.5-turbo"]
  ```

- `context_window_fallbacks`: Fallbacks for context-window-exceeded errors.

  ```yaml
  context_window_fallbacks:
    - gpt-3.5-turbo: ["gpt-3.5-turbo-16k"]
    - gpt-4: ["gpt-4-32k"]
  ```

- `cache_params`: Caching configuration.

  ```yaml
  cache_params:
    type: redis
    host: localhost
    port: 6379
    ttl: 3600
  ```
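The TTL-based caching configured above can be modelled with an in-memory dict, shown below in place of Redis. This is a sketch of the concept only; LiteLLM's real cache keys and storage are more involved.

```python
# A Redis-style response cache modelled in memory: entries expire
# ttl seconds after being set. Illustrative only.
import time

class TTLCache:
    def __init__(self, ttl=3600):
        self.ttl = ttl
        self.store = {}  # key -> (expires_at, value)

    def set(self, key, value, now=None):
        now = now if now is not None else time.time()
        self.store[key] = (now + self.ttl, value)

    def get(self, key, now=None):
        """Return the cached value, or None if missing or expired."""
        now = now if now is not None else time.time()
        entry = self.store.get(key)
        if entry and now < entry[0]:
            return entry[1]
        return None

cache = TTLCache(ttl=3600)
cache.set("prompt-hash", "cached response", now=0)
print(cache.get("prompt-hash", now=10))    # cached response
print(cache.get("prompt-hash", now=4000))  # None (expired)
```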
#### Example

```yaml
litellm_settings:
  success_callback: ["langfuse", "prometheus"]
  failure_callback: ["sentry", "slack"]
  set_verbose: true
  drop_params: true
  request_timeout: 300
  num_retries: 3
  fallbacks:
    - gpt-4: ["gpt-3.5-turbo"]
  cache: true
  cache_params:
    type: redis
    host: localhost
    port: 6379
    ttl: 3600
```
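The fallback behaviour configured above works like a chain: if a call to the primary model fails, the listed fallbacks are tried in order. A sketch of the concept (not LiteLLM's internal code):

```python
# Try the requested model first, then each configured fallback in
# order, returning the first successful result. Illustrative only.
FALLBACKS = {"gpt-4": ["gpt-3.5-turbo", "claude-2"]}

def call_with_fallbacks(model, call_fn, fallbacks=FALLBACKS):
    """Try `model`, then each configured fallback, until one succeeds."""
    last_error = None
    for candidate in [model] + fallbacks.get(model, []):
        try:
            return candidate, call_fn(candidate)
        except Exception as exc:
            last_error = exc
    raise last_error

# Simulate the primary deployment being down.
def flaky_call(model):
    if model == "gpt-4":
        raise RuntimeError("deployment unavailable")
    return f"response from {model}"

used, result = call_with_fallbacks("gpt-4", flaky_call)
print(used)  # gpt-3.5-turbo
```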
### general_settings

Proxy server configuration.

- `master_key`: Admin master key for proxy management.

  ```yaml
  master_key: sk-1234
  # or from environment
  master_key: os.environ/LITELLM_MASTER_KEY
  ```

- `database_url`: PostgreSQL connection string for key/user/team storage.

  ```yaml
  database_url: postgresql://user:pass@localhost:5432/litellm
  ```

- `store_model_in_db`: Store model configurations in the database.
- `allowed_routes`: Restrict which endpoints are enabled.

  ```yaml
  allowed_routes:
    - "chat/completions"
    - "embeddings"
    - "key/generate"
  ```

- `ui_access_mode`: UI access control. Options: `"admin"`, `"all"`.
- `budget_duration`: Budget reset period: `"1d"`, `"30d"`, etc.
#### Example

```yaml
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
  store_model_in_db: true
  allowed_routes:
    - "chat/completions"
    - "embeddings"
    - "key/generate"
    - "key/list"
    - "team/new"
  ui_access_mode: admin
  max_budget: 10000.0
  budget_duration: 30d
```
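Duration strings such as `budget_duration: 30d` can be read as a number plus a unit suffix. The suffix grammar below is assumed from the `"1d"`, `"30d"` examples in this section; check the LiteLLM docs for the full set of supported units.

```python
# Interpret "30d" / "12h" style duration strings as a timedelta.
# Hypothetical parser for illustration, not LiteLLM's own.
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_duration(value: str) -> timedelta:
    """Convert a '<number><unit>' string into a timedelta."""
    number, unit = int(value[:-1]), value[-1]
    return timedelta(**{_UNITS[unit]: number})

print(parse_duration("30d").days)  # 30
```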
### router_settings

Router configuration for load balancing.

- `routing_strategy` (string, default `"simple-shuffle"`): Load balancing strategy. Options:
  - `simple-shuffle`: Random selection
  - `least-busy`: Fewest ongoing requests
  - `usage-based-routing`: Based on TPM/RPM usage
  - `latency-based-routing`: Lowest latency
  - `cost-based-routing`: Lowest cost
- `routing_strategy_args`: Strategy-specific arguments.

  ```yaml
  routing_strategy_args:
    ttl: 60  # For latency-based routing
  ```

- `model_group_alias`: Model aliases.

  ```yaml
  model_group_alias:
    gpt-4: production-gpt-4
    claude: production-claude
  ```

- `allowed_fails`: Failures allowed before a deployment is placed in cooldown.
- `cooldown_time`: Cooldown duration in seconds.
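Alias resolution amounts to a dictionary lookup. The sketch below assumes `model_group_alias` maps a requested model name to the target model group, matching the `gpt-4: production-gpt-4` example above; it is an illustration, not LiteLLM's implementation.

```python
# Map an incoming model name through model_group_alias before routing;
# unaliased names pass through unchanged. Illustrative only.
MODEL_GROUP_ALIAS = {
    "gpt-4": "production-gpt-4",
    "claude": "production-claude",
}

def resolve_alias(model):
    """Return the aliased model group, or the name itself if unaliased."""
    return MODEL_GROUP_ALIAS.get(model, model)

print(resolve_alias("gpt-4"))          # production-gpt-4
print(resolve_alias("gpt-3.5-turbo"))  # gpt-3.5-turbo
```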
#### Example

```yaml
router_settings:
  routing_strategy: latency-based-routing
  routing_strategy_args:
    ttl: 60
  model_group_alias:
    gpt-4: prod-gpt-4
  redis_host: localhost
  redis_port: 6379
  redis_password: os.environ/REDIS_PASSWORD
  num_retries: 3
  timeout: 30
  allowed_fails: 5
  cooldown_time: 120
```
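The `allowed_fails` / `cooldown_time` pair means: after a deployment fails `allowed_fails` times, skip it for `cooldown_time` seconds. A sketch of that bookkeeping (illustrative only, not the router's real code):

```python
# Track consecutive failures per deployment and put a deployment in
# cooldown once it exceeds allowed_fails. Illustrative only.
import time

class CooldownTracker:
    def __init__(self, allowed_fails=5, cooldown_time=120):
        self.allowed_fails = allowed_fails
        self.cooldown_time = cooldown_time
        self.fail_counts = {}     # deployment -> consecutive failures
        self.cooldown_until = {}  # deployment -> timestamp

    def record_failure(self, deployment, now=None):
        now = now if now is not None else time.time()
        self.fail_counts[deployment] = self.fail_counts.get(deployment, 0) + 1
        if self.fail_counts[deployment] >= self.allowed_fails:
            self.cooldown_until[deployment] = now + self.cooldown_time
            self.fail_counts[deployment] = 0  # reset after entering cooldown

    def is_available(self, deployment, now=None):
        now = now if now is not None else time.time()
        return now >= self.cooldown_until.get(deployment, 0)

tracker = CooldownTracker(allowed_fails=2, cooldown_time=120)
tracker.record_failure("azure/gpt-4", now=0)
tracker.record_failure("azure/gpt-4", now=1)   # second failure -> cooldown
print(tracker.is_available("azure/gpt-4", now=10))   # False
print(tracker.is_available("azure/gpt-4", now=200))  # True
```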
## Complete Example

```yaml
model_list:
  # GPT-4 with load balancing
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_base: https://endpoint1.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY_1
      api_version: "2024-02-01"
      tpm: 100000
      rpm: 1000
  - model_name: gpt-4
    litellm_params:
      model: gpt-4
      api_key: os.environ/OPENAI_API_KEY
      tpm: 90000
      rpm: 900

  # GPT-3.5-Turbo
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY
      tpm: 1000000
      rpm: 10000

  # Claude
  - model_name: claude-2
    litellm_params:
      model: claude-2
      api_key: os.environ/ANTHROPIC_API_KEY
      tpm: 100000
      rpm: 1000

  # Embeddings
  - model_name: text-embedding-3-small
    litellm_params:
      model: text-embedding-3-small
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["langfuse", "prometheus"]
  failure_callback: ["sentry"]
  set_verbose: false
  drop_params: true
  request_timeout: 300
  num_retries: 3
  fallbacks:
    - gpt-4: ["gpt-3.5-turbo", "claude-2"]
  context_window_fallbacks:
    - gpt-3.5-turbo: ["gpt-3.5-turbo-16k"]
  cache: true
  cache_params:
    type: redis
    host: localhost
    port: 6379
    ttl: 3600

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
  store_model_in_db: true
  allowed_routes:
    - "chat/completions"
    - "embeddings"
    - "key/generate"
    - "key/list"
    - "team/new"
    - "user/new"
  ui_access_mode: admin

router_settings:
  routing_strategy: latency-based-routing
  routing_strategy_args:
    ttl: 60
  redis_host: localhost
  redis_port: 6379
  redis_password: os.environ/REDIS_PASSWORD
  num_retries: 3
  timeout: 30
  allowed_fails: 3
  cooldown_time: 60
```
## Environment Variables

Load values from the environment with the `os.environ/VAR_NAME` syntax:

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key: os.environ/AZURE_API_KEY  # Loads from $AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
```
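The convention is mechanical: any string value beginning with `os.environ/` is replaced by the named environment variable. A sketch of that substitution with a hypothetical helper (not LiteLLM's actual loader):

```python
# Resolve 'os.environ/NAME' strings against the environment; other
# values pass through unchanged. Hypothetical helper for illustration.
import os

def resolve_env(value):
    """Substitute 'os.environ/NAME' with the named environment variable."""
    prefix = "os.environ/"
    if isinstance(value, str) and value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    return value

os.environ["AZURE_API_KEY"] = "sk-example"      # demo value only
print(resolve_env("os.environ/AZURE_API_KEY"))  # sk-example
print(resolve_env("literal-key"))               # literal-key
```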
## Validation

Validate your config:

```bash
litellm --config config.yaml --test
```