Overview

LiteLLM Proxy provides comprehensive budget tracking and alerting capabilities:
  • Budget Enforcement: Hard limits to prevent overspending
  • Soft Budgets: Warnings before hitting limits
  • Webhook Alerts: Real-time notifications
  • Slack Integration: Team notifications
  • Spend Tracking: Monitor usage across keys, users, and teams

Budget Types

Global Budget

Set a budget for the entire proxy:
config.yaml
litellm_settings:
  max_budget: 1000.0       # Maximum spend in USD
  budget_duration: 30d     # Budget reset period
When the global budget is exceeded, all requests are rejected:
{
  "error": {
    "message": "Proxy budget exceeded. Max budget: $1000.00",
    "type": "budget_exceeded"
  }
}
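
Clients usually want to tell a budget rejection apart from other failures so they can stop retrying. The following is a minimal sketch, assuming the error body has exactly the shape shown above; the helper name `is_budget_exceeded` is hypothetical and not part of LiteLLM.

```python
import json


def is_budget_exceeded(body: str) -> bool:
    """Return True if a response body looks like the budget error shown above."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON, so not the documented error shape
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    # Assumption: the "type" field is the stable marker for this error
    return isinstance(error, dict) and error.get("type") == "budget_exceeded"
```

A caller could check this before scheduling a retry, since retrying will not succeed until the budget resets or is raised.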

Key Budget

Set budgets per virtual key:
curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0,
    "budget_duration": "30d"
  }'

Team Budget

Set budgets for teams:
curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'

User Budget

Set budgets per user:
curl -X POST 'http://localhost:4000/user/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "[email protected]",
    "max_budget": 50.0,
    "budget_duration": "30d"
  }'
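
The key, team, and user calls above differ only in the endpoint and the identifying fields. As a rough sketch, the same requests can be built in Python with the standard library. `PROXY_URL`, `MASTER_KEY`, and `budget_request` are illustrative names, and the values are the placeholders used in the curl examples.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:4000"  # assumption: local proxy, as in the curl examples
MASTER_KEY = "sk-1234"               # assumption: placeholder admin key from the examples


def budget_request(path: str, fields: dict, max_budget: float,
                   duration: str = "30d") -> urllib.request.Request:
    """Build (but do not send) a POST request that sets a budget."""
    body = dict(fields, max_budget=max_budget, budget_duration=duration)
    return urllib.request.Request(
        url=f"{PROXY_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {MASTER_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


key_req = budget_request("/key/generate", {"models": ["gpt-3.5-turbo"]}, 100.0)
team_req = budget_request("/team/new", {"team_alias": "engineering"}, 1000.0)
# To actually send one of them: urllib.request.urlopen(key_req)
```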

Soft Budgets (Alerts)

Soft budgets trigger alerts without blocking requests. In this example, an alert fires once spend reaches $80, while requests continue until the $100 hard limit:
curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0,
    "soft_budget": 80.0,
    "budget_duration": "30d"
  }'

The typical workflow:
  1. Set Soft Budget: Configure the soft budget threshold (for example, 80% of the max budget).
  2. Monitor Usage: The proxy tracks spending in real time.
  3. Receive Alert: A webhook or Slack notification is sent when the soft budget is exceeded.
  4. Take Action: Increase the budget or optimize usage before hitting the hard limit.
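
The relationship between the soft and hard limits can be sketched as a small classification function. This only illustrates the behavior described above and is not the proxy's implementation. The function name and the choice to treat the boundary values as inclusive are assumptions.

```python
def budget_status(spend: float, soft_budget: float, max_budget: float) -> str:
    """Classify spend against a soft (alert-only) and hard (blocking) budget."""
    if spend >= max_budget:
        return "blocked"  # hard limit: requests are rejected
    if spend >= soft_budget:
        return "alert"    # soft limit: alert is sent, requests still pass
    return "ok"
```

With a soft budget of $80 and a max budget of $100, a spend of $82.45 falls in the "alert" band: a notification goes out but traffic is not interrupted.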

Slack Alerts

Configure Slack notifications for budget alerts:

Setup

1. Create Slack Webhook

  1. Go to https://api.slack.com/apps
  2. Create a new app
  3. Enable Incoming Webhooks
  4. Create a webhook URL

2. Configure Proxy

Add Slack settings to your config:
config.yaml
litellm_settings:
  alerting: ["slack"]
  alerting_threshold: 0.8  # Alert at 80% budget
  
  # Slack webhook URL
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
  
  # Optional: webhook URL that alerts are routed to
  slack_alert_to_webhook_url: os.environ/SLACK_WEBHOOK_URL

3. Set Environment Variable

export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."

4. Restart Proxy

litellm --config config.yaml

Alert Example

When a budget threshold is reached, Slack receives:
🚨 Budget Alert

Key: sk-1234...abcd
Spend: $82.45 / $100.00 (82%)
Budget Reset: 2024-04-15 10:30:00 UTC

Consider increasing budget or optimizing usage.

Webhook Alerts

Send budget alerts to custom webhooks:
config.yaml
litellm_settings:
  alerting: ["webhook"]
  alerting_threshold: 0.8
  
  # Custom webhook URL
  webhook_url: https://example.com/alerts
Webhook payload:
{
  "event": "budget_alert",
  "timestamp": "2024-04-15T10:30:00Z",
  "key": "sk-1234...abcd",
  "key_alias": "production-key",
  "spend": 82.45,
  "max_budget": 100.0,
  "budget_threshold": 80.0,
  "percentage": 0.8245,
  "budget_reset_at": "2024-05-15T10:30:00Z",
  "metadata": {
    "environment": "production",
    "team": "backend"
  }
}
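
A receiving service typically turns this payload into a short human-readable line. The sketch below assumes the payload fields are exactly those in the example above; `summarize_alert` is a hypothetical helper on the receiving side, not something the proxy provides.

```python
import json


def summarize_alert(raw: str) -> str:
    """Turn a budget_alert webhook body into a one-line summary."""
    alert = json.loads(raw)
    # Prefer the readable alias; fall back to the masked key
    name = alert.get("key_alias") or alert["key"]
    percent = alert["percentage"] * 100  # payload carries a 0-1 fraction
    return (
        f"{name}: ${alert['spend']:.2f} of ${alert['max_budget']:.2f} "
        f"({percent:.0f}%), resets {alert['budget_reset_at']}"
    )


example = json.dumps({
    "event": "budget_alert",
    "key": "sk-1234...abcd",
    "key_alias": "production-key",
    "spend": 82.45,
    "max_budget": 100.0,
    "percentage": 0.8245,
    "budget_reset_at": "2024-05-15T10:30:00Z",
})
print(summarize_alert(example))
```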

Email Alerts

Configure email notifications:
config.yaml
litellm_settings:
  alerting: ["email"]
  alerting_threshold: 0.8
  
  # Email settings
  email_alerts_to: ["[email protected]", "[email protected]"]
  
  # SMTP configuration
  smtp_host: os.environ/SMTP_HOST
  smtp_port: 587
  smtp_user: os.environ/SMTP_USER
  smtp_password: os.environ/SMTP_PASSWORD
  smtp_sender: [email protected]
Required environment variables:
export SMTP_HOST="smtp.gmail.com"
export SMTP_USER="[email protected]"
export SMTP_PASSWORD="your-app-password"

Budget Reset Schedule

Budgets reset automatically at the end of each configured budget_duration. For example, this budget resets every day:
max_budget: 100.0
budget_duration: 1d
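
To reason about when a budget will next reset, a duration string can be converted into a time delta. This is a sketch only: this page shows day-based values such as 1d, 7d, and 30d, and the `s`, `m`, and `h` suffixes handled here are an assumption, as are the function names.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: only "d" appears on this page; the other suffixes are illustrative
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: str) -> timedelta:
    """Convert a string such as '30d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def next_reset(period_start: datetime, duration: str) -> datetime:
    """Return the moment the budget period that began at period_start ends."""
    return period_start + parse_duration(duration)


start = datetime(2024, 4, 14, 10, 30, tzinfo=timezone.utc)
print(next_reset(start, "1d").isoformat())
```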

Manual Budget Reset

Reset a key's budget manually by setting its spend back to 0:
curl -X POST 'http://localhost:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-1234567890abcdef",
    "spend": 0.0
  }'

Spend Tracking

View Key Spend

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'
Response:
{
  "key": "sk-1234...def",
  "spend": 45.23,
  "max_budget": 100.0,
  "budget_reset_at": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo"]
}
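
The remaining budget is not part of this response, but it follows directly from `spend` and `max_budget`. A minimal sketch, assuming the response shape above and assuming that a missing or null `max_budget` means the key has no limit:

```python
from typing import Optional


def remaining_budget(key_info: dict) -> Optional[float]:
    """Return max_budget minus spend, or None when no budget is set."""
    max_budget = key_info.get("max_budget")
    if max_budget is None:
        return None  # assumption: no max_budget means the key is unlimited
    # Round to cents to avoid floating-point noise in the difference
    return round(max_budget - key_info.get("spend", 0.0), 2)


info = {"key": "sk-1234...def", "spend": 45.23, "max_budget": 100.0}
print(remaining_budget(info))
```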

View All Keys Spend

curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234'

View Team Spend

curl -X GET 'http://localhost:4000/spend/teams' \
  -H 'Authorization: Bearer sk-1234'

View User Spend

curl -X GET 'http://localhost:4000/spend/users' \
  -H 'Authorization: Bearer sk-1234'

Projected Spend Alerts

Alert based on projected monthly spend:
config.yaml
general_settings:
  projected_spend_alerts: true
  projected_spend_threshold: 0.9  # Alert at 90% of projected

litellm_settings:
  alerting: ["slack"]
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
If the current usage trend suggests that spend will exceed the budget, alerts are sent before the limit is reached.
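
This page does not specify how the projection is computed. One plausible reading is a simple linear extrapolation of spend over the budget period, sketched below. The function names, the linear model, and the interpretation of the threshold as a fraction of the max budget are all assumptions made for illustration.

```python
def projected_spend(spend_so_far: float, elapsed_days: float, period_days: float) -> float:
    """Linearly extrapolate spend to the end of the budget period."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    daily_rate = spend_so_far / elapsed_days
    return daily_rate * period_days


def should_alert(spend_so_far: float, elapsed_days: float, period_days: float,
                 max_budget: float, threshold: float = 0.9) -> bool:
    """Alert when the projection reaches threshold * max_budget (assumed semantics)."""
    projection = projected_spend(spend_so_far, elapsed_days, period_days)
    return projection >= threshold * max_budget
```

For example, $45 spent in the first 10 days of a 30-day period projects to $135, which would trigger an alert against a $100 budget even though actual spend is still under half the limit.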

Advanced Alerting

Multi-Channel Alerts

Send to multiple channels:
config.yaml
litellm_settings:
  alerting: ["slack", "webhook", "email"]
  alerting_threshold: 0.8
  
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
  webhook_url: https://example.com/alerts
  email_alerts_to: ["[email protected]"]

Team-Specific Alerts

Configure different alert settings per team:
config.yaml
litellm_settings:
  default_team_settings:
    - team_id: team-engineering
      alerting: ["slack"]
      alerting_threshold: 0.8
      slack_webhook_url: os.environ/ENGINEERING_SLACK_WEBHOOK
    
    - team_id: team-marketing
      alerting: ["email"]
      alerting_threshold: 0.9
      email_alerts_to: ["[email protected]"]

Custom Alert Thresholds

config.yaml
litellm_settings:
  # Alert at multiple thresholds
  alerting_thresholds: [0.5, 0.75, 0.9, 1.0]
  
  alerting: ["slack"]
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
Alerts are sent at 50%, 75%, 90%, and 100% of budget.
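
To illustrate how multiple thresholds behave, the sketch below reports which thresholds were newly crossed between two spend readings, so that each level is reported once rather than on every request. It is an illustration under assumed semantics, not the proxy's internal logic, and `crossed_thresholds` is a hypothetical name.

```python
def crossed_thresholds(previous_spend: float, current_spend: float, max_budget: float,
                       thresholds=(0.5, 0.75, 0.9, 1.0)) -> list:
    """Return thresholds passed between the previous and current spend readings."""
    return [
        t for t in thresholds
        # Crossed means: below the level before, at or above it now
        if previous_spend < t * max_budget <= current_spend
    ]


print(crossed_thresholds(40.0, 80.0, 100.0))
```

A jump from $40 to $80 against a $100 budget crosses the 50% and 75% levels in one step, so both would be reported together.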

Monitoring Dashboard

Admin UI

Access the admin dashboard at http://localhost:4000/ui:
  • Overview: Total spend across all keys
  • Keys: Individual key spend and budgets
  • Teams: Team-level spend tracking
  • Users: Per-user spend analysis
  • Charts: Spend trends over time

Prometheus Metrics

Export budget metrics to Prometheus:
config.yaml
litellm_settings:
  success_callback: ["prometheus"]
Metrics available:
  • litellm_spend_total - Total spend
  • litellm_key_spend - Per-key spend
  • litellm_team_spend - Per-team spend
  • litellm_budget_remaining - Remaining budget
Query in Prometheus:
# Keys approaching budget limit
litellm_key_spend / litellm_key_budget > 0.8

# Total proxy spend
sum(litellm_spend_total)

# Spend by model
sum by (model) (litellm_spend_total)

Best Practices

1. Set Conservative Budgets

Start with lower budgets and increase as needed. For example, combine a small hard limit, an early-warning soft budget, and a short reset period:
{
  "max_budget": 10.0,
  "soft_budget": 8.0,
  "budget_duration": "7d"
}

2. Use Soft Budgets

Always set a soft budget for early warnings, for example at 80% of the hard limit:
{
  "max_budget": 100.0,
  "soft_budget": 80.0
}

3. Monitor Regularly

Check spend daily or weekly:
# Daily spend check: list keys above 80% of their budget (keys without a budget are skipped)
curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234' \
  | jq '.[] | select(.max_budget != null and .spend > .max_budget * 0.8)'
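
The same filter can be sketched in Python for use in a scheduled job. It assumes, as the jq expression does, that `/spend/keys` returns a JSON array of objects with `spend` and `max_budget` fields. The function name and the sample data are illustrative.

```python
def keys_over_threshold(keys: list, threshold: float = 0.8) -> list:
    """Return the key records whose spend exceeds threshold * max_budget."""
    return [
        k for k in keys
        # Skip keys with no budget (missing, null, or zero max_budget)
        if k.get("max_budget") and k.get("spend", 0.0) > k["max_budget"] * threshold
    ]


sample = [
    {"key_alias": "prod", "spend": 82.45, "max_budget": 100.0},
    {"key_alias": "dev", "spend": 12.0, "max_budget": 100.0},
    {"key_alias": "unlimited", "spend": 500.0, "max_budget": None},
]
for record in keys_over_threshold(sample):
    print(record["key_alias"], record["spend"])
```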

4. Team Budgets

Use team budgets for organizational cost allocation:
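# Engineering team
curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 5000.0,
    "budget_duration": "30d"
  }'

# Marketing team
curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "marketing",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'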

5. Metadata for Cost Tracking

Use metadata to track costs by project:
{
  "metadata": {
    "project": "chatbot-v2",
    "cost_center": "engineering",
    "environment": "production"
  }
}
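
Once keys carry metadata like this, spend can be rolled up by any tag. The sketch below assumes a list of records that each have a `spend` value and an optional `metadata` object. That record shape and the `spend_by` helper are hypothetical and shown only to illustrate the grouping.

```python
from collections import defaultdict


def spend_by(records: list, field: str) -> dict:
    """Sum spend per value of a metadata field; untagged records are grouped together."""
    totals = defaultdict(float)
    for record in records:
        metadata = record.get("metadata") or {}  # tolerate missing or null metadata
        totals[metadata.get(field, "untagged")] += record.get("spend", 0.0)
    return dict(totals)


records = [
    {"spend": 1.5, "metadata": {"project": "chatbot-v2", "cost_center": "engineering"}},
    {"spend": 2.5, "metadata": {"project": "chatbot-v2", "cost_center": "engineering"}},
    {"spend": 1.0, "metadata": None},
]
print(spend_by(records, "project"))
```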

Troubleshooting

Alerts Not Firing

Check configuration:
# Verify alerting is enabled
curl -X GET 'http://localhost:4000/config' \
  -H 'Authorization: Bearer sk-1234' \
  | jq '.litellm_settings.alerting'

# Test Slack webhook
curl -X POST "$SLACK_WEBHOOK_URL" \
  -H 'Content-Type: application/json' \
  -d '{"text": "Test alert"}'

Incorrect Spend Tracking

Verify that a database is configured:
config.yaml
general_settings:
  database_url: os.environ/DATABASE_URL
  store_model_in_db: true

Budget Not Resetting

Check budget reset schedule:
curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef' \
  | jq '.budget_reset_at'

Next Steps

  • Virtual Keys: Learn about key management
  • Configuration: Advanced configuration options
  • Docker Deployment: Deploy in production
  • Quick Start: Get started guide
