Budget Alerts & Monitoring

Overview

LiteLLM Proxy provides comprehensive budget tracking and alerting capabilities:

Budget Enforcement: Hard limits to prevent overspending
Soft Budgets: Warnings before hitting limits
Webhook Alerts: Real-time notifications
Slack Integration: Team notifications
Spend Tracking: Monitor usage across keys, users, and teams

Budget Types

Global Budget

Set a budget for the entire proxy:

config.yaml

litellm_settings:
  max_budget: 1000.0       # Maximum spend in USD
  budget_duration: 30d     # Budget reset period

Once spend exceeds the global budget, the proxy rejects every request with an error like:

{
  "error": {
    "message": "Proxy budget exceeded. Max budget: $1000.00",
    "type": "budget_exceeded"
  }
}
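
A client can tell a budget rejection apart from other failures by checking the error type. The sketch below assumes the error body has exactly the shape shown above; the helper name `is_budget_exceeded` is hypothetical and not part of LiteLLM.

```python
import json


def is_budget_exceeded(body: str) -> bool:
    """Return True if a response body matches the budget error shape shown above."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # Not JSON at all, so not a budget error
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    # Assumption: the "type" field is the stable discriminator, not the message text
    return isinstance(error, dict) and error.get("type") == "budget_exceeded"
```

Matching on `type` rather than on the message avoids breaking when the dollar amount in the message changes.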

Key Budget

Set budgets per virtual key:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0,
    "budget_duration": "30d"
  }'

Team Budget

Set budgets for teams:

curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'

User Budget

Set budgets per user:

curl -X POST 'http://localhost:4000/user/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "[email protected]",
    "max_budget": 50.0,
    "budget_duration": "30d"
  }'

Soft Budgets (Alerts)

Soft budgets trigger alerts without blocking requests. The example below alerts at $80 against a $100 hard limit:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0,
    "soft_budget": 80.0,
    "budget_duration": "30d"
  }'

Set Soft Budget

Set a soft budget threshold below the hard limit (for example, 80% of max budget).

Monitor Usage

The proxy tracks spending in real time.

Receive Alert

A webhook or Slack notification is sent when spend exceeds the soft budget.

Take Action

Increase the budget or optimize usage before hitting the hard limit.
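
The relationship between the soft and hard limits can be sketched as a three-state check. This is an illustration only: the function name `budget_status` and the state labels are hypothetical, and whether the proxy compares with `>=` or `>` at the boundary is an assumption.

```python
def budget_status(spend: float, soft_budget: float, max_budget: float) -> str:
    """Classify spend against a soft (alert-only) and hard (blocking) budget."""
    if soft_budget > max_budget:
        raise ValueError("soft_budget should not exceed max_budget")
    if spend >= max_budget:
        return "blocked"  # Hard limit reached: requests are rejected
    if spend >= soft_budget:
        return "alert"    # Soft limit reached: requests continue, alert is sent
    return "ok"
```

With the values from the example above (`soft_budget` 80, `max_budget` 100), a spend of 82.45 falls in the alert state while requests keep flowing.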

Slack Alerts

Configure Slack notifications for budget alerts:

Setup

Create Slack Webhook

Go to https://api.slack.com/apps
Create a new app
Enable Incoming Webhooks
Create a webhook URL

Configure Proxy

Add Slack settings to your config:

config.yaml

litellm_settings:
  alerting: ["slack"]
  alerting_threshold: 0.8  # Alert at 80% budget
  
  # Slack webhook URL
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
  
  # Optional: route alerts to a specific webhook URL
  slack_alert_to_webhook_url: os.environ/SLACK_WEBHOOK_URL

Set Environment Variable

export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."

Restart Proxy

litellm --config config.yaml

Alert Example

When a budget threshold is reached, Slack receives:

🚨 Budget Alert

Key: sk-1234...abcd
Spend: $82.45 / $100.00 (82%)
Budget Reset: 2024-04-15 10:30:00 UTC

Consider increasing budget or optimizing usage.
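
For teams that post their own alerts, a message in the same layout can be assembled as follows. This is a sketch, not the proxy's formatter: `format_budget_alert` is a hypothetical helper, and the key-masking rule (first seven and last four characters) is inferred from the example above.

```python
def format_budget_alert(key: str, spend: float, max_budget: float, reset_at: str) -> str:
    """Build an alert message in the layout of the Slack example above."""
    if max_budget <= 0:
        raise ValueError("max_budget must be positive")
    # Assumption: keys are masked as first 7 + last 4 characters
    masked = f"{key[:7]}...{key[-4:]}"
    percent = f"{spend / max_budget:.0%}"
    return "\n".join(
        [
            "🚨 Budget Alert",
            "",
            f"Key: {masked}",
            f"Spend: ${spend:.2f} / ${max_budget:.2f} ({percent})",
            f"Budget Reset: {reset_at}",
            "",
            "Consider increasing budget or optimizing usage.",
        ]
    )
```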

Webhook Alerts

Send budget alerts to custom webhooks:

config.yaml

litellm_settings:
  alerting: ["webhook"]
  alerting_threshold: 0.8
  
  # Custom webhook URL
  webhook_url: https://example.com/alerts

Webhook payload:

{
  "event": "budget_alert",
  "timestamp": "2024-04-15T10:30:00Z",
  "key": "sk-1234...abcd",
  "key_alias": "production-key",
  "spend": 82.45,
  "max_budget": 100.0,
  "budget_threshold": 80.0,
  "percentage": 0.8245,
  "budget_reset_at": "2024-05-15T10:30:00Z",
  "metadata": {
    "environment": "production",
    "team": "backend"
  }
}
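
On the receiving side, the payload can be parsed into a small typed object before acting on it. The sketch below assumes the field names shown in the payload above; the `BudgetAlert` class and `parse_budget_alert` function are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class BudgetAlert:
    key_alias: str
    spend: float
    max_budget: float
    percentage: float

    @property
    def remaining(self) -> float:
        """Budget left before the hard limit, rounded to cents."""
        return round(self.max_budget - self.spend, 2)


def parse_budget_alert(body: str) -> BudgetAlert:
    """Parse a webhook body, rejecting events that are not budget alerts."""
    data = json.loads(body)
    if data.get("event") != "budget_alert":
        raise ValueError(f"unexpected event: {data.get('event')!r}")
    return BudgetAlert(
        key_alias=data["key_alias"],
        spend=float(data["spend"]),
        max_budget=float(data["max_budget"]),
        percentage=float(data["percentage"]),
    )
```

Rejecting unknown event types up front keeps the handler from misreading payloads of a different shape.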

Email Alerts

Configure email notifications:

config.yaml

litellm_settings:
  alerting: ["email"]
  alerting_threshold: 0.8
  
  # Email settings
  email_alerts_to: ["[email protected]", "[email protected]"]
  
  # SMTP configuration
  smtp_host: os.environ/SMTP_HOST
  smtp_port: 587
  smtp_user: os.environ/SMTP_USER
  smtp_password: os.environ/SMTP_PASSWORD
  smtp_sender: "[email protected]"

Required environment variables:

export SMTP_HOST="smtp.gmail.com"
export SMTP_USER="[email protected]"
export SMTP_PASSWORD="your-app-password"

Budget Reset Schedule

Budgets automatically reset based on the configured duration:

max_budget: 100.0
budget_duration: 1d
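
The duration strings used throughout this page (`1d`, `7d`, `30d`) can be turned into a next-reset timestamp as sketched below. Only the `d` suffix appears on this page; the `s`, `m`, and `h` suffixes are an assumption, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: suffixes other than "d" follow the same <number><unit> pattern
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: str) -> timedelta:
    """Convert a string such as '30d' into a timedelta."""
    amount, unit = value[:-1], value[-1:]
    if unit not in _UNITS or not amount.isdigit():
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(**{_UNITS[unit]: int(amount)})


def next_reset(period_start: datetime, duration: str) -> datetime:
    """Return the moment the budget period that began at period_start ends."""
    return period_start + parse_duration(duration)


start = datetime(2024, 4, 15, 10, 30, tzinfo=timezone.utc)
print(next_reset(start, "30d").isoformat())
```

Note that `30d` means thirty days, not one calendar month, so reset dates drift relative to month boundaries.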

Manual Budget Reset

Reset a key's spend to zero manually:

curl -X POST 'http://localhost:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-1234567890abcdef",
    "spend": 0.0
  }'

Spend Tracking

View Key Spend

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'

Response:

{
  "key": "sk-1234...def",
  "spend": 45.23,
  "max_budget": 100.0,
  "budget_reset_at": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo"]
}
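
Remaining budget and the fraction used follow directly from this response. The sketch assumes the response fields shown above and treats a missing or non-positive `max_budget` as "no limit", which is an assumption; `budget_usage` is a hypothetical helper.

```python
from typing import Optional


def budget_usage(info: dict) -> dict:
    """Derive remaining budget and fraction used from a key-info response."""
    spend = float(info["spend"])
    max_budget: Optional[float] = info.get("max_budget")
    if max_budget is None or max_budget <= 0:
        # Assumption: no positive max_budget means the key is unlimited
        return {"remaining": None, "fraction_used": None}
    return {
        "remaining": round(max_budget - spend, 2),
        "fraction_used": round(spend / max_budget, 4),
    }
```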

View All Keys Spend

curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234'

View Team Spend

curl -X GET 'http://localhost:4000/spend/teams' \
  -H 'Authorization: Bearer sk-1234'

View User Spend

curl -X GET 'http://localhost:4000/spend/users' \
  -H 'Authorization: Bearer sk-1234'

Projected Spend Alerts

Alert based on projected monthly spend:

config.yaml

general_settings:
  projected_spend_alerts: true
  projected_spend_threshold: 0.9  # Alert at 90% of projected

litellm_settings:
  alerting: ["slack"]
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL

If the current spend rate would exceed the budget before the period ends, the proxy sends an alert ahead of time.
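
One simple way to reason about projected spend is linear extrapolation of the spend so far. This page does not say how the proxy computes its projection, so the formula below is an assumption for illustration, and both function names are hypothetical.

```python
def projected_spend(spend: float, elapsed_days: float, period_days: float) -> float:
    """Extrapolate spend to the end of the budget period at the current daily rate."""
    if elapsed_days <= 0 or period_days <= 0:
        raise ValueError("elapsed_days and period_days must be positive")
    return spend / elapsed_days * period_days


def should_alert(
    spend: float,
    elapsed_days: float,
    period_days: float,
    max_budget: float,
    threshold: float = 0.9,
) -> bool:
    """Alert when the projection reaches threshold * max_budget."""
    return projected_spend(spend, elapsed_days, period_days) >= threshold * max_budget
```

For example, $40 spent in the first 10 days of a 30-day period projects to $120, which would trigger an alert against a $100 budget.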

Advanced Alerting

Multi-Channel Alerts

Send to multiple channels:

config.yaml

litellm_settings:
  alerting: ["slack", "webhook", "email"]
  alerting_threshold: 0.8
  
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL
  webhook_url: https://example.com/alerts
  email_alerts_to: ["[email protected]"]

Team-Specific Alerts

Configure different alert settings per team:

config.yaml

litellm_settings:
  default_team_settings:
    - team_id: team-engineering
      alerting: ["slack"]
      alerting_threshold: 0.8
      slack_webhook_url: os.environ/ENGINEERING_SLACK_WEBHOOK
    
    - team_id: team-marketing
      alerting: ["email"]
      alerting_threshold: 0.9
      email_alerts_to: ["[email protected]"]

Custom Alert Thresholds

config.yaml

litellm_settings:
  # Alert at multiple thresholds
  alerting_thresholds: [0.5, 0.75, 0.9, 1.0]
  
  alerting: ["slack"]
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL

Alerts are sent at 50%, 75%, 90%, and 100% of budget.
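
To avoid repeating an alert on every request, a sender needs to know which thresholds were newly crossed since the last check. The sketch below shows one way to compute that; it is an illustration under that assumption, not the proxy's implementation, and `crossed_thresholds` is a hypothetical name.

```python
from typing import Iterable, List


def crossed_thresholds(
    previous_spend: float,
    current_spend: float,
    max_budget: float,
    thresholds: Iterable[float] = (0.5, 0.75, 0.9, 1.0),
) -> List[float]:
    """Return thresholds passed between two spend readings, lowest first."""
    if max_budget <= 0:
        raise ValueError("max_budget must be positive")
    before = previous_spend / max_budget
    after = current_spend / max_budget
    # A threshold fires once: when spend moves from below it to at or above it
    return [t for t in sorted(thresholds) if before < t <= after]
```

A jump from $40 to $80 on a $100 budget crosses both the 50% and 75% thresholds in one step, so both are returned.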

Monitoring Dashboard

Admin UI

Access the admin dashboard at http://localhost:4000/ui:

Overview: Total spend across all keys
Keys: Individual key spend and budgets
Teams: Team-level spend tracking
Users: Per-user spend analysis
Charts: Spend trends over time

Prometheus Metrics

Export budget metrics to Prometheus:

config.yaml

litellm_settings:
  success_callback: ["prometheus"]

Metrics available:

litellm_spend_total - Total spend
litellm_key_spend - Per-key spend
litellm_team_spend - Per-team spend
litellm_budget_remaining - Remaining budget

Query in Prometheus:

# Keys approaching budget limit
litellm_key_spend / litellm_key_budget > 0.8

# Total proxy spend
sum(litellm_spend_total)

# Spend by model
sum by (model) (litellm_spend_total)

Best Practices

1. Set Conservative Budgets

Start with a small budget, an early soft-budget warning, and a short reset period, then increase as needed:

{
  "max_budget": 10.0,
  "soft_budget": 8.0,
  "budget_duration": "7d"
}

2. Use Soft Budgets

Always set a soft budget for early warning, for example at 80% of the hard limit:

{
  "max_budget": 100.0,
  "soft_budget": 80.0
}

3. Monitor Regularly

Check spend daily or weekly:

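# List keys that have spent more than 80% of their budget
# (keys without a max_budget are skipped)
curl -s -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234' \
  | jq '.[] | select(.max_budget != null and .spend > .max_budget * 0.8)'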

4. Team Budgets

Use team budgets for organizational cost allocation:

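# Engineering team
curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 5000.0,
    "budget_duration": "30d"
  }'

# Marketing team
curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "marketing",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'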

5. Metadata for Cost Tracking

Use metadata to track costs by project:

{
  "metadata": {
    "project": "chatbot-v2",
    "cost_center": "engineering",
    "environment": "production"
  }
}
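
Once spend records carry metadata like this, costs can be rolled up by any tag. The record shape below (a `spend` number plus an optional `metadata` object) is an assumption for illustration and is not a documented response format; `spend_by_tag` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable


def spend_by_tag(records: Iterable[dict], tag: str) -> Dict[str, float]:
    """Sum spend per value of one metadata tag; untagged records are grouped together."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        metadata = record.get("metadata") or {}
        label = metadata.get(tag, "untagged")
        totals[label] += float(record["spend"])
    return dict(totals)


# Hypothetical records, shaped for this example only
records = [
    {"spend": 10.0, "metadata": {"project": "chatbot-v2"}},
    {"spend": 5.0, "metadata": {"project": "chatbot-v2"}},
    {"spend": 2.5},
]
print(spend_by_tag(records, "project"))
```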

Troubleshooting

Alerts Not Firing

Check configuration:

# Verify alerting is enabled
curl -X GET 'http://localhost:4000/config' \
  -H 'Authorization: Bearer sk-1234' \
  | jq '.litellm_settings.alerting'

# Test Slack webhook
curl -X POST "$SLACK_WEBHOOK_URL" \
  -H 'Content-Type: application/json' \
  -d '{"text": "Test alert"}'

Incorrect Spend Tracking

Verify that a database is configured:

config.yaml

general_settings:
  database_url: os.environ/DATABASE_URL
  store_model_in_db: true

Budget Not Resetting

Check budget reset schedule:

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef' \
  | jq '.budget_reset_at'

Next Steps

Virtual Keys

Learn about key management

Configuration

Advanced configuration options

Docker Deployment

Deploy in production

Quick Start

Get started guide
