
Overview

Virtual keys allow you to create API keys with:
  • Custom budgets and rate limits
  • Model access restrictions
  • Expiration dates
  • Team associations
  • Metadata and tags

Generate a Key

Create a virtual key using the master key:
curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 10.0,
    "duration": "30d"
  }'
Response:
{
  "key": "sk-1234567890abcdef",
  "key_name": null,
  "expires": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "max_budget": 10.0,
  "budget_duration": "30d",
  "budget_reset_at": "2024-04-15T10:30:00Z"
}

Key Generation Parameters

Basic Parameters

{
  "key_name": "production-api-key",      // Optional friendly name
  "duration": "30d",                     // Key expiration (e.g., 30d, 24h, null for no expiry)
  "models": ["gpt-3.5-turbo", "gpt-4"], // Allowed models
  "metadata": {                          // Custom metadata
    "environment": "production",
    "team": "backend"
  }
}
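The duration strings above can be turned into a concrete expiry time on the client side. This is a minimal sketch, not the proxy's own parser: the helper names `parse_duration` and `expiry_from` are hypothetical, and the `s` and `m` suffixes are an assumption (only `d` and `h` appear in the example above).

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Seconds per unit. "d" and "h" come from the examples; "s" and "m" are assumed.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(duration: Optional[str]) -> Optional[timedelta]:
    """Convert a string such as "30d" or "24h" to a timedelta; None means no expiry."""
    if duration is None:
        return None
    match = re.fullmatch(r"(\d+)([smhd])", duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    amount, unit = match.groups()
    return timedelta(seconds=int(amount) * _UNIT_SECONDS[unit])


def expiry_from(created_at: datetime, duration: Optional[str]) -> Optional[datetime]:
    """Return the expiry timestamp for a key created at `created_at`."""
    delta = parse_duration(duration)
    return None if delta is None else created_at + delta


created = datetime(2024, 3, 16, 10, 30, tzinfo=timezone.utc)
print(expiry_from(created, "30d"))
```

Returning `None` for a `null` duration keeps "no expiry" distinct from a zero-length duration.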

Budget Parameters

{
  "max_budget": 100.0,           // Maximum spend in USD
  "budget_duration": "30d",      // Budget reset period
  "soft_budget": 80.0            // Alert threshold (80% of max_budget)
}
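Since the soft budget in this example is a fixed share of the maximum, it can be derived rather than typed by hand. The sketch below assumes a hypothetical helper `budget_params` and an `alert_fraction` argument of our own; only the three returned field names come from the parameters above.

```python
def budget_params(max_budget: float, budget_duration: str,
                  alert_fraction: float = 0.8) -> dict:
    """Build the budget fields for a key request.

    soft_budget is derived as alert_fraction * max_budget, rounded to cents.
    """
    if not 0 < alert_fraction <= 1:
        raise ValueError("alert_fraction must be in (0, 1]")
    return {
        "max_budget": max_budget,
        "budget_duration": budget_duration,
        "soft_budget": round(max_budget * alert_fraction, 2),
    }


print(budget_params(100.0, "30d"))
```

The resulting dictionary can be merged into the JSON body of a `/key/generate` request.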

Rate Limiting

{
  "rpm": 100,        // Requests per minute
  "tpm": 100000,     // Tokens per minute
  "max_parallel_requests": 10
}

Team Association

{
  "team_id": "team-abc-123",
  "user_id": "user-xyz-456"
}

Complete Example

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_name": "production-backend",
    "duration": "90d",
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 100.0,
    "budget_duration": "30d",
    "soft_budget": 80.0,
    "rpm": 100,
    "tpm": 100000,
    "metadata": {
      "environment": "production",
      "team": "backend"
    }
  }'

Get Key Information

Retrieve information about a key:
curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'
Response:
{
  "key": "sk-1234...def",
  "key_name": "production-backend",
  "team_id": null,
  "max_budget": 100.0,
  "spend": 45.23,
  "budget_reset_at": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "rpm": 100,
  "tpm": 100000,
  "expires": "2024-07-15T10:30:00Z",
  "metadata": {
    "environment": "production",
    "team": "backend"
  }
}

Update a Key

Modify key properties:
curl -X POST 'http://localhost:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-1234567890abcdef",
    "max_budget": 200.0,
    "models": ["gpt-3.5-turbo", "gpt-4", "claude-3-opus"],
    "rpm": 200
  }'
You can only update a key using the master key, not the key itself.

Delete a Key

Revoke a virtual key:
curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "keys": ["sk-1234567890abcdef"]
  }'

List All Keys

Get all virtual keys:
curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

Key Auto-Rotation

Configure automatic key rotation:
curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_alias": "production-key",
    "auto_rotate": true,
    "rotation_interval": "90d",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'
The key will automatically rotate every 90 days. The key_alias remains constant while the underlying key changes.

Budget Tracking

Check Spend

Monitor key spending:
curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'
The response includes:
  • spend: Current spend
  • max_budget: Budget limit
  • budget_reset_at: When budget resets
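The fields listed above are enough to compute how much budget is left. The following is a rough sketch that works on an already-parsed `/key/info` response; the function name `budget_status` is hypothetical, and treating a missing or non-positive `max_budget` as "no limit" is an assumption, not documented behavior.

```python
from typing import Optional


def budget_status(info: dict, soft_budget: Optional[float] = None) -> dict:
    """Summarize spend against the limits reported for a key."""
    spend = float(info.get("spend") or 0.0)
    max_budget = info.get("max_budget")
    status = {
        "spend": spend,
        "remaining": None,
        "fraction_used": None,
        "over_soft_budget": soft_budget is not None and spend >= soft_budget,
        "exhausted": False,
    }
    # Assumption: a missing or non-positive max_budget means no limit applies.
    if max_budget is not None and max_budget > 0:
        status["remaining"] = round(max_budget - spend, 2)
        status["fraction_used"] = spend / max_budget
        status["exhausted"] = spend >= max_budget
    return status


print(budget_status({"spend": 45.23, "max_budget": 100.0}, soft_budget=80.0))
```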

Budget Alerts

Set a soft budget to trigger alerts before the hard limit is reached:
{
  "max_budget": 100.0,
  "soft_budget": 80.0  // Alert at 80% usage
}
Configure a webhook for alerts in your config:
config.yaml
litellm_settings:
  alerting:
    - slack
  alerting_threshold: 0.8  # Alert at 80% budget
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL

Model Access Control

Restrict to Specific Models

{
  "models": ["gpt-3.5-turbo", "gpt-4"]
}
Requests to other models will be rejected:
{
  "error": {
    "message": "API key does not have access to model: claude-3-opus",
    "type": "invalid_request_error"
  }
}

Allow All Models

Omit the models parameter or use null:
{
  "models": null  // Access to all configured models
}
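A client can mirror this rule locally to avoid sending requests that would be rejected. This is only an illustration of the rule as described here, with a hypothetical function name; the proxy remains the authority, and how it treats an empty list is not covered above, so the sketch takes an empty list literally (no models allowed).

```python
from typing import Optional, Sequence


def key_allows_model(allowed_models: Optional[Sequence[str]], model: str) -> bool:
    """Return True if a key with this `models` value may call `model`."""
    # None (or an omitted parameter) means access to all configured models.
    if allowed_models is None:
        return True
    return model in allowed_models


print(key_allows_model(["gpt-3.5-turbo", "gpt-4"], "claude-3-opus"))
print(key_allows_model(None, "claude-3-opus"))
```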

Rate Limiting

Per-Key Rate Limits

{
  "rpm": 100,        // 100 requests per minute
  "tpm": 100000,     // 100k tokens per minute
  "max_parallel_requests": 10  // Max concurrent requests
}
When a rate limit is exceeded, the proxy returns:
{
  "error": {
    "message": "Rate limit exceeded. Retry after 60 seconds.",
    "type": "rate_limit_error"
  }
}
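Clients usually handle this error by waiting and retrying. Below is a minimal sketch of exponential backoff keyed on the `rate_limit_error` type shown above. The `send` callable, the `call_with_retry` name, and the delay values are all assumptions for illustration; `send` stands in for whatever function performs the request and returns the parsed JSON body.

```python
import time
from typing import Callable


def call_with_retry(send: Callable[[], dict],
                    max_attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `send` until it returns something other than a rate limit error.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts and
    returns the last response if every attempt is rate limited.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    body: dict = {}
    for attempt in range(max_attempts):
        body = send()
        error = body.get("error")
        if not error or error.get("type") != "rate_limit_error":
            return body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return body
```

Passing `sleep` as a parameter makes the helper easy to test without real delays. If the server states a retry interval, as in the message above, honoring that value is generally preferable to a fixed schedule.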

Team Keys

Generate keys associated with teams:
1. Create a Team

curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'
2. Generate Team Key

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_id": "team-abc-123",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'
Team keys inherit the team's budget and settings. Any max_budget set on the key itself is tracked separately from the team budget.
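The two steps can be chained in code. The sketch below assumes a hypothetical `post(path, body)` callable that sends an authenticated request and returns the parsed JSON, and it assumes the `/team/new` response contains a `team_id` field; neither is confirmed by the examples above, so check the actual response before relying on it.

```python
from typing import Callable, List

# Hypothetical transport: takes an endpoint path and a JSON body, returns parsed JSON.
Post = Callable[[str, dict], dict]


def create_team_with_key(post: Post, team_alias: str, team_budget: float,
                         key_budget: float, models: List[str]) -> dict:
    """Create a team, then generate a key attached to it."""
    team = post("/team/new", {
        "team_alias": team_alias,
        "max_budget": team_budget,
        "budget_duration": "30d",
    })
    team_id = team["team_id"]  # assumed response field
    key = post("/key/generate", {
        "team_id": team_id,
        "models": models,
        "max_budget": key_budget,  # key-level budget, separate from the team's
    })
    return {"team_id": team_id, "key": key["key"]}
```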

Key Metadata

Attach custom metadata to keys:
{
  "metadata": {
    "environment": "production",
    "service": "backend-api",
    "owner": "[email protected]",
    "cost_center": "engineering"
  }
}
Use metadata for:
  • Cost allocation
  • Usage tracking
  • Access auditing
  • Organizational reporting
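As one example of cost allocation, spend can be grouped by a metadata field. This sketch assumes you already have a list of key records shaped like the `/key/info` response (with `spend` and `metadata`); the function name and the `"unassigned"` label are our own choices.

```python
from collections import defaultdict
from typing import Dict, List


def spend_by_metadata(keys: List[dict], field: str) -> Dict[str, float]:
    """Sum `spend` across key records, grouped by metadata[field]."""
    totals: Dict[str, float] = defaultdict(float)
    for record in keys:
        metadata = record.get("metadata") or {}
        label = metadata.get(field, "unassigned")  # keys without the field
        totals[label] += record.get("spend") or 0.0
    return dict(totals)


records = [
    {"spend": 40.0, "metadata": {"cost_center": "engineering"}},
    {"spend": 5.5, "metadata": {"cost_center": "engineering"}},
    {"spend": 12.0, "metadata": {"cost_center": "research"}},
    {"spend": 3.0, "metadata": None},
]
print(spend_by_metadata(records, "cost_center"))
```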

Security Best Practices

1. Master Key Protection

Never expose the master key in client applications. Use virtual keys instead.
# Store master key securely
export LITELLM_MASTER_KEY=$(cat /secure/path/master_key.txt)
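On the application side, the same variable can be read at startup instead of hard-coding the key. A small sketch, assuming the `LITELLM_MASTER_KEY` variable exported above; the function name is hypothetical.

```python
import os


def load_master_key(env_var: str = "LITELLM_MASTER_KEY") -> str:
    """Read the master key from the environment and fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```

Failing at startup is deliberate: a missing key surfaces immediately rather than as an authentication error on the first request.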

2. Key Rotation

Rotate keys regularly:
# Generate new key
curl -X POST 'http://localhost:4000/key/generate' ...

# Update applications
# Delete old key
curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"keys": ["old-key"]}'
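The order of operations matters: the old key should be revoked only after applications have switched to the new one. The sketch below captures that ordering with three injected callables. `create`, `deploy`, and `delete` are placeholders of our own; `deploy` in particular stands for whatever mechanism updates your applications, which this page does not specify.

```python
from typing import Any, Callable


def rotate_key(create: Callable[..., dict],
               deploy: Callable[[str], Any],
               delete: Callable[[str], Any],
               old_key: str,
               **params: Any) -> str:
    """Replace old_key with a freshly generated key and return the new key."""
    new_key = create(**params)["key"]  # 1. generate the replacement
    deploy(new_key)                    # 2. roll it out; an exception here keeps old_key valid
    delete(old_key)                    # 3. revoke the old key last
    return new_key
```

With the `LiteLLMKeyManager` class from the Programmatic Key Management section, `create` and `delete` could be `manager.create_key` and `manager.delete_key`.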

3. Principle of Least Privilege

Grant minimum required access:
{
  "models": ["gpt-3.5-turbo"],  // Only specific model
  "max_budget": 10.0,            // Low budget
  "duration": "7d",              // Short expiration
  "rpm": 10                      // Low rate limit
}

4. Monitor Usage

Regularly audit key usage:
# List all keys
curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

# Check spend
curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234'

Programmatic Key Management

import requests

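class LiteLLMKeyManager:
    """Thin wrapper around the proxy's key management endpoints."""

    def __init__(self, base_url, master_key, timeout=10):
        self.base_url = base_url.rstrip('/')
        self.timeout = timeout  # seconds; avoids hanging on an unreachable proxy
        self.headers = {
            'Authorization': f'Bearer {master_key}',
            'Content-Type': 'application/json'
        }

    def create_key(self, **kwargs):
        # kwargs become the JSON body (models, max_budget, duration, ...)
        response = requests.post(
            f'{self.base_url}/key/generate',
            headers=self.headers,
            json=kwargs,
            timeout=self.timeout
        )
        response.raise_for_status()  # raise on 4xx/5xx instead of returning an error body
        return response.json()

    def delete_key(self, key):
        # The endpoint takes a list of keys, so wrap the single key
        response = requests.post(
            f'{self.base_url}/key/delete',
            headers=self.headers,
            json={'keys': [key]},
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()

    def get_key_info(self, key):
        # Authenticates with the virtual key itself, not the master key
        response = requests.get(
            f'{self.base_url}/key/info',
            headers={'Authorization': f'Bearer {key}'},
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()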

# Usage
manager = LiteLLMKeyManager(
    base_url='http://localhost:4000',
    master_key='sk-1234'
)

# Create key
key = manager.create_key(
    models=['gpt-3.5-turbo'],
    max_budget=10.0,
    duration='30d'
)
print(f"Created: {key['key']}")

# Get info
info = manager.get_key_info(key['key'])
print(f"Spend: ${info['spend']}")

# Delete key
manager.delete_key(key['key'])

Next Steps

Budget Alerts

Set up spending alerts and notifications

Configuration

Advanced proxy configuration

Quick Start

Get started with the proxy

Docker Deployment

Deploy in production
