Virtual Keys (API Key Management)

Overview

Virtual keys allow you to create API keys with:

Custom budgets and rate limits
Model access restrictions
Expiration dates
Team associations
Metadata and tags

Generate a Key

Create a virtual key using the master key:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 10.0,
    "duration": "30d",
    "budget_duration": "30d"
  }'

Response:

{
  "key": "sk-1234567890abcdef",
  "key_name": null,
  "expires": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "max_budget": 10.0,
  "budget_duration": "30d",
  "budget_reset_at": "2024-04-15T10:30:00Z"
}
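As a small illustration of consuming this response, the sketch below reads the expires timestamp and reports how many days remain. The function name time_until_expiry is hypothetical, and the sketch assumes expires is either null or an ISO 8601 UTC timestamp like the one shown above.

```python
from datetime import datetime, timezone
from typing import Optional


def time_until_expiry(response: dict, now: datetime) -> Optional[float]:
    """Return the number of days until the key expires, or None if it never expires."""
    expires = response.get("expires")
    if expires is None:
        return None
    # Older Python versions reject a trailing "Z", so normalise it to an offset.
    expires_at = datetime.fromisoformat(expires.replace("Z", "+00:00"))
    return (expires_at - now).total_seconds() / 86400


response = {"key": "sk-1234567890abcdef", "expires": "2024-04-15T10:30:00Z"}
now = datetime(2024, 3, 16, 10, 30, tzinfo=timezone.utc)
days_left = time_until_expiry(response, now)
```

Passing now explicitly keeps the helper deterministic; in an application you would pass datetime.now(timezone.utc).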

Key Generation Parameters

Basic Parameters

{
  "key_name": "production-api-key",      // Optional friendly name
  "duration": "30d",                     // Key expiration (e.g., 30d, 24h, null for no expiry)
  "models": ["gpt-3.5-turbo", "gpt-4"], // Allowed models
  "metadata": {                          // Custom metadata
    "environment": "production",
    "team": "backend"
  }
}
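The duration strings above can be validated client-side before a request is sent. The following is a minimal sketch, not the proxy's own parser: parse_duration is a hypothetical helper, and the s and m suffixes are an assumption, since only d and h appear in the examples here.

```python
import re
from datetime import timedelta
from typing import Optional

# Assumed unit suffixes; only "d" and "h" are shown in the examples above.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: Optional[str]) -> Optional[timedelta]:
    """Convert a string such as "30d" or "24h" to a timedelta; None means no expiry."""
    if value is None:
        return None
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```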

Budget Parameters

{
  "max_budget": 100.0,           // Maximum spend in USD
  "budget_duration": "30d",      // Budget reset period
  "soft_budget": 80.0            // Alert threshold in USD (here, 80% of max_budget)
}
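To make the relationship between the two thresholds concrete, here is a hedged sketch that classifies a spend figure against them. The function budget_status and its return labels are hypothetical, and whether the proxy compares with "greater than" or "greater than or equal" is an assumption.

```python
from typing import Optional


def budget_status(spend: float,
                  max_budget: Optional[float],
                  soft_budget: Optional[float] = None) -> str:
    """Classify spend as "ok", "alert" (soft budget reached) or "exceeded" (hard limit reached)."""
    if max_budget is not None and spend >= max_budget:
        return "exceeded"
    if soft_budget is not None and spend >= soft_budget:
        return "alert"
    return "ok"
```

For the values above, a spend of 45.23 is "ok", 80.0 triggers "alert", and 100.0 is "exceeded".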

Rate Limiting

{
  "rpm": 100,        // Requests per minute
  "tpm": 100000,     // Tokens per minute
  "max_parallel_requests": 10
}

Team Association

{
  "team_id": "team-abc-123",
  "user_id": "user-xyz-456"
}

Complete Example

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_name": "production-backend",
    "duration": "90d",
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 100.0,
    "budget_duration": "30d",
    "soft_budget": 80.0,
    "rpm": 100,
    "tpm": 100000,
    "metadata": {
      "environment": "production",
      "team": "backend"
    }
  }'
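When requests like this are built in application code, a typed container can catch mistakes before the call is made. The sketch below is one possible shape: the KeyRequest class is hypothetical, covers only the parameters listed in this section, and assumes that omitted fields fall back to the proxy's defaults.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class KeyRequest:
    key_name: Optional[str] = None
    duration: Optional[str] = None
    models: Optional[List[str]] = None
    max_budget: Optional[float] = None
    budget_duration: Optional[str] = None
    soft_budget: Optional[float] = None
    rpm: Optional[int] = None
    tpm: Optional[int] = None
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A soft budget above the hard limit could never fire an alert first.
        if (self.soft_budget is not None and self.max_budget is not None
                and self.soft_budget > self.max_budget):
            raise ValueError("soft_budget must not exceed max_budget")

    def to_payload(self) -> dict:
        """Return the JSON body, leaving out fields that were not set."""
        return {k: v for k, v in asdict(self).items()
                if v is not None and v != {}}
```

The result of to_payload() can be passed as the JSON body of the /key/generate request.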

Get Key Information

Retrieve information about a key:

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'

Response:

{
  "key": "sk-1234...def",
  "key_name": "production-backend",
  "team_id": null,
  "max_budget": 100.0,
  "spend": 45.23,
  "budget_reset_at": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "rpm": 100,
  "tpm": 100000,
  "expires": "2024-07-15T10:30:00Z",
  "metadata": {
    "environment": "production",
    "team": "backend"
  }
}
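The key field in this response is shown in masked form. If you log keys yourself, a similar mask avoids leaking the full secret. The helper below is a sketch that reproduces the shape of the sample value; the proxy's actual masking rule is not stated here, so the prefix and suffix lengths are assumptions.

```python
def mask_key(key: str, prefix: int = 7, suffix: int = 3) -> str:
    """Keep the first and last few characters of a key and hide the rest."""
    if len(key) <= prefix + suffix:
        # Too short to mask safely: hide everything.
        return "*" * len(key)
    return f"{key[:prefix]}...{key[-suffix:]}"
```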

Update a Key

Modify key properties:

curl -X POST 'http://localhost:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-1234567890abcdef",
    "max_budget": 200.0,
    "models": ["gpt-3.5-turbo", "gpt-4", "claude-3-opus"],
    "rpm": 200
  }'

Updates must be authorized with the master key; a virtual key cannot be used to update itself.
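The update request above sends only the fields being changed. A sketch of building such a body from the current and desired settings follows; build_update_payload is a hypothetical helper, and it assumes that fields left out of the request keep their existing values.

```python
def build_update_payload(key: str, current: dict, desired: dict) -> dict:
    """Return a /key/update body containing only the settings that differ."""
    changes = {name: value for name, value in desired.items()
               if current.get(name) != value}
    return {"key": key, **changes}


current = {"max_budget": 100.0, "rpm": 100, "models": ["gpt-3.5-turbo", "gpt-4"]}
desired = {"max_budget": 200.0, "rpm": 200, "models": ["gpt-3.5-turbo", "gpt-4"]}
payload = build_update_payload("sk-1234567890abcdef", current, desired)
```

Here models is unchanged, so it is left out of the payload.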

Delete a Key

Revoke a virtual key:

curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "keys": ["sk-1234567890abcdef"]
  }'

List All Keys

Get all virtual keys:

curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

Key Auto-Rotation

Configure automatic key rotation:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_alias": "production-key",
    "auto_rotate": true,
    "rotation_interval": "90d",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'

The key will automatically rotate every 90 days. The key_alias remains constant while the underlying key changes.

Budget Tracking

Check Spend

Monitor key spending:

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'

The response includes:

spend: Current spend
max_budget: Budget limit
budget_reset_at: When budget resets
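These three fields are enough to derive a short budget summary. The sketch below assumes max_budget is set and that budget_reset_at is an ISO 8601 UTC timestamp as in the sample response; summarize_budget is a hypothetical name.

```python
from datetime import datetime, timezone


def summarize_budget(info: dict, now: datetime) -> dict:
    """Derive remaining budget, percentage used and days until the budget resets."""
    remaining = info["max_budget"] - info["spend"]
    used_pct = 100 * info["spend"] / info["max_budget"]
    reset_at = datetime.fromisoformat(info["budget_reset_at"].replace("Z", "+00:00"))
    return {
        "remaining": round(remaining, 2),
        "used_pct": round(used_pct, 1),
        "days_to_reset": (reset_at - now).days,
    }


info = {"spend": 45.23, "max_budget": 100.0,
        "budget_reset_at": "2024-04-15T10:30:00Z"}
summary = summarize_budget(info, datetime(2024, 4, 1, 10, 30, tzinfo=timezone.utc))
```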

Budget Alerts

Set a soft budget to trigger an alert before the hard limit is reached:

{
  "max_budget": 100.0,
  "soft_budget": 80.0  // Alert at 80% usage
}

Configure a webhook for alerts in your config:

config.yaml

litellm_settings:
  alerting:
    - slack
  alerting_threshold: 0.8  # Alert at 80% budget
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL

Model Access Control

Restrict to Specific Models

{
  "models": ["gpt-3.5-turbo", "gpt-4"]
}

Requests to other models will be rejected:

{
  "error": {
    "message": "API key does not have access to model: claude-3-opus",
    "type": "invalid_request_error"
  }
}

Allow All Models

Omit the models parameter or use null:

{
  "models": null  // Access to all configured models
}
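The access rule described in this section can be mirrored client-side to fail fast before sending a request. This is a sketch only: is_model_allowed is a hypothetical helper, and because the handling of an empty list is not described here, the sketch treats it as granting no access.

```python
from typing import Optional, Sequence


def is_model_allowed(allowed: Optional[Sequence[str]], model: str) -> bool:
    """Return True if a key restricted to `allowed` may call `model`."""
    if allowed is None:
        # null (or an omitted models field) means every configured model.
        return True
    return model in allowed
```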

Rate Limiting

Per-Key Rate Limits

{
  "rpm": 100,        // 100 requests per minute
  "tpm": 100000,     // 100k tokens per minute
  "max_parallel_requests": 10  // Max concurrent requests
}

When a rate limit is exceeded, the request is rejected with:

{
  "error": {
    "message": "Rate limit exceeded. Retry after 60 seconds.",
    "type": "rate_limit_error"
  }
}
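Clients usually handle this error by waiting and retrying. The sketch below uses exponential backoff, which is a common client-side choice rather than something the proxy prescribes. call_with_backoff and its send callback are hypothetical, and the error is detected by the type field shown above.

```python
import time
from typing import Callable


def call_with_backoff(send: Callable[[], dict],
                      max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send() until it stops returning a rate_limit_error or attempts run out."""
    body: dict = {}
    for attempt in range(max_attempts):
        body = send()
        error = body.get("error")
        if not error or error.get("type") != "rate_limit_error":
            return body
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x ... the base delay between attempts.
            sleep(base_delay * 2 ** attempt)
    return body  # Still rate limited after the final attempt
```

The sleep parameter is injectable so the function can be exercised without real delays.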

Team Keys

Generate keys associated with teams:

Create a Team

curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'

Generate Team Key

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_id": "team-abc-123",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'

Team keys inherit the team's budget and settings. A key's own max_budget is tracked separately from the team budget, so a request can be limited by either one.
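If the key budget and the team budget are both enforced, which is an assumption of this sketch, the headroom available to a team key is the smaller of the two remaining amounts. The function effective_remaining and the spend field on the team record are hypothetical.

```python
def effective_remaining(key_info: dict, team_info: dict) -> float:
    """Return the spend still available to a team key under both budgets."""
    key_left = key_info["max_budget"] - key_info["spend"]
    team_left = team_info["max_budget"] - team_info["spend"]
    # Whichever budget is closer to exhaustion is the binding one.
    return max(0.0, min(key_left, team_left))
```

For example, a key with 54.77 left inside a team with only 20.0 left can spend at most 20.0.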

Key Metadata

Attach custom metadata to keys:

{
  "metadata": {
    "environment": "production",
    "service": "backend-api",
    "owner": "owner@example.com",
    "cost_center": "engineering"
  }
}

Use metadata for:

Cost allocation
Usage tracking
Access auditing
Organizational reporting
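As an example of cost allocation, the sketch below totals spend by a metadata field across a list of key records. It assumes each record carries spend and metadata fields like the key info response; spend_by and the "untagged" label are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def spend_by(keys: List[dict], field: str) -> Dict[str, float]:
    """Sum spend across keys, grouped by the given metadata field."""
    totals: Dict[str, float] = defaultdict(float)
    for info in keys:
        # Keys without metadata, or without this field, are grouped together.
        label = (info.get("metadata") or {}).get(field, "untagged")
        totals[label] += info.get("spend", 0.0)
    return dict(totals)


keys = [
    {"spend": 45.5, "metadata": {"cost_center": "engineering"}},
    {"spend": 10.0, "metadata": {"cost_center": "engineering"}},
    {"spend": 1.5, "metadata": None},
]
totals = spend_by(keys, "cost_center")
```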

Security Best Practices

1. Master Key Protection

Never expose the master key in client applications. Use virtual keys instead.

# Store master key securely
export LITELLM_MASTER_KEY=$(cat /secure/path/master_key.txt)

2. Key Rotation

Rotate keys regularly:

# Generate new key
curl -X POST 'http://localhost:4000/key/generate' ...

# Update applications
# Delete old key
curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"keys": ["old-key"]}'
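The three steps above can be sketched as a single function. This is an illustration under assumptions: manager is any object exposing create_key and delete_key methods (such as the class in Programmatic Key Management), and deploy is a hypothetical callback that pushes the new key to your applications.

```python
from typing import Callable


def rotate_key(manager, old_key: str, deploy: Callable[[str], None],
               **key_settings) -> str:
    """Create a replacement key, roll it out, then revoke the old key."""
    new_key = manager.create_key(**key_settings)["key"]
    try:
        deploy(new_key)
    except Exception:
        # Rollout failed: remove the unused new key and keep the old one active.
        manager.delete_key(new_key)
        raise
    manager.delete_key(old_key)
    return new_key
```

Deleting the old key only after a successful rollout avoids an outage if deployment fails midway.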

3. Principle of Least Privilege

Grant minimum required access:

{
  "models": ["gpt-3.5-turbo"],  // Only specific model
  "max_budget": 10.0,            // Low budget
  "duration": "7d",              // Short expiration
  "rpm": 10                      // Low rate limit
}

4. Monitor Usage

Regularly audit key usage:

# List all keys
curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

# Check spend
curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234'

Programmatic Key Management

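import requests

class LiteLLMKeyManager:
    """Thin wrapper around the proxy's key management endpoints."""

    def __init__(self, base_url, master_key, timeout=10):
        self.base_url = base_url.rstrip('/')
        self.timeout = timeout  # Seconds; avoids hanging on an unreachable proxy
        self.headers = {
            'Authorization': f'Bearer {master_key}',
            'Content-Type': 'application/json'
        }

    def _post(self, path, payload):
        # Write operations are authorized with the master key
        response = requests.post(
            f'{self.base_url}{path}',
            headers=self.headers,
            json=payload,
            timeout=self.timeout
        )
        response.raise_for_status()  # Raise on HTTP errors instead of returning an error body
        return response.json()

    def create_key(self, **kwargs):
        """Create a virtual key; kwargs are the /key/generate parameters."""
        return self._post('/key/generate', kwargs)

    def delete_key(self, key):
        """Revoke a single virtual key."""
        return self._post('/key/delete', {'keys': [key]})

    def get_key_info(self, key):
        """Fetch details for a key, authenticating with the key itself."""
        response = requests.get(
            f'{self.base_url}/key/info',
            headers={'Authorization': f'Bearer {key}'},
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()

# Usage
manager = LiteLLMKeyManager(
    base_url='http://localhost:4000',
    master_key='sk-1234'
)

# Create key
key = manager.create_key(
    models=['gpt-3.5-turbo'],
    max_budget=10.0,
    duration='30d'
)
print(f"Created: {key['key']}")

# Get info
info = manager.get_key_info(key['key'])
print(f"Spend: ${info['spend']}")

# Delete key
manager.delete_key(key['key'])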

Next Steps

Budget Alerts: Set up spending alerts and notifications
Configuration: Advanced proxy configuration
Quick Start: Get started with the proxy
Docker Deployment: Deploy in production
