Security Best Practices

Authentication and Authorization

Master Key

Always set a strong LITELLM_MASTER_KEY in production. This key has full admin access.

Generate a secure master key:

# Generate a cryptographically secure key (LiteLLM keys are expected to start with "sk-")
echo "sk-$(openssl rand -hex 32)"

# Or use Python
python -c "import secrets; print('sk-' + secrets.token_hex(32))"

Set in environment:

LITELLM_MASTER_KEY=sk-a7f9c8e2d1b4f6a8c9e3d2b5f7a9c1e4d3b6f8a2c5e7d9b4f6a8c1e3d5b7f9a2

Or in config:

config.yaml

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
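
A deployment script could refuse to start when the master key looks weak. The sketch below is one way to do that; the helper names `generate_master_key` and `is_strong_master_key` and the 64-hex-character minimum are illustrative assumptions, not LiteLLM behavior.

```python
import re
import secrets


def generate_master_key() -> str:
    """Return an 'sk-' prefixed key built from 32 random bytes (64 hex chars)."""
    return "sk-" + secrets.token_hex(32)


def is_strong_master_key(key: str, min_hex_chars: int = 64) -> bool:
    """Hypothetical check: 'sk-' prefix followed by enough lowercase hex characters."""
    match = re.fullmatch(r"sk-([0-9a-f]+)", key)
    return bool(match) and len(match.group(1)) >= min_hex_chars
```

A short key such as `sk-1234` fails this check, while a key produced by either generation command above passes.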

Virtual Keys

Create scoped API keys with rate limits and budgets:

curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "key_alias": "frontend-app",
    "models": ["gpt-4o", "claude-sonnet-4"],
    "max_budget": 100.0,
    "budget_duration": "30d",
    "rpm_limit": 100,
    "tpm_limit": 100000,
    "team_id": "team-engineering",
    "metadata": {
      "environment": "production",
      "app": "chat-widget"
    }
  }'

Response:

{
  "key": "sk-litellm-a1b2c3d4...",
  "key_name": "frontend-app",
  "expires": null,
  "models": ["gpt-4o", "claude-sonnet-4"],
  "max_budget": 100.0,
  "budget_duration": "30d"
}

Virtual keys are hashed before storage. The actual key is only shown once during creation.
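
To illustrate why a stored key cannot be shown again, here is a minimal sketch that assumes a plain SHA-256 digest. The function name `hash_virtual_key` is hypothetical, and the exact scheme LiteLLM uses may differ.

```python
import hashlib


def hash_virtual_key(key: str) -> str:
    """Return the hex SHA-256 digest of a virtual key (assumed hashing scheme)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Only the digest would be stored; the plaintext key cannot be recovered from it.
stored = hash_virtual_key("sk-litellm-example")
print(len(stored))  # a SHA-256 hex digest is always 64 characters long
```

Because only the digest is kept, a lost key has to be regenerated rather than looked up.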

Team-Based Access Control

Create teams with budgets:

curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "team_alias": "engineering",
    "models": ["gpt-4o", "claude-sonnet-4"],
    "max_budget": 1000.0,
    "budget_duration": "30d",
    "rpm_limit": 500,
    "tpm_limit": 500000
  }'

Assign users to teams:

curl -X POST http://localhost:4000/team/member_add \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "team-123",
    "member": {
      "user_id": "user@example.com",
      "role": "user"
    }
  }'

SSO Integration

Example configurations follow for three providers: Google OAuth, Microsoft Azure AD, and Okta.

Google OAuth:

config.yaml

general_settings:
  ui_access_mode: sso
  sso_providers:
    - provider: google
      client_id: os.environ/GOOGLE_CLIENT_ID
      client_secret: os.environ/GOOGLE_CLIENT_SECRET
      allowed_domains:
        - company.com
        - subsidiary.com
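
The `allowed_domains` list implies a check on the domain of the signed-in user's email. A minimal sketch of such a check follows; `is_allowed_email` is a hypothetical helper, not part of LiteLLM, and it assumes exact domain matching.

```python
def is_allowed_email(email: str, allowed_domains: list[str]) -> bool:
    """Return True when the email's domain exactly matches an allowed domain."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable address
    allowed = {d.lower() for d in allowed_domains}
    return domain.lower() in allowed
```

Exact matching matters here: a suffix or substring comparison would also accept lookalike domains such as `evil-company.com`.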

Set environment variables:

GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your-secret

Microsoft Azure AD:

config.yaml

general_settings:
  ui_access_mode: sso
  sso_providers:
    - provider: azure
      client_id: os.environ/AZURE_CLIENT_ID
      client_secret: os.environ/AZURE_CLIENT_SECRET
      tenant_id: os.environ/AZURE_TENANT_ID

Set environment variables:

AZURE_CLIENT_ID=your-application-id
AZURE_CLIENT_SECRET=your-secret
AZURE_TENANT_ID=your-tenant-id

Okta:

config.yaml

general_settings:
  ui_access_mode: sso
  sso_providers:
    - provider: okta
      client_id: os.environ/OKTA_CLIENT_ID
      client_secret: os.environ/OKTA_CLIENT_SECRET
      issuer: https://your-domain.okta.com

Secrets Management

Environment Variables

Never commit API keys to Git. Use environment variables or secrets management systems.

Reference in config.yaml:

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: os.environ/OPENAI_API_KEY  # ✅ Good
      # api_key: sk-proj-xxx  # ❌ Bad - never hardcode
  
  - model_name: azure-gpt4
    litellm_params:
      model: azure/gpt-4
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
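
The `os.environ/NAME` convention can be pictured as a lookup performed when the config is loaded. The sketch below shows one way such references could be resolved; `resolve_value` and `resolve_config` are hypothetical helpers, not LiteLLM's actual loader.

```python
import os

PREFIX = "os.environ/"


def resolve_value(value):
    """Resolve an 'os.environ/NAME' reference; return any other value unchanged."""
    if isinstance(value, str) and value.startswith(PREFIX):
        name = value[len(PREFIX):]
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]
    return value


def resolve_config(node):
    """Recursively resolve references in nested dicts and lists."""
    if isinstance(node, dict):
        return {k: resolve_config(v) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_config(v) for v in node]
    return resolve_value(node)
```

Failing loudly on a missing variable avoids starting the proxy with an empty credential.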

Kubernetes Secrets

Create secrets:

kubectl create secret generic litellm-secrets \
  --from-literal=LITELLM_MASTER_KEY=sk-... \
  --from-literal=OPENAI_API_KEY=sk-proj-... \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-... \
  --from-literal=DATABASE_URL=postgresql://...

Mount in deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm
spec:
  template:
    spec:
      containers:
        - name: litellm
          envFrom:
            - secretRef:
                name: litellm-secrets
          # Or individual env vars
          env:
            - name: LITELLM_MASTER_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: LITELLM_MASTER_KEY

AWS Secrets Manager

Store secrets in AWS:

aws secretsmanager create-secret \
  --name litellm/prod/api-keys \
  --secret-string '{
    "OPENAI_API_KEY": "sk-proj-...",
    "ANTHROPIC_API_KEY": "sk-ant-..."
  }'

Retrieve in application:

import boto3
import json
import os

def load_secrets():
    client = boto3.client('secretsmanager', region_name='us-east-1')
    secret = client.get_secret_value(SecretId='litellm/prod/api-keys')
    secrets = json.loads(secret['SecretString'])
    
    for key, value in secrets.items():
        os.environ[key] = value

load_secrets()

HashiCorp Vault

config.yaml

general_settings:
  secret_manager:
    type: vault
    vault_address: https://vault.company.com
    vault_token: os.environ/VAULT_TOKEN
    vault_mount_path: secret/litellm

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: vault/openai/api_key  # Fetches from Vault

Network Security

TLS/SSL Configuration

Enable HTTPS in production:

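Terminate TLS at a reverse proxy or ingress in front of LiteLLM. Examples follow for an NGINX reverse proxy, Traefik, and a Kubernetes Ingress.

NGINX reverse proxy: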

server {
    listen 443 ssl http2;
    server_name api.yourdomain.com;
    
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    
    location / {
        proxy_pass http://litellm:4000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Security headers
        add_header X-Content-Type-Options nosniff;
        add_header X-Frame-Options DENY;
        add_header X-XSS-Protection "1; mode=block";
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    }
}

Traefik:

http:
  routers:
    litellm:
      rule: "Host(`api.yourdomain.com`)"
      tls:
        certResolver: letsencrypt
      service: litellm
  
  services:
    litellm:
      loadBalancer:
        servers:
          - url: "http://litellm:4000"

Kubernetes Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: litellm-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - api.yourdomain.com
      secretName: litellm-tls
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: litellm
                port:
                  number: 4000

Firewall Rules

Allow only necessary traffic:

# AWS Security Group
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0  # HTTPS from anywhere

aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol tcp \
  --port 5432 \
  --source-group sg-litellm  # PostgreSQL only from LiteLLM

GCP Firewall:

gcloud compute firewall-rules create allow-https \
  --allow tcp:443 \
  --source-ranges 0.0.0.0/0 \
  --target-tags litellm

gcloud compute firewall-rules create allow-postgres \
  --allow tcp:5432 \
  --source-tags litellm \
  --target-tags postgres

Network Policies (Kubernetes)

networkpolicy.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-netpol
spec:
  podSelector:
    matchLabels:
      app: litellm
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 4000
  egress:
    # Allow to database
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432
    # Allow to Redis
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    # Allow external API calls (OpenAI, Anthropic, etc.)
    # An ipBlock is required here: a namespaceSelector only matches in-cluster pods
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
    # Allow DNS
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53

Rate Limiting and DDoS Protection

Built-in Rate Limiting

Global rate limits:

config.yaml

general_settings:
  max_parallel_requests: 100  # Concurrent requests
  global_max_parallel_requests: 1000
  
router_settings:
  rpm_limit: 10000  # Requests per minute
  tpm_limit: 1000000  # Tokens per minute

Per-key rate limits:

curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "rpm_limit": 100,
    "tpm_limit": 100000,
    "max_parallel_requests": 10
  }'

Per-team rate limits:

curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "team_alias": "engineering",
    "rpm_limit": 1000,
    "tpm_limit": 1000000
  }'
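
To make the meaning of an `rpm_limit` concrete, here is a minimal sliding-window limiter. It is a sketch of the general technique only; the class name is hypothetical and LiteLLM's own limiter may use a different algorithm.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have left the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Passing the current time as an argument (for example `time.monotonic()`) keeps the limiter easy to test with fixed timestamps.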

NGINX Rate Limiting

http {
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    
    server {
        location / {
            limit_req zone=api_limit burst=20 nodelay;
            limit_conn addr 10;
            
            proxy_pass http://litellm:4000;
        }
    }
}
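
The interaction of `rate`, `burst`, and `nodelay` can be surprising, so the following is a simplified model of the leaky-bucket accounting that `limit_req` performs per client address. It is an approximation written for illustration, not NGINX code, and the class name is an assumption.

```python
class LeakyBucket:
    """Simplified model of limit_req with `nodelay`: track excess requests."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.excess = 0.0
        self.last = None  # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:
            self.last = now
            return True  # the first request starts with zero excess
        # Excess drains at `rate` per second; each new request adds one.
        excess = max(self.excess - self.rate * (now - self.last), 0.0) + 1.0
        if excess > self.burst:
            return False  # NGINX would reject (status 503 unless limit_req_status is set)
        self.excess = excess
        self.last = now
        return True
```

With `rate=100r/s` and `burst=20`, this model accepts 21 requests arriving at the same instant and rejects the rest until the excess drains.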

Cloudflare Protection

Use Cloudflare (or similar CDN) for DDoS protection and rate limiting.

Benefits:

DDoS mitigation
WAF (Web Application Firewall)
Rate limiting at edge
Bot detection
Geographic restrictions

Data Protection

Database Encryption

Enable PostgreSQL SSL:

config.yaml

general_settings:
  database_url: postgresql://user:pass@host:5432/db?sslmode=require

Encryption at rest:

# AWS RDS
aws rds create-db-instance \
  --db-instance-identifier litellm-db \
  --storage-encrypted \
  --kms-key-id arn:aws:kms:us-east-1:123456789:key/xxx

# GCP Cloud SQL
gcloud sql instances create litellm-db \
  --database-version=POSTGRES_16 \
  --tier=db-n1-standard-2 \
  --disk-encryption-key=projects/PROJECT/locations/LOCATION/keyRings/RING/cryptoKeys/KEY

Redacting Sensitive Data

Mask PII in logs:

config.yaml

general_settings:
  redact_user_api_key_info: true  # Don't log API keys
  redact_messages_in_logs: true   # Don't log message content

litellm_settings:
  drop_params: true  # Drop request params the target provider does not support (does not affect logging)
  
router_settings:
  allowed_cache_controls: []  # Disable caching sensitive data
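
If you also ship logs to your own systems, the same idea can be applied before a record leaves the application. The sketch below masks message content and the API key; the record shape and the `redact_log_record` helper are hypothetical.

```python
import copy

REDACTED = "[REDACTED]"


def redact_log_record(record: dict) -> dict:
    """Return a copy of a log record with message content and API key masked."""
    safe = copy.deepcopy(record)  # leave the caller's record untouched
    for message in safe.get("messages", []):
        if "content" in message:
            message["content"] = REDACTED
    if "api_key" in safe:
        safe["api_key"] = REDACTED
    return safe
```

Non-sensitive fields such as the model name and message roles are kept, so the redacted record remains useful for debugging.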

Request/Response Encryption

End-to-end encryption:

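from cryptography.fernet import Fernet, InvalidToken
from openai import OpenAI

# OpenAI SDK (v1+) client pointed at the LiteLLM proxy; the key shown is a placeholder
client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-...")

# In practice, load this key from a secrets manager instead of generating it per run
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt sensitive data before sending
message = "Confidential information"
encrypted = cipher.encrypt(message.encode())

# Send to LiteLLM. The model only sees ciphertext and cannot reason about the
# plaintext, so this suits opaque pass-through or storage use cases only.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": encrypted.decode()}]
)

# Decrypt the response. This succeeds only if the reply is a valid Fernet token
# (for example, the ciphertext echoed back); ordinary model output is not.
reply = response.choices[0].message.content
try:
    decrypted = cipher.decrypt(reply.encode()).decode()
except InvalidToken:
    decrypted = None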

Compliance

GDPR Compliance

Data Minimization

Only store necessary data:

general_settings:
  store_model_in_db: true
  store_audit_logs: true
  log_retention_days: 90  # Automatic cleanup
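
A retention window reduces to a cutoff-date comparison. The sketch below selects records older than the window; it assumes each record carries a timezone-aware `startTime`, and `expired_records` is a hypothetical helper rather than LiteLLM's cleanup job.

```python
from datetime import datetime, timedelta, timezone


def expired_records(records, retention_days, now=None):
    """Return records whose 'startTime' is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["startTime"] < cutoff]
```

Running a check like this against an export is one way to confirm that automatic cleanup is actually removing old rows.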

Right to Deletion

Implement user data deletion:

# Delete user data
curl -X POST http://localhost:4000/user/delete \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_ids": ["user@example.com"]}'

This removes:

User account
API keys
Spend logs (deleted or anonymized)
Personal metadata

Data Export

Allow users to export their data:

# Export user data
curl http://localhost:4000/user/[email protected] \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  > user_data.json

Consent Management

Track user consent:

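from openai import OpenAI

# OpenAI SDK client pointed at the LiteLLM proxy; the key shown is a placeholder
client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-...")

# Add consent metadata
client.chat.completions.create(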
    model="gpt-4o",
    messages=[...],
    extra_body={
        "metadata": {
            "user_consent": "analytics:true,storage:true",
            "consent_date": "2024-01-01"
        }
    }
)
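
When the consent string is read back from stored metadata, it helps to parse it into a structured form. A minimal sketch follows, assuming the comma-separated `name:flag` format used in the example; `parse_consent` is a hypothetical helper.

```python
def parse_consent(value: str) -> dict:
    """Parse 'analytics:true,storage:true' into {'analytics': True, 'storage': True}."""
    consent = {}
    for item in value.split(","):
        name, sep, flag = item.strip().partition(":")
        if not sep or not name:
            raise ValueError(f"malformed consent entry: {item!r}")
        consent[name] = flag.strip().lower() == "true"
    return consent
```

Anything other than `true` is treated as no consent, which is the safer default.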

SOC 2 / ISO 27001

Audit logging:

-- View audit trail
SELECT 
  updated_at,
  changed_by,
  action,
  table_name,
  object_id,
  before_value,
  updated_values
FROM "LiteLLM_AuditLog"
ORDER BY updated_at DESC;

Access logs:

general_settings:
  detailed_debug: true  # Detailed access logs
  
litellm_settings:
  success_callback: ["langfuse", "prometheus"]
  failure_callback: ["langfuse", "sentry"]

HIPAA Compliance

For HIPAA compliance, make sure you have a signed BAA (Business Associate Agreement) with every LLM provider that may receive PHI.

Requirements:

Enable encryption at rest and in transit
Implement access controls (RBAC)
Audit all access to PHI
Use HIPAA-compliant infrastructure (AWS, GCP, Azure)
Sign BAAs with:
- OpenAI (via Azure OpenAI)
- Anthropic (Enterprise plan)
- Google (Vertex AI)
- AWS (Bedrock)

Configuration:

general_settings:
  database_url: postgresql://...?sslmode=require
  redact_messages_in_logs: true
  
litellm_settings:
  drop_params: true
  
router_settings:
  allowed_cache_controls: []  # No caching of PHI

Container Security

Base Image Security

LiteLLM uses Chainguard Wolfi base images for minimal attack surface:

# Security-focused base image
ARG LITELLM_RUNTIME_IMAGE=cgr.dev/chainguard/wolfi-base

FROM $LITELLM_RUNTIME_IMAGE AS runtime

# Minimal dependencies
RUN apk add --no-cache bash openssl tzdata nodejs npm python3 py3-pip

# Non-root user (automatically handled)
USER nonroot

Security Scanning

Scan images for vulnerabilities:

# Trivy
trivy image ghcr.io/berriai/litellm:main-stable

# Grype
grype ghcr.io/berriai/litellm:main-stable

# Snyk
snyk container test ghcr.io/berriai/litellm:main-stable

Pod Security Standards (Kubernetes)

apiVersion: v1
kind: Pod
metadata:
  name: litellm
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  
  containers:
    - name: litellm
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: false  # Prisma needs write access
        capabilities:
          drop:
            - ALL

Monitoring and Incident Response

Security Monitoring

Monitor for suspicious activity:

-- Unusual spend patterns
SELECT 
  api_key,
  COUNT(*) as requests,
  SUM(spend) as total_spend
FROM "LiteLLM_SpendLogs"
WHERE "startTime" >= NOW() - INTERVAL '1 hour'
GROUP BY api_key
HAVING SUM(spend) > 100
ORDER BY total_spend DESC;

-- Failed requests per key (a spike can indicate probing or a misused key)
SELECT 
  api_key,
  COUNT(*) as failures
FROM "LiteLLM_SpendLogs"
WHERE status = 'error'
  AND "startTime" >= NOW() - INTERVAL '1 hour'
GROUP BY api_key
HAVING COUNT(*) > 10;
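
The same threshold logic can run in application code over exported rows, for example in a scheduled job. This sketch mirrors the spend query above; the row shape (`api_key`, `spend`) follows the columns used there, and `flag_high_spend` is a hypothetical helper.

```python
from collections import defaultdict


def flag_high_spend(rows, threshold=100.0):
    """Group rows by api_key; return (api_key, requests, total_spend) above threshold."""
    totals = defaultdict(lambda: [0, 0.0])  # api_key -> [request count, total spend]
    for row in rows:
        entry = totals[row["api_key"]]
        entry[0] += 1
        entry[1] += row["spend"]
    flagged = [(key, n, spend) for key, (n, spend) in totals.items() if spend > threshold]
    # Highest spenders first, like ORDER BY total_spend DESC.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

The threshold of 100 matches the `HAVING` clause in the query; tune it to your normal hourly spend.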

Prometheus alerts:

- alert: SuspiciousActivity
  expr: rate(litellm_requests_total{status="error"}[5m]) > 10
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "High error rate from single source"

- alert: UnusualSpend
  expr: increase(litellm_spend_total[1h]) > 1000
  labels:
    severity: warning
  annotations:
    summary: "Unusual spend spike detected"

Incident Response Plan

Detection

Monitor alerts
Review audit logs
Check error rates

Containment

# Immediately block compromised key
curl -X POST http://localhost:4000/key/block \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "sk-litellm-compromised"}'

# Block an entire team (and its keys) if needed
curl -X POST http://localhost:4000/team/block \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "team-123"}'

Investigation

-- Trace all requests from compromised key
SELECT 
  request_id,
  model,
  "user",
  spend,
  "startTime",
  messages
FROM "LiteLLM_SpendLogs"
WHERE api_key = 'hashed-key'
ORDER BY "startTime" DESC;

Recovery

Rotate master key
Regenerate affected virtual keys
Update provider API keys
Patch vulnerabilities

Post-Mortem

Document incident
Update security policies
Improve monitoring
Train team

Security Checklist

Next Steps

Monitoring

Set up security monitoring and alerts

High Availability

Deploy securely at scale

Performance

Optimize without compromising security

Troubleshooting

Debug security-related issues
