
What is the AI Gateway?

LiteLLM’s AI Gateway (Proxy) is a production-ready server that provides a unified interface to 100+ LLM providers. It acts as a central gateway that handles authentication, load balancing, budgets, rate limits, and observability for all your AI model requests.

Key Features

Unified API Interface

  • OpenAI-Compatible: All requests use the OpenAI API format, regardless of the underlying provider
  • Multi-Provider Support: Access OpenAI, Anthropic, Azure, Bedrock, Vertex AI, Cohere, and 100+ providers through a single endpoint
  • Model Routing: Automatically route requests to the best available model based on your configuration
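
To make the unified interface concrete, here is a minimal sketch using only the Python standard library. The proxy URL, model alias, and virtual key are placeholder assumptions — substitute wherever you deployed the proxy and whichever key it issued:

```python
import json
import urllib.request

# Placeholder values — assumptions, not defaults guaranteed by LiteLLM.
PROXY_URL = "http://localhost:4000/v1/chat/completions"
VIRTUAL_KEY = "sk-my-virtual-key"  # issued by the proxy, not by a provider

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build one OpenAI-format request; the proxy translates it for whichever
    provider backs `model`, so client code never changes per provider."""
    payload = {
        "model": model,  # any alias configured on the proxy
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {VIRTUAL_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# The call shape is identical whether the alias maps to OpenAI, Anthropic,
# Bedrock, or any other configured provider.
req = build_chat_request("claude-3-5-sonnet", "Hello!")
# urllib.request.urlopen(req) would send it once the proxy is running.
```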

Authentication & Authorization

  • Virtual Keys: Generate API keys with custom budgets, rate limits, and model access
  • Team Management: Organize users into teams with shared budgets and permissions
  • JWT Authentication: Support for custom JWT authentication flows
  • Master Key: Secure admin access to all proxy management endpoints

Cost Control & Budgets

  • Budget Tracking: Set budgets per key, user, or team
  • Soft Budgets: Send alerts before hitting budget limits
  • Budget Alerts: Webhook notifications when budgets are exceeded
  • Spend Tracking: Real-time spend tracking across all requests
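
Virtual keys and budgets come together through the proxy's `/key/generate` management endpoint. The sketch below builds such a request; the master key, URL, budget amounts, and the exact set of supported fields are assumptions to verify against your LiteLLM version:

```python
import json
import urllib.request

# Placeholder values — master key and proxy URL are assumptions.
PROXY_BASE = "http://localhost:4000"
MASTER_KEY = "sk-master-1234"  # admin credential set when deploying the proxy

# Field names below follow common LiteLLM key-generation options;
# treat this as a sketch, not an exhaustive or version-exact schema.
payload = {
    "models": ["gpt-4o", "claude-3-5-sonnet"],  # restrict the key to these aliases
    "max_budget": 25.0,        # hard USD cap — requests fail once exceeded
    "soft_budget": 20.0,       # alert threshold — triggers notifications, not failures
    "budget_duration": "30d",  # budget resets every 30 days
}

req = urllib.request.Request(
    f"{PROXY_BASE}/key/generate",
    data=json.dumps(payload).encode(),
    headers={"Authorization": f"Bearer {MASTER_KEY}",
             "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return JSON containing the new virtual key.
```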

Load Balancing & Reliability

  • Automatic Fallbacks: Retry failed requests on alternative deployments
  • Usage-Based Routing: Distribute load across multiple deployments based on usage
  • Health Checks: Continuous monitoring of model availability
  • Rate Limiting: Prevent abuse with configurable rate limits
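
A config sketch of load balancing: two deployments share one `model_name` alias, so the router spreads traffic between them and can fall back to another alias on failure. Deployment names, endpoints, and environment-variable names are placeholders, and the exact `router_settings` options should be checked against your LiteLLM version:

```yaml
model_list:
  - model_name: gpt-4o                  # one public alias...
    litellm_params:
      model: azure/my-gpt4o-eu          # ...backed by two deployments
      api_base: https://eu.example.azure.com
      api_key: os.environ/AZURE_EU_KEY
  - model_name: gpt-4o
    litellm_params:
      model: azure/my-gpt4o-us
      api_base: https://us.example.azure.com
      api_key: os.environ/AZURE_US_KEY

router_settings:
  routing_strategy: usage-based-routing   # spread load by recorded usage
  fallbacks:
    - gpt-4o: ["claude-3-5-sonnet"]       # retry failed requests on another alias
```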

Observability

  • Logging Callbacks: Send logs to Langfuse, Lunary, Helicone, Weights & Biases, and more
  • Prometheus Metrics: Export metrics for monitoring and alerting
  • Request Tracing: Track requests across the entire lifecycle
  • Admin Dashboard: Web UI for managing keys, users, and viewing analytics
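
Observability is wired up through callbacks in the config. A minimal sketch (the Langfuse credentials are placeholders, and the callback names available depend on your installed integrations):

```yaml
litellm_settings:
  success_callback: ["langfuse", "prometheus"]  # log successes, export metrics
  failure_callback: ["langfuse"]                # log failures too

environment_variables:
  LANGFUSE_PUBLIC_KEY: "pk-placeholder"
  LANGFUSE_SECRET_KEY: "sk-placeholder"
```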

Architecture

Client Applications
        │  [API Keys]
        ▼
LiteLLM Proxy Server
        │  [Load Balancer]
        ▼
OpenAI · Anthropic · Azure · Bedrock · Vertex AI · Cohere · ... 100+ providers

Common Use Cases

Enterprise AI Gateway

  • Centralized control over all AI model access
  • Budget management and cost allocation
  • Compliance and audit logging
  • Team-based access control

Development Platform

  • Provide AI capabilities to multiple applications
  • Manage API keys for different environments
  • Track usage across projects
  • Test different models without code changes

Production Applications

  • High availability with automatic fallbacks
  • Load balancing across deployments
  • Rate limiting and abuse prevention
  • Observability and monitoring

How It Works

  1. Deploy the Proxy: Run the LiteLLM proxy server with your configuration file
  2. Configure Models: Define your model deployments in config.yaml with API keys and settings
  3. Generate Virtual Keys: Create API keys for your applications with specific budgets and permissions
  4. Make Requests: Use the virtual keys to make OpenAI-compatible requests to any provider
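
For step 2, a minimal config.yaml might look like the sketch below. Model aliases and environment-variable names are placeholders; `os.environ/...` tells the proxy to read the value from the environment:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY  # admin key for management endpoints
```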

Request Flow

  1. Authentication: Client sends request with virtual key
  2. Authorization: Proxy validates the key and checks its budgets and permissions
  3. Routing: Request is routed to the appropriate model deployment
  4. Execution: Provider API is called with transformed request
  5. Response: Response is transformed to OpenAI format and returned
  6. Logging: Request metadata is logged to configured callbacks
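
The six stages can be sketched as a toy pipeline. Everything here is illustrative — none of these names correspond to actual LiteLLM internals:

```python
# Toy model of the request lifecycle: authenticate -> route -> execute -> log.
KEYS = {"sk-virtual-1": {"budget_left": 5.0, "models": {"gpt-4o"}}}
LOG: list[dict] = []

def authenticate(key: str) -> dict:
    """Stages 1-2: validate the key and check its budget."""
    info = KEYS.get(key)
    if info is None or info["budget_left"] <= 0:
        raise PermissionError("invalid key or budget exhausted")
    return info

def route(info: dict, model: str) -> str:
    """Stage 3: pick a deployment the key is allowed to use."""
    if model not in info["models"]:
        raise PermissionError(f"key cannot access {model}")
    return f"deployment-for-{model}"

def execute(deployment: str, prompt: str) -> dict:
    """Stages 4-5: call the provider, return an OpenAI-format response."""
    return {"choices": [{"message": {
        "role": "assistant",
        "content": f"[{deployment}] echo: {prompt}"}}]}

def handle(key: str, model: str, prompt: str) -> dict:
    info = authenticate(key)
    deployment = route(info, model)
    response = execute(deployment, prompt)
    LOG.append({"key": key, "model": model})  # Stage 6: log request metadata
    return response

resp = handle("sk-virtual-1", "gpt-4o", "hi")
```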

Core Components

Proxy Server

The main FastAPI server (proxy_server.py) that handles all incoming requests and routes them to the appropriate handlers.

Router

Load balancing component that distributes requests across multiple model deployments based on health, usage, and configuration.

Database

Optional PostgreSQL database for storing:
  • Virtual keys and their configurations
  • User and team information
  • Spend tracking and budget data
  • Request logs

Cache Layer

Redis-based caching for:
  • API key validation
  • User/team data
  • Response caching (optional)
  • Health check coordination
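
Enabling the Redis cache layer is a config change. A sketch, with placeholder connection details — the `cache_params` field names follow common LiteLLM configs but should be verified against your version:

```yaml
litellm_settings:
  cache: true            # enable caching through Redis
  cache_params:
    type: redis
    host: "localhost"    # placeholder — point at your Redis instance
    port: 6379
    password: os.environ/REDIS_PASSWORD
```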

Next Steps

  • Quick Start: Get started with the AI Gateway in 5 minutes
  • Docker Deployment: Deploy with Docker and Docker Compose
  • Configuration: Learn about all configuration options
  • Virtual Keys: Manage API keys and authentication
