LiteLLM

What is LiteLLM?

LiteLLM is a unified interface for calling 100+ Large Language Models (LLMs) using the OpenAI format. It translates inputs to provider-specific completion, embedding, and image generation endpoints and normalizes the output to a consistent format across all providers. Performance: 8ms P95 latency overhead at 1k RPS.

Key Features

Python SDK

Use LiteLLM directly in your Python code with a simple, unified interface

AI Gateway (Proxy)

Deploy a centralized LLM gateway with authentication, cost tracking, and monitoring

100+ LLM Providers

OpenAI, Azure, Anthropic, Vertex AI, Bedrock, Groq, and many more

Router with Fallbacks

Load balancing, retry logic, and automatic fallbacks across deployments

Use Cases

LLMs - Call 100+ Models

LiteLLM supports all major LLM endpoints, including /chat/completions, /responses, /embeddings, /images, /audio, /batches, /rerank, /a2a, and /messages. Supported providers: OpenAI, Anthropic, Azure, Vertex AI, Bedrock, Groq, Cohere, Mistral, Deepseek, Cerebras, and 90+ more.

Agents - Invoke A2A Agents

Connect to A2A (Agent-to-Agent) protocol agents from multiple providers:
  • LangGraph
  • Vertex AI Agent Engine
  • Azure AI Foundry
  • Bedrock AgentCore
  • Pydantic AI

MCP Tools - Connect MCP Servers

Integrate Model Context Protocol (MCP) servers with any LLM:
  • Load MCP tools in OpenAI format
  • Use with any LiteLLM-supported model
  • Compatible with Cursor IDE and other tools

How to Choose

Python SDK
  • Use Case: Direct integration in your Python codebase
  • Who Uses It? Developers building LLM projects
  • Key Features:
    • Direct Python library integration
    • Router with retry/fallback logic
    • Application-level load balancing
    • Exception handling with OpenAI-compatible errors
    • Observability callbacks (Lunary, MLflow, Langfuse, etc.)

AI Gateway (Proxy)
  • Use Case: Central service (LLM Gateway) to access multiple LLMs
  • Who Uses It? Gen AI Enablement / ML Platform Teams
  • Key Features:
    • Centralized API gateway with auth
    • Multi-tenant cost tracking per project/user
    • Per-project customization (logging, guardrails, caching)
    • Virtual keys for secure access control
    • Admin dashboard UI for monitoring

Quick Start

Python SDK Quick Start

Get started with the Python SDK in 2 minutes

Proxy Quick Start

Deploy your AI Gateway in 5 minutes

OSS Adopters

Trusted by leading organizations:
  • Stripe - Financial infrastructure
  • Netflix - Content streaming
  • Google ADK - AI development
  • OpenAI Agents SDK - Official OpenAI integration
  • OpenHands - AI coding assistant
  • Greptile - Code understanding

Community & Support

Discord

Join our Discord community

Documentation

Full documentation

GitHub

Star us on GitHub

Next Steps

1. Choose Your Path: decide whether you want to use the Python SDK or deploy the AI Gateway.
2. Follow Quick Start: complete the relevant quick start guide for your chosen path.
3. Explore Features: learn about advanced features like caching, fallbacks, and observability.
