
What are Providers?

LiteLLM provides a unified interface to call 100+ LLM providers using the OpenAI format. Instead of learning each provider’s unique API, you can use the same code structure across all providers.

Unified Interface

Use the same completion() function for all providers - just change the model name prefix

OpenAI Format

All responses follow OpenAI’s format, making it easy to switch between providers

100+ Providers

Access models from OpenAI, Anthropic, AWS, Google, Azure, and many more

Provider Features

Streaming, function calling, vision, embeddings - all standardized across providers

Quick Start

Here’s how easy it is to use different providers:
from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
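Switching providers means changing only the model string; the request payload stays identical. A minimal sketch (the calls themselves are commented out because they need live API keys):

```python
# The same request payload works for every provider; only the model
# string changes. (Calls are commented out: they require live API keys.)
messages = [{"role": "user", "content": "Hello!"}]

models = [
    "openai/gpt-4o",
    "anthropic/claude-3-5-sonnet-20240620",
    "groq/llama3-70b-8192",
]

# from litellm import completion
# for m in models:
#     reply = completion(model=m, messages=messages)
#     print(m, "->", reply.choices[0].message.content)
```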

Supported Endpoints

LiteLLM standardizes access to multiple endpoint types:
| Endpoint | Description | Supported Providers |
| --- | --- | --- |
| /chat/completions | Text generation | 100+ providers |
| /embeddings | Text embeddings | OpenAI, Azure, Bedrock, Cohere, Vertex AI, HuggingFace, and more |
| /images/generations | Image generation | OpenAI, Azure, Vertex AI, Bedrock, and more |
| /audio/transcriptions | Speech-to-text | OpenAI, Azure, Groq, Deepgram |
| /audio/speech | Text-to-speech | OpenAI, Azure, ElevenLabs |
| /moderations | Content moderation | OpenAI, Azure |
| /batches | Batch processing | OpenAI, Azure, Anthropic, Bedrock |
| /rerank | Document reranking | Cohere, HuggingFace, Bedrock |
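The non-chat endpoints follow the same unified pattern. For instance, embeddings use LiteLLM's embedding() function with the same provider/model naming; a sketch (the call is commented out since it needs an API key, and the model name here is only illustrative):

```python
# Embedding requests reuse the provider/model naming convention.
embedding_request = {
    "model": "openai/text-embedding-3-small",  # illustrative model name
    "input": ["LiteLLM standardizes embeddings too."],
}

# from litellm import embedding
# response = embedding(**embedding_request)
# vector = response.data[0]["embedding"]  # OpenAI-format response
```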

Provider Categories

OpenAI

GPT-4o, GPT-4o-mini, O1, O3-mini, and more

Anthropic

Claude 4.6, Claude 3.7, Claude 3.5 Sonnet

AWS Bedrock

Claude, Llama, Mistral, Nova, and more on AWS

Google Vertex AI

Gemini 2.0, Gemini 1.5 Pro/Flash on Google Cloud

Azure OpenAI

GPT-4, GPT-3.5, and more on Azure

Key Features Across Providers

Streaming Support

All major providers support streaming responses for real-time output.
response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in response:
    # delta.content can be None on some chunks (e.g. the final one)
    print(chunk.choices[0].delta.content or "", end="")

Function Calling

LiteLLM standardizes function/tool calling across providers.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

response = completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=tools
)
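Because responses follow the OpenAI format regardless of provider, tool calls can be parsed the same way everywhere. A minimal helper (extract_tool_calls is a name made up for this sketch, not part of LiteLLM):

```python
import json

def extract_tool_calls(response):
    """Return (name, arguments) pairs from an OpenAI-format response.

    Works across providers because tool calls are normalized into
    choices[0].message.tool_calls with JSON-encoded argument strings.
    """
    calls = response.choices[0].message.tool_calls or []
    return [(c.function.name, json.loads(c.function.arguments)) for c in calls]
```

If the model chose not to call a tool, tool_calls is empty or None and the helper returns an empty list.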

Vision/Multimodal

Send images to vision-capable models using a consistent format.
response = completion(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://..."}}
        ]
    }]
)
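Local images can be sent the same way by encoding them as a data URL in the image_url field. A sketch (image_to_data_url is a helper name invented for this example):

```python
import base64

def image_to_data_url(raw_bytes, mime="image/png"):
    """Encode raw image bytes as a data: URL for the image_url field."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# messages = [{
#     "role": "user",
#     "content": [
#         {"type": "text", "text": "What's in this image?"},
#         {"type": "image_url",
#          "image_url": {"url": image_to_data_url(open("photo.png", "rb").read())}},
#     ],
# }]
```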

Provider-Specific Features

While LiteLLM provides a unified interface, each provider has unique capabilities:
| Feature | Providers |
| --- | --- |
| Prompt Caching | Anthropic, Vertex AI |
| JSON Mode | OpenAI, Azure, Anthropic, Vertex AI |
| Vision Models | OpenAI, Anthropic, Vertex AI, Azure |
| Batch API | OpenAI, Azure, Anthropic, Bedrock |
| Reasoning Models | OpenAI (O1, O3), Anthropic (Claude 4.6) |
| Computer Use | Anthropic Claude |
| Web Search | Anthropic Claude, Perplexity |
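As an example of opting into one of these features, JSON mode is requested with OpenAI's response_format parameter, which LiteLLM passes through to providers that support it. A sketch (the call is commented out because it needs an API key):

```python
# JSON mode uses OpenAI's response_format parameter.
json_mode_request = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "List three colors as a JSON object."}],
    "response_format": {"type": "json_object"},
}

# import json
# from litellm import completion
# response = completion(**json_mode_request)
# data = json.loads(response.choices[0].message.content)
```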

Model Naming Convention

LiteLLM uses a provider/model-name format:
# Format: provider/model-name
"openai/gpt-4o"              # OpenAI GPT-4o
"anthropic/claude-3-5-sonnet-20240620"  # Anthropic Claude
"bedrock/anthropic.claude-v2"  # AWS Bedrock Claude
"vertex_ai/gemini-2.0-flash-exp"  # Google Vertex AI Gemini
"azure/gpt-4"                # Azure OpenAI GPT-4
"groq/llama3-70b-8192"       # Groq Llama 3
"ollama/llama3"              # Local Ollama

Authentication

Each provider has its own authentication method. Most providers use API keys set via environment variables:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export COHERE_API_KEY="..."
export GROQ_API_KEY="gsk_..."
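Keys can also be set from Python before the first call, or passed per call. A sketch (passing api_key directly on completion() is a LiteLLM feature for most providers, but verify support for your specific provider):

```python
import os

# Option 1: set the key in the process environment; LiteLLM reads the
# same variables as the shell exports above at call time.
os.environ["OPENAI_API_KEY"] = "sk-..."

# Option 2: pass the key directly on the call instead of the environment.
# from litellm import completion
# response = completion(
#     model="openai/gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
#     api_key="sk-...",
# )
```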

Error Handling

LiteLLM standardizes error handling across all providers:
from litellm import completion
from litellm.exceptions import (
    AuthenticationError,
    RateLimitError,
    ContextWindowExceededError,
    APIError
)

try:
    response = completion(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError as e:
    print(f"Invalid API key: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except ContextWindowExceededError as e:
    print(f"Message too long: {e}")
except APIError as e:
    print(f"API error: {e}")

Next Steps

OpenAI

Get started with OpenAI models

Anthropic

Use Claude models with advanced features

Streaming

Learn about streaming responses

Function Calling

Implement tool and function calling
