Overview
LiteLLM provides comprehensive support for Azure OpenAI Service, allowing you to use GPT-4, GPT-3.5, embeddings, and more through your Azure deployments.
Quick Start
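A minimal quick-start sketch. The deployment name `my-gpt-4o` is hypothetical; the API call only runs when `AZURE_API_KEY` is configured:

```python
import os

# Hypothetical deployment name -- replace with your own Azure deployment.
params = {
    "model": "azure/my-gpt-4o",  # azure/{deployment_name}
    "messages": [{"role": "user", "content": "Hello from LiteLLM on Azure!"}],
}

# Only call the API when Azure credentials are configured.
if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(**params)
    print(response.choices[0].message.content)
```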
Authentication
- Environment Variables
- Direct Parameters
- Azure Active Directory
- Managed Identity
Set Azure credentials via environment variables.
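For example, with hypothetical placeholder values (substitute your own resource name, key, and API version):

```shell
# Hypothetical values -- substitute your own resource details.
export AZURE_API_KEY="my-azure-api-key"
export AZURE_API_BASE="https://my-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-02-15-preview"
```

LiteLLM reads these `AZURE_*` variables automatically when you call an `azure/` model.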
Model Naming
Azure uses deployment names, not model names. Format: `azure/{deployment_name}`
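For instance, a deployment can have any name you chose when creating it in Azure; LiteLLM routes on that name, not on the underlying model. The deployment name below is hypothetical:

```python
import os

# Hypothetical: a deployment named "prod-chat" that happens to serve gpt-4o.
deployment = "prod-chat"
model = f"azure/{deployment}"

# The API call runs only when credentials are configured.
if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": "ping"}],
    )
```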
Common Azure Deployments
GPT-4o
GPT-4
GPT-3.5 Turbo
Embeddings
API Versions
Azure OpenAI uses API versions. Recommended versions:

| Version | Features | Recommended For |
|---|---|---|
| 2024-02-15-preview | Latest features | Production use |
| 2024-08-01-preview | Newest preview | Testing new features |
| 2023-12-01-preview | Stable | Legacy support |
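You can pin an API version per call by passing `api_version`, which takes precedence over the `AZURE_API_VERSION` environment variable. The deployment name here is hypothetical:

```python
import os

# Pin a specific Azure API version for this request.
params = {
    "model": "azure/my-gpt-4o",  # hypothetical deployment
    "messages": [{"role": "user", "content": "Hello"}],
    "api_version": "2024-02-15-preview",  # overrides AZURE_API_VERSION
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(**params)
```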
Streaming
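A streaming sketch: with `stream=True`, LiteLLM yields chunks instead of a single response. The deployment name is hypothetical:

```python
import os

params = {
    "model": "azure/my-gpt-4o",  # hypothetical deployment
    "messages": [{"role": "user", "content": "Write a haiku."}],
    "stream": True,
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    for chunk in litellm.completion(**params):
        # Each chunk carries a delta; content may be None on some chunks.
        print(chunk.choices[0].delta.content or "", end="")
```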
Function Calling
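Function calling uses the standard OpenAI `tools` format, which LiteLLM passes through to Azure. The `get_weather` tool and deployment name below are illustrative:

```python
import os

# Hypothetical tool definition in the OpenAI tools format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(
        model="azure/my-gpt-4o",  # hypothetical deployment
        messages=[{"role": "user", "content": "Weather in Paris?"}],
        tools=tools,
        tool_choice="auto",
    )
    print(response.choices[0].message.tool_calls)
```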
Vision (Multimodal)
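A multimodal sketch: image inputs use the OpenAI content-part message format. The deployment name and image URL are placeholders:

```python
import os

# Multimodal message: text plus an image URL, in the OpenAI content-part format.
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}]

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(model="azure/my-gpt-4o", messages=messages)
```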
Use GPT-4 Vision on Azure.
Embeddings
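An embeddings sketch, assuming a hypothetical embedding deployment named `text-embedding-ada-002`:

```python
import os

params = {
    "model": "azure/text-embedding-ada-002",  # hypothetical deployment
    "input": ["LiteLLM on Azure"],
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.embedding(**params)
    vector = response.data[0]["embedding"]  # list of floats
```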
Generate embeddings using Azure.
Azure Embedding Models
Image Generation (DALL-E)
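An image-generation sketch via LiteLLM's `image_generation`, assuming a hypothetical DALL-E 3 deployment named `dall-e-3`:

```python
import os

params = {
    "model": "azure/dall-e-3",  # hypothetical deployment
    "prompt": "A watercolor lighthouse at dusk",
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.image_generation(**params)
    print(response.data[0].url)  # URL of the generated image
```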
Generate images using DALL-E on Azure.
Audio Transcription (Whisper)
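A transcription sketch via LiteLLM's `transcription`, assuming a hypothetical Whisper deployment named `whisper-1` and a local audio file:

```python
import os

audio_path = "meeting.mp3"  # hypothetical local file

if os.getenv("AZURE_API_KEY"):
    import litellm

    with open(audio_path, "rb") as audio_file:
        response = litellm.transcription(
            model="azure/whisper-1",  # hypothetical deployment
            file=audio_file,
        )
    print(response.text)
```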
Transcribe audio using Whisper on Azure.
Text-to-Speech
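A text-to-speech sketch via LiteLLM's `speech`, assuming a hypothetical TTS deployment named `tts-1`:

```python
import os

text = "Hello from Azure text-to-speech."

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.speech(
        model="azure/tts-1",  # hypothetical deployment
        input=text,
        voice="alloy",
    )
    response.stream_to_file("speech.mp3")  # write the audio to disk
```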
Generate speech from text.
Batch Processing
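A batching sketch: `batch_completion` sends one request per message list. The deployment name is hypothetical:

```python
import os

# One request is issued for each inner message list.
batches = [
    [{"role": "user", "content": "Summarize: the sky is blue."}],
    [{"role": "user", "content": "Summarize: grass is green."}],
]

if os.getenv("AZURE_API_KEY"):
    import litellm

    responses = litellm.batch_completion(
        model="azure/my-gpt-4o",  # hypothetical deployment
        messages=batches,
    )
    for r in responses:
        print(r.choices[0].message.content)
```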
Process requests in batches.
Advanced Features
JSON Mode
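A JSON-mode sketch: `response_format` requests a JSON object, and the prompt should still mention JSON explicitly. The deployment name is hypothetical:

```python
import os

params = {
    "model": "azure/my-gpt-4o",  # hypothetical deployment
    "messages": [{"role": "user", "content": "Return a JSON object with a 'city' key."}],
    "response_format": {"type": "json_object"},
}

if os.getenv("AZURE_API_KEY"):
    import json

    import litellm

    response = litellm.completion(**params)
    data = json.loads(response.choices[0].message.content)
```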
Seed for Reproducibility
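A seed sketch: passing the same `seed` with identical parameters makes sampling best-effort reproducible. The deployment name is hypothetical:

```python
import os

params = {
    "model": "azure/my-gpt-4o",  # hypothetical deployment
    "messages": [{"role": "user", "content": "Pick a random fruit."}],
    "seed": 42,
    "temperature": 0,
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    first = litellm.completion(**params)
    second = litellm.completion(**params)
    # With identical seed and parameters the outputs usually match.
```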
Logprobs
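A logprobs sketch: request per-token log-probabilities on the response. The deployment name is hypothetical:

```python
import os

params = {
    "model": "azure/my-gpt-4o",  # hypothetical deployment
    "messages": [{"role": "user", "content": "Say yes or no."}],
    "logprobs": True,
    "top_logprobs": 2,  # requires logprobs=True
}

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(**params)
    print(response.choices[0].logprobs)
```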
Multiple Azure Deployments
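A load-balancing sketch using LiteLLM's `Router`: entries sharing one `model_name` are balanced across Azure resources. All resource names, deployment names, and env-var names here are hypothetical:

```python
import os

# Hypothetical: two Azure resources serving the same logical model name.
model_list = [
    {
        "model_name": "gpt-4o",
        "litellm_params": {
            "model": "azure/my-gpt-4o-eastus",
            "api_base": "https://eastus-resource.openai.azure.com/",
            "api_key": os.getenv("AZURE_API_KEY_EASTUS", ""),
        },
    },
    {
        "model_name": "gpt-4o",
        "litellm_params": {
            "model": "azure/my-gpt-4o-westus",
            "api_base": "https://westus-resource.openai.azure.com/",
            "api_key": os.getenv("AZURE_API_KEY_WESTUS", ""),
        },
    },
]

if os.getenv("AZURE_API_KEY_EASTUS"):
    from litellm import Router

    router = Router(model_list=model_list)
    response = router.completion(
        model="gpt-4o",  # the shared logical name, not a deployment
        messages=[{"role": "user", "content": "Hello"}],
    )
```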
Use different Azure resources.
Content Filtering
Azure applies content filtering by default.
Error Handling
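An error-handling sketch covering content-filter blocks and rate limits; LiteLLM re-raises provider errors as its own exception types. The deployment name is hypothetical:

```python
import os

messages = [{"role": "user", "content": "Hello"}]

if os.getenv("AZURE_API_KEY"):
    import litellm

    try:
        response = litellm.completion(
            model="azure/my-gpt-4o",  # hypothetical deployment
            messages=messages,
        )
    except litellm.ContentPolicyViolationError:
        print("Request blocked by Azure content filtering.")
    except litellm.RateLimitError:
        print("TPM/RPM limit hit; back off and retry.")
    except litellm.APIError as err:
        print(f"Azure API error: {err}")
```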
Cost Tracking
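A cost-tracking sketch: `completion_cost` computes USD cost from a response's token usage using LiteLLM's built-in price map. The deployment name is hypothetical:

```python
import os

deployment = "my-gpt-4o"  # hypothetical deployment

if os.getenv("AZURE_API_KEY"):
    import litellm

    response = litellm.completion(
        model=f"azure/{deployment}",
        messages=[{"role": "user", "content": "Hello"}],
    )
    cost = litellm.completion_cost(completion_response=response)
    print(f"Request cost: ${cost:.6f}")
```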
Regional Deployments
Azure OpenAI is available in multiple regions.
Best Practices
Use Latest API Version
Always use the latest stable API version for new features and improvements.
Handle Content Filters
Azure applies content filtering by default; handle filtered responses appropriately.
Use Managed Identity
For Azure-hosted apps, use Managed Identity instead of API keys.
Monitor Rate Limits
Track TPM (tokens per minute) and RPM (requests per minute) limits.
Troubleshooting
Deployment Not Found
API Version Issues
Related Documentation
OpenAI
Learn about OpenAI models and features
Streaming
Stream responses in real-time
Function Calling
Implement function calling
Embeddings
Generate embeddings on Azure