Overview
Azure OpenAI Service provides REST API access to OpenAI’s models including GPT-4, GPT-3.5-Turbo, and embeddings through Microsoft Azure’s enterprise-grade infrastructure with enhanced security, compliance, and regional availability.
Base URL: `https://{resourceName}.openai.azure.com/openai`
Supported Features
✅ Chat Completions (including streaming)
✅ Completions (legacy)
✅ Embeddings
✅ Image Generation (DALL-E)
✅ Image Editing
✅ Text-to-Speech (TTS)
✅ Speech-to-Text (Whisper)
✅ Audio Translation
✅ Function Calling & Tools
✅ Vision (GPT-4 Vision)
✅ Batch API
✅ Fine-tuning
✅ Multiple Authentication Methods
Quick Start
Basic Configuration
```python
from portkey_ai import Portkey

client = Portkey(
    provider="azure-openai",
    api_key="***",                     # Azure API key
    resource_name="my-resource",       # Your Azure resource name
    deployment_id="gpt-4-deployment",  # Your deployment name
    api_version="2024-02-15-preview"   # API version
)

response = client.chat.completions.create(
    model="gpt-4",  # Not used by Azure; deployment_id is used instead
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Azure OpenAI?"}
    ]
)

print(response.choices[0].message.content)
```
Configuration Options
Required Parameters
| Parameter | Description | Example |
|---|---|---|
| `resource_name` | Azure OpenAI resource name | `my-openai-resource` |
| `deployment_id` | Deployment name in Azure | `gpt-4-deployment` |
| `api_version` | Azure API version | `2024-02-15-preview` |
| `api_key` | Azure API key | `***` |
Authentication Methods
Azure OpenAI supports multiple authentication methods:
1. API Key (Default)
```python
client = Portkey(
    provider="azure-openai",
    api_key="***",
    resource_name="my-resource",
    deployment_id="gpt-4",
    api_version="2024-02-15-preview"
)
```
2. Azure AD Token
```python
client = Portkey(
    provider="azure-openai",
    azure_ad_token="Bearer eyJ0eXAiOiJKV1QiLCJhbGc...",
    resource_name="my-resource",
    deployment_id="gpt-4",
    api_version="2024-02-15-preview"
)
```
3. Entra ID (Service Principal)
```python
client = Portkey(
    provider="azure-openai",
    azure_auth_mode="entra",
    azure_entra_tenant_id="***",
    azure_entra_client_id="***",
    azure_entra_client_secret="***",
    azure_entra_scope="https://cognitiveservices.azure.com/.default",
    resource_name="my-resource",
    deployment_id="gpt-4",
    api_version="2024-02-15-preview"
)
```
4. Managed Identity
```python
client = Portkey(
    provider="azure-openai",
    azure_auth_mode="managed",
    azure_managed_client_id="***",  # Optional
    azure_entra_scope="https://cognitiveservices.azure.com/",
    resource_name="my-resource",
    deployment_id="gpt-4",
    api_version="2024-02-15-preview"
)
```
5. Workload Identity (Kubernetes)
```python
client = Portkey(
    provider="azure-openai",
    azure_auth_mode="workload",
    azure_workload_client_id="***",
    azure_entra_scope="https://cognitiveservices.azure.com/.default",
    resource_name="my-resource",
    deployment_id="gpt-4",
    api_version="2024-02-15-preview"
)

# Requires these environment variables to be set:
# AZURE_AUTHORITY_HOST
# AZURE_TENANT_ID
# AZURE_FEDERATED_TOKEN_FILE
```
API Versions
Azure OpenAI uses API versions for versioning. Common versions:
| API Version | Features | Status |
|---|---|---|
| `2024-02-15-preview` | Latest features, GPT-4 Turbo | Preview |
| `2023-12-01-preview` | GPT-4 Vision, DALL-E 3 | Preview |
| `2023-05-15` | Stable release | GA |
| `v1` | OpenAI-compatible | Special |
Use `api_version="v1"` for OpenAI-compatible endpoints without deployment IDs.
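As a sketch of this mode (assuming the same Portkey parameters used elsewhere on this page), the client omits `deployment_id` and the `model` parameter is honored directly:

```python
from portkey_ai import Portkey

# OpenAI-compatible route: no deployment_id needed; model is used as-is.
client = Portkey(
    provider="azure-openai",
    api_key="***",
    resource_name="my-resource",
    api_version="v1"
)
```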
Available Deployments
You must create deployments in Azure before using them:
| Model | Recommended Deployment Name | Capabilities |
|---|---|---|
| GPT-4 | `gpt-4` | Advanced reasoning |
| GPT-4 Turbo | `gpt-4-turbo` | 128K context |
| GPT-3.5 Turbo | `gpt-35-turbo` | Fast, cost-effective |
| text-embedding-ada-002 | `text-embedding-ada-002` | Embeddings |
| DALL-E 3 | `dall-e-3` | Image generation |
| Whisper | `whisper` | Speech-to-text |
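If you prefer the command line, a deployment can also be created with the Azure CLI. A sketch; the resource name, resource group, model version, and SKU values below are placeholders you would replace with your own:

```shell
# Create a GPT-4 deployment on an existing Azure OpenAI resource.
az cognitiveservices account deployment create \
  --name my-resource \
  --resource-group my-rg \
  --deployment-name gpt-4-deployment \
  --model-name gpt-4 \
  --model-version "0613" \
  --model-format OpenAI \
  --sku-name Standard \
  --sku-capacity 1
```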
Advanced Features
Streaming
```python
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Function Calling
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Seattle?"}],
    tools=tools
)
```
Vision
```python
response = client.chat.completions.create(
    model="gpt-4-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }]
)
```
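For private images, a public HTTPS URL may not be available; a common alternative is to inline the image as a base64 data URL in the `image_url` field. A small helper sketch (the `image_data_url` name is ours, not part of any SDK):

```python
import base64

def image_data_url(path, mime="image/jpeg"):
    """Encode a local image file as a data URL for an image_url content part."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be passed directly as `{"image_url": {"url": image_data_url("photo.jpg")}}`.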
Embeddings
```python
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Azure OpenAI provides enterprise-grade AI"
)

embedding = response.data[0].embedding
```
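Embedding vectors are typically compared with cosine similarity. A dependency-free sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors; real ada-002 embeddings have 1536 dimensions.
sim = cosine_similarity([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])  # identical vectors -> 1.0
```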
Image Generation
```python
response = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic datacenter in the clouds",
    size="1024x1024",
    quality="hd"
)

image_url = response.data[0].url
```
Multi-Region Configuration
Load balance across multiple Azure regions:
```python
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "azure-openai",
            "api_key": "***",
            "resource_name": "eastus-resource",
            "deployment_id": "gpt-4",
            "api_version": "2024-02-15-preview",
            "weight": 0.5
        },
        {
            "provider": "azure-openai",
            "api_key": "***",
            "resource_name": "westus-resource",
            "deployment_id": "gpt-4",
            "api_version": "2024-02-15-preview",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)
```
Fallback to OpenAI
Fallback to standard OpenAI if Azure is unavailable:
```python
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "azure-openai",
            "api_key": "***",
            "resource_name": "my-resource",
            "deployment_id": "gpt-4",
            "api_version": "2024-02-15-preview"
        },
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4"}
        }
    ]
}

client = Portkey().with_options(config=config)
```
Error Handling
```python
from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit: {e}")
except AuthenticationError as e:
    print(f"Auth error: {e}")
except APIError as e:
    print(f"API error: {e}")
```
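Rate-limit errors in particular are usually transient, so a retry loop with exponential backoff is a common companion to these handlers. A generic sketch; in real code you would pass the `RateLimitError` class as the retryable exception instead of the broad default:

```python
import time

def call_with_retries(make_request, retryable=(Exception,),
                      max_retries=3, base=1.0, cap=30.0):
    """Retry a zero-argument request callable with exponential backoff.

    retryable: exception classes worth retrying (e.g. RateLimitError).
    Delays grow as base * 2**attempt, capped at `cap` seconds.
    """
    for attempt in range(max_retries + 1):
        try:
            return make_request()
        except retryable:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            time.sleep(min(cap, base * 2 ** attempt))
```

Usage would look like `call_with_retries(lambda: client.chat.completions.create(...), retryable=(RateLimitError,))`.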
Key Differences from OpenAI
| Aspect | OpenAI | Azure OpenAI |
|---|---|---|
| Authentication | API key | API key, AD token, Managed Identity |
| Endpoint | Fixed | Custom resource name |
| Model specification | `model` parameter | `deployment_id` |
| API versioning | Not required | Required `api_version` |
| Regional | Global | Multi-region support |
| Compliance | Standard | Enterprise (HIPAA, SOC 2, etc.) |
| Data location | US | Choose your region |
Request URL Structure
```
https://{resource_name}.openai.azure.com/openai/deployments/{deployment_id}/{endpoint}?api-version={api_version}
```

Example:

```
https://my-resource.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview
```
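The URL assembly is mechanical, which makes it easy to verify in code. A small helper sketch (the function name is ours, for illustration only):

```python
def azure_openai_url(resource_name, deployment_id, endpoint, api_version):
    """Build a deployment-scoped Azure OpenAI request URL."""
    return (f"https://{resource_name}.openai.azure.com/openai"
            f"/deployments/{deployment_id}/{endpoint}"
            f"?api-version={api_version}")

url = azure_openai_url("my-resource", "gpt-4", "chat/completions",
                       "2024-02-15-preview")
```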
Best Practices
1. **Use Managed Identity** - most secure option for Azure-hosted applications
2. **Deploy to multiple regions** - better availability and latency
3. **Set an appropriate `api_version`** - use stable versions for production
4. **Monitor quota limits** - Azure enforces per-deployment quotas
5. **Use private endpoints** - enhanced security for enterprise workloads
6. **Implement retry logic** - handle transient failures
7. **Cache responses** - reduce costs and latency
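The caching practice can be as simple as an in-memory dict keyed by a hash of the request payload. A sketch; `cached_completion` is a hypothetical helper, not a Portkey API:

```python
import hashlib
import json

_cache = {}

def cache_key(model, messages):
    # Deterministic key over the request payload.
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_completion(client_call, model, messages):
    """Return a cached response for identical (model, messages) requests."""
    key = cache_key(model, messages)
    if key not in _cache:
        _cache[key] = client_call(model=model, messages=messages)
    return _cache[key]

# Usage sketch: cached_completion(client.chat.completions.create, "gpt-4", msgs)
```

Note that an in-memory dict is only suitable for a single process; Portkey configs can also enable gateway-side caching.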
Enterprise Features
- **Private Endpoints**: connect via Azure Private Link
- **Customer-Managed Keys**: bring your own encryption keys
- **Virtual Networks**: restrict access to your VNet
- **Managed Identity**: eliminate credential management
- **Azure Monitor**: full observability integration
- **Compliance**: HIPAA, SOC 2, ISO 27001, GDPR
- **Data Residency**: keep data in your region
Pricing
Azure OpenAI pricing is similar to OpenAI but billed through Azure:
- Azure OpenAI Pricing: view Azure OpenAI Service pricing
- OpenAI: standard OpenAI integration
- Load Balancing: multi-region load balancing
- Fallbacks: fallback configurations
- Enterprise Deployment: enterprise deployment guide