Model Deployment

Microsoft Foundry offers multiple deployment options, each optimized for different scenarios.

Deployment Methods

Serverless API Deployment

Characteristics:
  • Pay-per-token billing
  • Microsoft-managed infrastructure
  • Automatic scaling
  • No capacity planning
Example:
# The model is reached through a serverless endpoint and billed per token.
# Assumes `client` is an already-configured inference client and `messages`
# is a list of chat messages.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

Provisioned Throughput

Characteristics:
  • Reserved capacity (PTUs)
  • Predictable cost and performance
  • Dedicated resources
  • Fungible across models
Example:
# Illustrative control-plane call reserving dedicated capacity for gpt-4o.
# Assumes `client` is a management client for the Foundry resource.
deployment = client.deployments.create(
    model="gpt-4o",
    sku={
        "name": "ProvisionedManaged",
        "capacity": 100  # Provisioned Throughput Units (PTUs)
    }
)

Managed Compute

Characteristics:
  • Deploy to Azure VMs
  • Billed for VM hours
  • Supports open-source models
  • Full infrastructure control
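Because managed compute is billed for VM hours rather than tokens, a rough break-even estimate against serverless pricing can guide the choice between the two. All rates below are hypothetical placeholders, not actual Azure prices:

```python
# Rough cost comparison: managed compute (VM hours) vs. serverless (per token).
# Both rates are hypothetical placeholders -- look up current Azure pricing.
VM_HOURLY_RATE = 3.50            # hypothetical $/hour for a GPU VM
SERVERLESS_PER_1K_TOKENS = 0.01  # hypothetical $/1K tokens

def monthly_vm_cost(instance_count: int, hours: float = 730) -> float:
    """VM-hour billing: you pay for uptime, regardless of traffic."""
    return instance_count * hours * VM_HOURLY_RATE

def monthly_serverless_cost(tokens_per_month: int) -> float:
    """Pay-per-token billing: cost scales directly with usage."""
    return tokens_per_month / 1000 * SERVERLESS_PER_1K_TOKENS

# Monthly token volume at which one always-on VM matches serverless spend
break_even_tokens = monthly_vm_cost(1) / SERVERLESS_PER_1K_TOKENS * 1000
```

Below the break-even volume, serverless is cheaper; above it, dedicated VMs (or provisioned throughput) start to pay off.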

Deployment Process

1. Select Model: choose from the model catalog based on your requirements.
2. Choose Deployment Option: Serverless API, Provisioned Throughput, or Managed Compute.
3. Configure Settings: region, capacity, and model version.
4. Deploy: create the deployment via the portal, CLI, or SDK.
5. Test: verify the deployment with test requests.

Regional Considerations

  • Model availability varies by region; check Region Support before deploying
  • Consider data residency requirements
  • Evaluate latency for global users
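The considerations above can be combined into a simple selection rule: filter regions by model availability and data residency, then prefer the lowest latency. The availability table and latency figures below are hypothetical, not real Azure data:

```python
# Sketch of region selection. MODEL_REGIONS is a hypothetical availability
# table; consult Region Support for the real one.
MODEL_REGIONS = {
    "gpt-4o": {"eastus", "westeurope", "swedencentral"},  # hypothetical
}

def pick_region(model: str, allowed: set, latency_ms: dict) -> str:
    """Choose the allowed region with the lowest latency that offers the model.

    `allowed` encodes data-residency constraints; `latency_ms` maps each
    region to its measured latency for your users.
    """
    candidates = MODEL_REGIONS.get(model, set()) & allowed
    if not candidates:
        raise ValueError(f"{model} is not available in any allowed region")
    return min(candidates, key=lambda r: latency_ms.get(r, float("inf")))
```

For instance, an EU-only residency policy with lower measured latency to West Europe would select `westeurope` over `swedencentral`.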

Model Lifecycle

  • GA: Full support and SLA
  • Deprecation Notice: 6-12 months warning
  • Deprecated: No new deployments
  • Retired: Model unavailable
Configure deployments to auto-update so they transition seamlessly to newer model versions. See Model Overview for model catalog details.
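The lifecycle stages above translate into a simple deployment gate. The stage names mirror this page; the helper itself is an illustrative sketch, not a Foundry API:

```python
# Lifecycle stages as documented above, mapped to their meaning.
LIFECYCLE = {
    "GA": "full support and SLA",
    "deprecation-notice": "6-12 months warning",
    "deprecated": "no new deployments",
    "retired": "model unavailable",
}

def can_create_deployment(state: str) -> bool:
    """New deployments are allowed until a model reaches 'deprecated'."""
    return state in ("GA", "deprecation-notice")

def can_serve_traffic(state: str) -> bool:
    """Existing deployments keep serving until the model is retired."""
    return state != "retired"
```

A deprecation notice is therefore the signal to plan a migration: new deployments still work, but the window is closing.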