Deploying a FastAPI application is relatively straightforward, but understanding the key concepts will help you choose the best deployment strategy for your needs.

What Does Deployment Mean?

To deploy an application means to perform the necessary steps to make it available to users. For a web API, deployment typically involves:
  • Putting it on a remote server or cloud platform
  • Using a server program that provides good performance and stability
  • Ensuring your users can access the application efficiently and reliably
This contrasts with the development stage, where you’re constantly changing code, breaking and fixing things, and restarting the development server.

Key Deployment Concepts

When deploying FastAPI applications, there are several critical concepts to understand:
1. Security - HTTPS

Configure SSL/TLS certificates to encrypt traffic between clients and your API. This is essential for production applications.
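In practice, TLS is often terminated by a reverse proxy (Nginx, Traefik) or a cloud load balancer in front of your application. Uvicorn can also serve HTTPS directly; a minimal sketch, assuming you already have a certificate and key (the paths below are placeholders):

```shell
# Serve HTTPS directly from Uvicorn; substitute your own certificate paths.
uvicorn main:app --host 0.0.0.0 --port 443 \
    --ssl-keyfile /etc/certs/key.pem \
    --ssl-certfile /etc/certs/cert.pem
```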
2. Running on Startup

Ensure your application starts automatically when the server boots, without manual intervention.
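On a Linux server, one common way to achieve this is a systemd unit. A sketch, assuming a virtual environment under /srv/myapp and a dedicated appuser account (all placeholders for your setup):

```ini
[Unit]
Description=My FastAPI application
After=network.target

[Service]
User=appuser
WorkingDirectory=/srv/myapp
ExecStart=/srv/myapp/.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8000
# Also handles the automatic-restart concern: systemd relaunches the
# process if it exits unexpectedly.
Restart=always

[Install]
WantedBy=multi-user.target
```

Enabling the unit (systemctl enable --now myapp.service) makes the application start on boot without manual intervention.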
3. Restarts

Configure automatic restarts if your application crashes due to errors or other issues.
4. Replication

Run multiple worker processes to handle concurrent requests and utilize multiple CPU cores.
5. Memory Management

Monitor and optimize memory usage, especially when running multiple processes or handling large data.
6. Pre-Start Steps

Handle tasks like database migrations before starting your application.

ASGI Servers

FastAPI is built on ASGI (Asynchronous Server Gateway Interface). To run your application in production, you need an ASGI server.

Uvicorn

Uvicorn is the recommended ASGI server for FastAPI. It’s lightning-fast and production-ready.
# Install Uvicorn
pip install "uvicorn[standard]"

# Run your application
uvicorn main:app --host 0.0.0.0 --port 8000
The FastAPI CLI uses Uvicorn under the hood, so you can also use fastapi run for production deployments.

Alternative ASGI Servers

While Uvicorn is recommended, other ASGI servers are also compatible:
  • Hypercorn - Supports HTTP/2 and HTTP/3
  • Daphne - Django Channels ASGI server

Deployment Strategies

There are multiple ways to deploy FastAPI applications:

Self-Hosted Server

Deploy on your own server or virtual machine using:
  • Docker containers
  • Process managers (systemd, supervisor)
  • Reverse proxies (Nginx, Traefik)

Cloud Platforms

Use managed cloud services:
  • Platform as a Service (PaaS): Railway, Render, Heroku
  • Container Services: AWS ECS, Google Cloud Run, Azure Container Instances
  • Kubernetes: AWS EKS, Google GKE, Azure AKS
  • Serverless: AWS Lambda, Google Cloud Functions, Azure Functions

Container Orchestration

For larger deployments:
  • Kubernetes - Industry-standard orchestration
  • Docker Swarm - Simpler alternative to Kubernetes
  • Nomad - HashiCorp’s orchestrator
Start simple with a single server deployment, then scale to containers and orchestration as your needs grow.

Process Managers vs. Container Orchestration

Process Managers

For single-server deployments, use process managers to handle worker processes:
# Using Uvicorn with workers
uvicorn main:app --workers 4

# Or with FastAPI CLI
fastapi run --workers 4 main.py

Container Orchestration

For multi-server deployments, let the orchestrator handle replication:
  • One process per container
  • One Uvicorn process (no --workers)
  • Multiple containers managed by Kubernetes/Swarm
Don’t use multiple workers inside containers when using Kubernetes or similar orchestrators - let the orchestrator handle replication instead.
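A container image following this one-process-per-container pattern can be sketched as below; the base image, file layout, and port are placeholders for your project:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Single Uvicorn process, no --workers: the orchestrator replicates
# containers instead of this container replicating processes.
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```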

Performance Considerations

Worker Processes

The number of workers should typically be:
workers = (2 * CPU_cores) + 1
For example, on a 4-core machine:
fastapi run --workers 9 main.py

Async vs. Workers

FastAPI’s async capabilities allow handling many concurrent requests with a single worker. Consider:
  • I/O-bound applications: Fewer workers, leverage async
  • CPU-bound applications: More workers to utilize all cores
A single Uvicorn process can handle thousands of concurrent connections thanks to async/await.
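The effect is easy to demonstrate with plain asyncio, which is the same mechanism FastAPI endpoints use. In this sketch, 100 simulated I/O operations of 50 ms each overlap on one event loop instead of running one after another:

```python
import asyncio
import time

async def fake_io_call(i: int) -> int:
    # Simulate an I/O-bound operation (e.g. a database query or HTTP call).
    await asyncio.sleep(0.05)
    return i

async def main() -> float:
    start = time.perf_counter()
    # 100 "requests" handled concurrently by a single event loop.
    results = await asyncio.gather(*(fake_io_call(i) for i in range(100)))
    assert len(results) == 100
    return time.perf_counter() - start

elapsed = asyncio.run(main())
# Completes far faster than the 5 seconds sequential execution would take,
# because the waits overlap.
print(f"{elapsed:.2f}s")
```

This is why I/O-bound services often need few workers: each worker already multiplexes many in-flight requests.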

Resource Utilization

Aim for efficient resource usage:
  • Target: 50-90% CPU and memory utilization
  • Monitor: Use tools like htop, docker stats, or cloud monitoring
  • Scale: Add workers/containers when consistently above 90%
  • Optimize: Reduce workers/containers if consistently below 50%
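The thresholds above can be expressed as a tiny decision helper, useful in a monitoring script or alert rule. A sketch only; the 50%/90% cutoffs come from the target range above and should be tuned for your workload:

```python
def scaling_action(utilization: float) -> str:
    """Map a CPU/memory utilization fraction (0.0-1.0) to a coarse decision."""
    if utilization > 0.90:
        return "scale up: add workers or containers"
    if utilization < 0.50:
        return "scale down: remove workers or containers"
    return "ok: within the 50-90% target range"

print(scaling_action(0.95))  # scale up: add workers or containers
```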

Next Steps

Explore the specific deployment scenario that matches your platform. The best deployment strategy depends on your requirements: start with the simplest approach that meets your needs, then scale as necessary.
