Deploy your AgentOS applications to production with confidence using these battle-tested strategies and best practices.
Deployment Overview
AgentOS is a FastAPI application and can be deployed anywhere Python web applications run. This guide covers:
- Local development setup
- Production server deployment (Uvicorn, Gunicorn)
- Containerization with Docker
- Cloud platform deployment
- Environment configuration
- Performance optimization
Local Development
For development, use the built-in serve() method:
from agno.os import AgentOS

agent_os = AgentOS(agents=[agent])
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(
        app="app:app",
        host="localhost",
        port=7777,
        reload=True,  # Auto-reload on code changes
    )
Run with:

python app.py
Production Server
Using Uvicorn
Uvicorn is an ASGI server optimized for async Python applications:
# Install uvicorn
pip install uvicorn[standard]
# Run with production settings
uvicorn app:app \
    --host 0.0.0.0 \
    --port 7777 \
    --workers 4 \
    --log-level info
In production, bind to 0.0.0.0 (not localhost) so the server accepts external connections.
Using Gunicorn with Uvicorn Workers
For better process management and zero-downtime restarts:
# Install gunicorn
pip install gunicorn uvicorn[standard]
# Run with gunicorn
gunicorn app:app \
    --workers 4 \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:7777 \
    --timeout 300 \
    --graceful-timeout 120 \
    --access-logfile - \
    --error-logfile -
Increase timeout values for long-running agent workflows; Gunicorn's default timeout is only 30 seconds.
Process Manager Configuration
Create a gunicorn.conf.py:
import multiprocessing
import os
# Server socket
bind = "0.0.0.0:7777"
# Worker processes
workers = int(os.getenv("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))
worker_class = "uvicorn.workers.UvicornWorker"
# Timeouts
timeout = 300 # 5 minutes for long-running agents
graceful_timeout = 120
keepalive = 5
# Logging
accesslog = "-"
errorlog = "-"
loglevel = "info"
# Process naming
proc_name = "agentos"
# Server mechanics
preload_app = True # Load application code before worker processes fork
max_requests = 1000 # Restart workers after this many requests
max_requests_jitter = 50 # Randomize restart to avoid thundering herd
Run with:
gunicorn -c gunicorn.conf.py app:app
Docker Deployment
Basic Dockerfile
Create a production-ready Docker image:
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*
# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Create non-root user
RUN useradd -m -u 1000 agno && chown -R agno:agno /app
USER agno
# Expose port
EXPOSE 7777
# Run with gunicorn
CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:app"]
Multi-stage Build
Optimize image size with multi-stage builds:
# Build stage
FROM python:3.11-slim as builder
WORKDIR /app
RUN apt-get update && apt-get install -y gcc
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
# Runtime stage
FROM python:3.11-slim
WORKDIR /app
# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*
# Create the non-root user first so copied files can be owned by it
RUN useradd -m -u 1000 agno
# Copy installed packages from the builder into the non-root user's home
# (/root/.local would be unreadable after dropping to the agno user)
COPY --from=builder --chown=agno:agno /root/.local /home/agno/.local
# Copy application
COPY --chown=agno:agno . .
# Make sure scripts in .local are on PATH
ENV PATH=/home/agno/.local/bin:$PATH
USER agno
EXPOSE 7777
CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:app"]
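Either Dockerfile benefits from a .dockerignore so secrets, caches, and VCS history never enter the build context or the image; a typical starting point:

```
.git
.env
__pycache__/
*.pyc
.venv/
node_modules/
```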
Docker Compose
Orchestrate AgentOS with PostgreSQL:
version: '3.8'

services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: agno
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: agno
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U agno"]
      interval: 10s
      timeout: 5s
      retries: 5

  agentos:
    build: .
    ports:
      - "7777:7777"
    environment:
      POSTGRES_DB_URL: postgresql+psycopg://agno:${POSTGRES_PASSWORD}@postgres:5432/agno
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      OS_SECURITY_KEY: ${OS_SECURITY_KEY}
      JWT_VERIFICATION_KEY: ${JWT_VERIFICATION_KEY}
    depends_on:
      postgres:
        condition: service_healthy
    restart: unless-stopped

volumes:
  postgres_data:
Run with:
# Create .env file
cat > .env << EOF
POSTGRES_PASSWORD=secure_password
OPENAI_API_KEY=sk-...
OS_SECURITY_KEY=your-secret-key
JWT_VERIFICATION_KEY=your-jwt-key
EOF
# Start services
docker-compose up -d
# View logs
docker-compose logs -f agentos
Railway
Install Railway CLI
npm install -g @railway/cli
Configure environment
Set environment variables in the Railway dashboard:
- OPENAI_API_KEY
- OS_SECURITY_KEY

Railway automatically provides DATABASE_URL.
Render
Create render.yaml:
services:
  - type: web
    name: agentos
    env: python
    buildCommand: pip install -r requirements.txt
    startCommand: gunicorn -c gunicorn.conf.py app:app
    envVars:
      - key: PYTHON_VERSION
        value: 3.11.0
      - key: OPENAI_API_KEY
        sync: false
      - key: OS_SECURITY_KEY
        generateValue: true
      - key: DATABASE_URL
        fromDatabase:
          name: agentos-db
          property: connectionString

databases:
  - name: agentos-db
    databaseName: agno
    user: agno
Deploy via Render dashboard or CLI.
Google Cloud Run
Build container
gcloud builds submit --tag gcr.io/PROJECT_ID/agentos
Deploy to Cloud Run
gcloud run deploy agentos \
    --image gcr.io/PROJECT_ID/agentos \
    --platform managed \
    --region us-central1 \
    --set-env-vars "OPENAI_API_KEY=sk-...,DATABASE_URL=postgresql://..." \
    --timeout 300 \
    --memory 2Gi \
    --allow-unauthenticated

Note: pass all variables in a single --set-env-vars flag; repeating the flag replaces earlier values rather than adding to them.
AWS ECS
Create task definition:
{
  "family": "agentos",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "agentos",
      "image": "YOUR_ECR_REPO/agentos:latest",
      "portMappings": [
        {
          "containerPort": 7777,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "POSTGRES_DB_URL",
          "value": "postgresql://..."
        }
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:..."
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/agentos",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
Environment Configuration
Environment Variables
Create a .env file for local development:
# Database
POSTGRES_DB_URL=postgresql+psycopg://user:pass@localhost:5432/agno
# API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Security
OS_SECURITY_KEY=your-secret-key-at-least-256-bits
JWT_VERIFICATION_KEY=your-jwt-public-key
# AgentOS Configuration
AGNO_API_RUNTIME=production
WEB_CONCURRENCY=4
# Monitoring
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-trace-collector
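One stdlib way to generate a value with 256 bits of randomness for OS_SECURITY_KEY (a sketch; any cryptographically secure generator works):

```python
import secrets

# 32 random bytes = 256 bits of entropy, hex-encoded to 64 characters
key = secrets.token_hex(32)
print(key)
```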
Load environment variables:
import os

from dotenv import load_dotenv

from agno.db.postgres import PostgresDb
from agno.os import AgentOS

load_dotenv()

db = PostgresDb(db_url=os.getenv("POSTGRES_DB_URL"))

agent_os = AgentOS(
    agents=[agent],
    db=db,
    authorization=True,
)
Secrets Management
Never commit secrets to version control. Use environment variables or secret managers.
AWS Secrets Manager
import boto3
import json

def get_secret(secret_name):
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response['SecretString'])

secrets = get_secret('agentos/production')
db_url = secrets['database_url']
Google Secret Manager
from google.cloud import secretmanager

def access_secret(project_id, secret_id):
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode('UTF-8')

db_url = access_secret('my-project', 'database-url')
Database Configuration
Connection Pooling
Configure PostgreSQL connection pools:
from agno.db.postgres import PostgresDb

db = PostgresDb(
    id="production-db",
    db_url="postgresql+psycopg://user:pass@host:5432/agno",
    pool_size=20,       # Default: 5
    max_overflow=40,    # Default: 10
    pool_timeout=30,    # Default: 30
    pool_recycle=3600,  # Recycle connections after 1 hour
)
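Size the pool with the worker count in mind: each Gunicorn worker process opens its own pool, so the worst case is workers × (pool_size + max_overflow) simultaneous connections. A quick sanity check using the numbers above:

```python
workers = 4        # gunicorn --workers
pool_size = 20     # PostgresDb pool_size
max_overflow = 40  # PostgresDb max_overflow

# Worst-case simultaneous connections the Postgres server must accept;
# keep this below the server's max_connections setting
peak_connections = workers * (pool_size + max_overflow)
print(peak_connections)  # 240
```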
Read Replicas
Use read replicas for scaling:
write_db = PostgresDb(
    id="primary-db",
    db_url="postgresql://primary:5432/agno",
)

read_db = PostgresDb(
    id="replica-db",
    db_url="postgresql://replica:5432/agno",
)

# Use read replica for read-heavy agents
analysis_agent = Agent(name="Analyst", db=read_db)

# Use primary for write-heavy agents
writer_agent = Agent(name="Writer", db=write_db)
Database Migrations
AgentOS auto-provisions tables, but for custom schemas use Alembic:
# Install alembic
pip install alembic
# Initialize
alembic init migrations
# Create migration
alembic revision --autogenerate -m "Add custom table"
# Apply migration
alembic upgrade head
Worker Configuration
Calculate optimal worker count:
import multiprocessing
# CPU-bound workloads
workers = multiprocessing.cpu_count() * 2 + 1
# I/O-bound workloads (default for agents)
workers = multiprocessing.cpu_count() * 4
# Memory-constrained environments
workers = min(multiprocessing.cpu_count() * 2, 4)
Async Operations
Ensure async is used for I/O operations:
from agno.db.postgres import PostgresDb

# Use async database
db = PostgresDb(db_url="postgresql+psycopg://...")

# AgentOS handles async automatically
agent_os = AgentOS(
    agents=[agent],
    auto_provision_dbs=True,  # Async table creation
)
Background Tasks
Run hooks and evaluations in background:
agent_os = AgentOS(
    agents=[agent],
    run_hooks_in_background=True,  # Non-blocking hooks
)
Caching
Implement response caching:
from fastapi_cache import FastAPICache
from fastapi_cache.backends.redis import RedisBackend
from fastapi_cache.decorator import cache
from redis import asyncio as aioredis

app = agent_os.get_app()

@app.on_event("startup")
async def startup():
    redis = aioredis.from_url("redis://localhost")
    FastAPICache.init(RedisBackend(redis), prefix="agentos-cache")

# Example endpoint (path is illustrative): responses cached for 60 seconds
@app.get("/reports/summary")
@cache(expire=60)
async def report_summary():
    ...
Health Checks
AgentOS provides a /health endpoint:
curl http://localhost:7777/health
Custom health checks:
app = agent_os.get_app()

@app.get("/health/detailed")
async def detailed_health():
    # check_database() and check_external_apis() are helpers you define
    # Check database connectivity
    db_healthy = await check_database()
    # Check external services
    api_healthy = await check_external_apis()
    return {
        "status": "healthy" if db_healthy and api_healthy else "unhealthy",
        "database": db_healthy,
        "apis": api_healthy,
    }
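check_database() and check_external_apis() are not AgentOS built-ins; you supply them. A minimal, dependency-free sketch is a TCP reachability probe (host and port are placeholders, and a real check should also run a query such as SELECT 1):

```python
import asyncio

async def check_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
        writer.close()
        await writer.wait_closed()
        return True
    except (OSError, asyncio.TimeoutError):
        return False

async def check_database() -> bool:
    # Cheap reachability probe against the Postgres port
    return await check_tcp("localhost", 5432)
```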
Monitoring
Logging
Configure structured logging:
import logging
import json
class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_obj = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
        }
        return json.dumps(log_obj)

handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logging.root.addHandler(handler)
logging.root.setLevel(logging.INFO)
Metrics
Enable built-in metrics:
agent_os = AgentOS(
    agents=[agent],
    tracing=True,  # OpenTelemetry tracing
)
Access metrics:
curl http://localhost:7777/metrics
Tracing
Integrate with OpenTelemetry:
import os

os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://your-collector"

agent_os = AgentOS(
    agents=[agent],
    tracing=True,
    db=db,  # Required for tracing storage
)
Security Best Practices
- Use HTTPS: always serve production traffic over HTTPS with valid SSL certificates.
- Enable authentication: configure JWT authentication and RBAC for all endpoints.
- Rotate secrets: regularly rotate API keys, tokens, and database passwords.
- Network security: use VPCs, security groups, and firewall rules.
HTTPS with Nginx
server {
    listen 443 ssl http2;
    server_name your-domain.com;

    ssl_certificate /etc/ssl/certs/cert.pem;
    ssl_certificate_key /etc/ssl/private/key.pem;

    location / {
        proxy_pass http://localhost:7777;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Troubleshooting
Common Issues
Workers timing out
# Increase timeout in gunicorn.conf.py
timeout = 600 # 10 minutes
Database connection errors
# Check connection pool settings
# Increase pool_size and max_overflow in PostgresDb
Out of memory
# Reduce number of workers
export WEB_CONCURRENCY=2
# Or increase container memory
Slow response times
# Enable background tasks
agent_os = AgentOS(
    agents=[agent],
    run_hooks_in_background=True,
)
Next Steps
- Authentication: secure your deployment with JWT and RBAC
- API Reference: complete endpoint documentation