Deployment Options
Kortix supports multiple production deployment strategies:

Docker Compose
Simple deployment for single-server setups

AWS ECS
Managed container orchestration with auto-scaling

AWS EKS
Kubernetes-based deployment for enterprise scale

AWS Lightsail
Cost-effective cloud hosting for small workloads
Docker Compose Deployment
Best for single-server deployments or small teams.

Server Setup
Provision Server
Set up a Linux server (Ubuntu 22.04 LTS recommended) with:
- 4+ CPU cores
- 8GB+ RAM
- 50GB+ disk space
- Public IP address
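Once the server is provisioned, the container runtime needs to be installed. A minimal sketch for Ubuntu 22.04, using Docker's convenience script (pin specific versions for production):

```shell
# Install Docker Engine and the Compose plugin (Ubuntu 22.04)
curl -fsSL https://get.docker.com | sudo sh

# Allow the current user to run docker without sudo
# (log out and back in for the group change to take effect)
sudo usermod -aG docker "$USER"

# Verify the installation
docker --version
docker compose version
```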
SSL/HTTPS Setup
Use Nginx or Caddy as a reverse proxy with automatic HTTPS.

AWS ECS Deployment
Managed container orchestration with auto-scaling and high availability.

Architecture
Prerequisites
- AWS account with appropriate permissions
- AWS CLI configured
- Domain name configured in Route53 or Cloudflare
- Container image pushed to ECR or GHCR
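Pushing the backend image to ECR can be sketched as follows (the account ID, region, and repository name below are placeholders, not values from this project):

```shell
# Authenticate Docker to ECR (account ID and region are placeholders)
aws ecr get-login-password --region us-west-2 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-west-2.amazonaws.com

# Tag the locally built image and push it
docker tag suna-api:latest 123456789012.dkr.ecr.us-west-2.amazonaws.com/suna-api:prod
docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/suna-api:prod
```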
Infrastructure Setup
Kortix includes Pulumi infrastructure-as-code for ECS.

ECS Configuration
Key ECS settings from the infrastructure:

| Setting | Value | Description |
|---|---|---|
| Task CPU | 2048 (2 vCPU) | Per task |
| Task Memory | 4096 MB | Per task |
| Desired Count | 2-6 tasks | Auto-scales |
| Deployment Type | Rolling update | Zero downtime |
| Health Check | /v1/health-docker | Endpoint |
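Applying the Pulumi stack typically looks like the following sketch (the project directory and stack name are assumptions, not confirmed paths in this repository):

```shell
# From the Pulumi project directory (path is illustrative)
cd infra/ecs

# Select (or create) the production stack
pulumi stack select prod   # or: pulumi stack init prod

# Preview and apply the ECS resources
pulumi up
```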
Auto-Scaling
ECS auto-scaling configuration:

- Peak hours (Mon-Fri 6AM-6PM PT): 3-10 tasks
- Off-peak: 2-6 tasks
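The schedule above could be expressed with Application Auto Scaling scheduled actions, sketched here with placeholder cluster and service names (note: 6AM PT is 13:00 UTC during daylight saving time, so adjust cron expressions for your region):

```shell
# Scale up for weekday peak hours (cron times are in UTC)
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/suna-cluster/suna-api \
  --scheduled-action-name peak-scale-up \
  --schedule "cron(0 13 ? * MON-FRI *)" \
  --scalable-target-action MinCapacity=3,MaxCapacity=10

# Return to off-peak capacity after 6PM PT (01:00 UTC the next day)
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/suna-cluster/suna-api \
  --scheduled-action-name off-peak-scale-down \
  --schedule "cron(0 1 ? * TUE-SAT *)" \
  --scalable-target-action MinCapacity=2,MaxCapacity=6
```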
AWS EKS Deployment
Kubernetes-based deployment for enterprise scale with advanced features.

Architecture
Prerequisites
- AWS account with EKS permissions
- kubectl installed
- Pulumi installed
- Container image in ECR/GHCR
Infrastructure Deployment
Configure EKS Settings
The infrastructure creates:
- EKS cluster (v1.31)
- Node group with c7i.2xlarge instances (8 vCPU, 16 GB RAM)
- Application Load Balancer
- Auto-scaling policies
- CloudWatch monitoring
Deploy Infrastructure
- EKS cluster: suna-eks
- Namespace: suna
- Deployment: suna-api
- Service: suna-api (ClusterIP)
- Ingress: suna-api (ALB)
- HPA: suna-api (4-15 pods)
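After the infrastructure is deployed, the resources can be verified with kubectl, using the cluster and namespace names listed above (the region flag is a placeholder):

```shell
# Point kubectl at the new cluster
aws eks update-kubeconfig --name suna-eks --region us-west-2

# Check the core resources in the suna namespace
kubectl get deployment,service,ingress,hpa -n suna

# Wait for the API deployment to become fully available
kubectl rollout status deployment/suna-api -n suna
```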
Kubernetes Configuration
Pod resources and probes:

- Startup: 10s initial delay, 12 attempts × 10s = 130s max startup time
- Readiness: Every 10s, removes from load balancer after 3 failures
- Liveness: Every 30s, restarts pod after 3 failures
EKS Operations
Deploy new version: roll the suna-api deployment to the updated container image.

AWS Lightsail Deployment
Cost-effective deployment for small workloads.

Create Lightsail Instance
- Go to AWS Lightsail console
- Create instance with Ubuntu 22.04 LTS
- Choose plan: 2GB RAM minimum, 4GB recommended
- Enable static IP
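With the instance running, deployment is plain SSH plus Docker Compose. A sketch, where the key path, IP address, and repository URL are all placeholders:

```shell
# Connect using the instance's SSH key and static IP (placeholders)
ssh -i ~/.ssh/lightsail.pem ubuntu@203.0.113.10

# On the instance: install Docker, fetch the project, and start the stack
curl -fsSL https://get.docker.com | sudo sh
git clone <your-repo-url> suna && cd suna
sudo docker compose up -d
```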
CI/CD Pipeline
Kortix includes GitHub Actions workflows for automated deployments.

Docker Build & Deploy
Workflow: .github/workflows/docker-build.yml
Triggers on push to PRODUCTION branch:
Push to Registry
Pushes to GitHub Container Registry with tags:
- :prod (latest production)
- :<commit-sha> (specific version)
Secrets Configuration
Configure GitHub repository secrets:

| Secret | Description |
|---|---|
| AWS_ACCESS_KEY_ID | AWS credentials for deployments |
| AWS_SECRET_ACCESS_KEY | AWS credentials |
| AWS_REGION | AWS region (e.g., us-west-2) |
| LIGHTSAIL_SSH_KEY | SSH key for Lightsail access |
| LIGHTSAIL_HOST | Lightsail instance IP |
Monitoring & Alerting
CloudWatch
AWS deployments include CloudWatch dashboards and alarms.

Alarms:

- CPU > 70% (warning) or > 85% (critical)
- Memory > 75% (warning) or > 90% (critical)
- Pod/Task count < 1 (critical)
- High latency (P99 > 2000ms)
- High error rate (> 5%)
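One of the alarms above could be created with the AWS CLI roughly as follows (alarm, cluster, and service names are illustrative; attach an SNS topic via --alarm-actions to get notifications):

```shell
# Critical CPU alarm for the ECS service (names are placeholders)
aws cloudwatch put-metric-alarm \
  --alarm-name suna-api-cpu-critical \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=suna-cluster Name=ServiceName,Value=suna-api \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 85 \
  --comparison-operator GreaterThanThreshold
```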
Dashboard: suna-api-prod
Better Stack
For EKS deployments, Better Stack provides:

- Real-time log aggregation
- Performance metrics
- Uptime monitoring
- Custom dashboards
Backup & Disaster Recovery
Database Backups
Supabase provides automatic backups:

- Point-in-time recovery
- Daily snapshots
- Configurable retention
Application Backups
For self-managed deployments, schedule your own application-level backups (volumes and configuration).

Disaster Recovery Plan
Execute Recovery
- For pod crashes: Auto-restart (automatic)
- For bad deployment: Rollback with kubectl rollout undo
- For node failure: Auto-scaling adds replacement
- For complete failure: Redeploy from infrastructure code
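The rollback path above can be sketched with kubectl, using the deployment and namespace names from the EKS section:

```shell
# Roll back the most recent rollout of the API deployment
kubectl rollout undo deployment/suna-api -n suna

# Watch the rollback complete
kubectl rollout status deployment/suna-api -n suna

# Inspect rollout history to confirm which revision is live
kubectl rollout history deployment/suna-api -n suna
```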
Performance Optimization
Scaling Recommendations
Development/Testing:

- 1-2 pods/tasks
- t3.medium or t4g.medium instances
- Single node

Small Production:

- 2-4 pods/tasks
- c7i.large or c6i.large instances
- 2 nodes (HA)

Production:

- 4-15 pods (auto-scaled)
- c7i.2xlarge instances
- 2-8 nodes (auto-scaled)
Cost Optimization
- Use Graviton (ARM) instances: 20% cheaper than x86
- Use Spot instances for non-critical workloads: 70% cheaper
- Enable auto-scaling: Scale down during off-peak hours
- Use CloudFront CDN for static assets
- Optimize LLM provider costs with model selection
Next Steps
Monitoring
Set up monitoring dashboards and alerts
Troubleshooting
Common deployment issues and solutions