The `deploy` section configures how services are deployed across your cluster: replica count, resource limits, placement, and update behavior.
## Deployment mode
Services can run in `replicated` or `global` mode:

```yaml
services:
  # Replicated: specific number of containers
  web:
    image: nginx
    deploy:
      mode: replicated
      replicas: 3

  # Global: one container per machine
  monitoring:
    image: prom/node-exporter
    deploy:
      mode: global
```
Modes:

- `replicated` - Run a specific number of replicas (default)
- `global` - Run one container on every machine in the cluster
References: pkg/api/service.go:19-21, pkg/client/compose/service.go:98-107
## Replicas
Set the number of container replicas for replicated services:
```yaml
services:
  api:
    image: myapi
    deploy:
      replicas: 5 # Run 5 containers
```
You can also use the top-level `scale` key:

```yaml
services:
  worker:
    image: worker
    scale: 10 # Shorthand for deploy.replicas
```
If both `scale` and `deploy.replicas` are set, `deploy.replicas` takes precedence.

The default is 1 replica if neither is specified.
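For example, a sketch of the precedence rule with both keys set on the same (hypothetical) service, where `deploy.replicas` wins:

```yaml
services:
  worker:
    image: worker
    scale: 10 # Ignored: deploy.replicas below takes precedence
    deploy:
      replicas: 4 # The service runs 4 replicas
```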
References: pkg/api/service.go:70-71, pkg/client/compose/service.go:93-104
## Placement constraints

Control which machines can run your service using the `x-machines` extension:
```yaml
services:
  gpu-worker:
    image: ml-worker
    x-machines:
      - gpu-machine-1
      - gpu-machine-2
    deploy:
      replicas: 4 # Distributed across the specified machines
```
You can specify machine names or IDs:
```yaml
services:
  db:
    image: postgres:16
    x-machines: db-server # Single machine (string)
    deploy:
      replicas: 1
```
Format:

- String: `x-machines: machine-name` (single machine)
- Array: `x-machines: [machine-1, machine-2]` (multiple machines)
If a service uses `x-machines` and has more than one replica, Uncloud spreads the replicas across the specified machines.
References: pkg/api/service.go:65-66, pkg/client/compose/service.go:76-78, test/e2e/compose_deploy_test.go:390-489
## Update configuration
Configure rolling update behavior:
```yaml
services:
  web:
    image: nginx
    deploy:
      replicas: 4
      update_config:
        order: start-first
        monitor: 30s
```
References: pkg/api/service.go:441-452
### Update order
Control the order of container replacement during updates:
```yaml
services:
  # Stateless service: minimize downtime
  api:
    image: myapi
    deploy:
      update_config:
        order: start-first # Start new before stopping old

  # Stateful service: prevent data corruption
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    deploy:
      update_config:
        order: stop-first # Stop old before starting new
```
Update orders:

- `start-first` - Start the new container before stopping the old one (default for stateless services)
  - Minimizes downtime
  - Briefly runs both containers
- `stop-first` - Stop the old container before starting the new one (default for services with volumes)
  - Prevents data corruption
  - Causes brief downtime
References: pkg/api/service.go:23-28, pkg/client/compose/service.go:109-120
### Monitor period

How long to monitor containers after starting them to ensure they're healthy:
```yaml
services:
  app:
    image: myapp
    healthcheck:
      # -f makes curl fail on HTTP error status codes
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 10s
    deploy:
      update_config:
        monitor: 60s # Wait up to 60s for the health check
```
The update considers a new container ready when either:

- the container passes its health check, or
- the monitor period elapses and the container is still running.

If `monitor` is not specified, a default monitor period is used.
References: pkg/api/service.go:447-451, pkg/client/compose/service.go:121-123
## Resource limits
Set CPU and memory limits for containers:
```yaml
services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '1.5' # 1.5 CPU cores
          memory: 1G # 1 gigabyte RAM
        reservations:
          memory: 512M # Reserve 512 megabytes
```
You can also use the top-level shorthand keys:

```yaml
services:
  app:
    image: myapp
    cpus: 0.5 # Shorthand for deploy.resources.limits.cpus
    mem_limit: 256M # Shorthand for deploy.resources.limits.memory
    mem_reservation: 128M # Shorthand for deploy.resources.reservations.memory
```
References: pkg/client/compose/service.go:175-228
### CPU limits
Limit CPU usage:
```yaml
services:
  cpu-intensive:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '2.0' # Use up to 2 CPU cores
```
Format: a string or number representing CPU cores (e.g., `'0.5'`, `'2.0'`)
### Memory limits
Limit memory usage:
```yaml
services:
  memory-intensive:
    image: myapp
    deploy:
      resources:
        limits:
          memory: 2G # 2 gigabytes maximum
        reservations:
          memory: 1G # 1 gigabyte guaranteed
```
Formats: `100M`, `1G`, `512m`, etc.
If a container exceeds its memory limit, Docker kills and restarts it.
### Device reservations
Reserve GPU and other devices:
```yaml
services:
  ml-worker:
    image: tensorflow/tensorflow:latest-gpu
    deploy:
      resources:
        reservations:
          devices:
            # Reserve by count or by specific IDs (alternatives shown)
            - driver: nvidia
              count: 1 # Number of GPUs
              capabilities: [gpu]
            - driver: nvidia
              device_ids: ['0', '2'] # Specific GPU IDs
              capabilities: [gpu]
```
References: pkg/client/compose/service.go:204-226
## Ulimits
Set user limits for containers:
```yaml
services:
  app:
    image: myapp
    ulimits:
      nofile: # File descriptors
        soft: 20000
        hard: 40000
      nproc: 65535 # Processes
```
References: pkg/client/compose/service.go:370-395
## Restart policy

Uncloud automatically restarts containers with the `unless-stopped` policy. Custom restart policies are not currently supported.
All containers restart automatically unless you explicitly stop the service.
References: website/docs/8-compose-file-reference/1-support-matrix.md:50
## Complete example
```yaml
services:
  # High-availability web service
  web:
    image: nginx:alpine
    x-ports:
      - example.com:80/https
    deploy:
      mode: replicated
      replicas: 5
      update_config:
        order: start-first # Minimize downtime
        monitor: 30s
      resources:
        limits:
          cpus: '0.25'
          memory: 128M
        reservations:
          memory: 64M

  # API service with placement constraints
  api:
    image: myapi:latest
    x-machines:
      - api-server-1
      - api-server-2
    deploy:
      mode: replicated
      replicas: 6 # 3 per machine
      update_config:
        order: start-first
        monitor: 60s
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          memory: 512M

  # Stateful database
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=secret
    x-machines: db-server # Pin to a specific machine
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: stop-first # Prevent data corruption
        monitor: 120s
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          memory: 2G

  # Background worker
  worker:
    image: worker:latest
    deploy:
      mode: replicated
      replicas: 10
      resources:
        limits:
          cpus: '0.5'
          memory: 512M

  # GPU machine learning worker
  ml-worker:
    image: tensorflow/tensorflow:latest-gpu
    x-machines:
      - gpu-machine
    deploy:
      mode: replicated
      replicas: 2
      resources:
        limits:
          cpus: '4.0'
          memory: 8G
        reservations:
          memory: 4G
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  # Monitoring on every machine
  node-exporter:
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    deploy:
      mode: global # One per machine
      resources:
        limits:
          cpus: '0.1'
          memory: 64M

volumes:
  db-data:
```