The deploy section configures how services are deployed across your cluster. This includes replica count, resource limits, placement, and update behavior.

Deployment mode

Services can run in replicated or global mode:
services:
  # Replicated: specific number of containers
  web:
    image: nginx
    deploy:
      mode: replicated
      replicas: 3

  # Global: one container per machine
  monitoring:
    image: prom/node-exporter
    deploy:
      mode: global
Modes:
  • replicated - Run a specific number of replicas (default)
  • global - Run one container on every machine in the cluster
References: pkg/api/service.go:19-21, pkg/client/compose/service.go:98-107

Replicas

Set the number of container replicas for replicated services:
services:
  api:
    image: myapi
    deploy:
      replicas: 5  # Run 5 containers
You can also use the top-level scale key:
services:
  worker:
    image: worker
    scale: 10  # Shorthand for deploy.replicas
If both scale and deploy.replicas are set, deploy.replicas takes precedence.
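For instance, in this sketch the service runs with 3 replicas, because deploy.replicas overrides scale:

```yaml
services:
  worker:
    image: worker
    scale: 10          # ignored when deploy.replicas is also set
    deploy:
      replicas: 3      # takes precedence: the service runs 3 replicas
```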
Default is 1 replica if not specified. References: pkg/api/service.go:70-71, pkg/client/compose/service.go:93-104

Placement constraints

Control which machines can run your service using the x-machines extension:
services:
  gpu-worker:
    image: ml-worker
    x-machines:
      - gpu-machine-1
      - gpu-machine-2
    deploy:
      replicas: 4  # Distributed across specified machines
You can specify machine names or IDs:
services:
  db:
    image: postgres:16
    x-machines: db-server  # Single machine (string)
    deploy:
      replicas: 1
Format:
  • String: x-machines: machine-name (single machine)
  • Array: x-machines: [machine-1, machine-2] (multiple machines)
If a service has replicas > 1 and uses x-machines, Uncloud spreads replicas across the specified machines.
References: pkg/api/service.go:65-66, pkg/client/compose/service.go:76-78, test/e2e/compose_deploy_test.go:390-489

Update configuration

Configure rolling update behavior:
services:
  web:
    image: nginx
    deploy:
      replicas: 4
      update_config:
        order: start-first
        monitor: 30s
References: pkg/api/service.go:441-452

Update order

Control the order of container replacement during updates:
services:
  # Stateless service: minimize downtime
  api:
    image: myapi
    deploy:
      update_config:
        order: start-first  # Start new before stopping old

  # Stateful service: prevent data corruption
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    deploy:
      update_config:
        order: stop-first  # Stop old before starting new
Update orders:
  • start-first - Start new container before stopping old (default for stateless)
    • Minimizes downtime
    • Briefly runs both containers
  • stop-first - Stop old container before starting new (default for services with volumes)
    • Prevents data corruption
    • Causes brief downtime
References: pkg/api/service.go:23-28, pkg/client/compose/service.go:109-120

Monitor period

How long to monitor containers after starting to ensure they’re healthy:
services:
  app:
    image: myapp
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 10s
    deploy:
      update_config:
        monitor: 60s  # Wait up to 60s for health check
During an update, a new container is considered ready once either:
  1. It passes its health check, or
  2. The monitor period elapses and the container is still running
If monitor is not specified, Uncloud falls back to a default monitor period. References: pkg/api/service.go:447-451, pkg/client/compose/service.go:121-123

Resource limits

Set CPU and memory limits for containers:
services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '1.5'        # 1.5 CPU cores
          memory: 1G         # 1 gigabyte RAM
        reservations:
          memory: 512M       # Reserve 512 megabytes
You can also use top-level keys:
services:
  app:
    image: myapp
    cpus: 0.5           # Shorthand for deploy.resources.limits.cpus
    mem_limit: 256M     # Shorthand for deploy.resources.limits.memory
    mem_reservation: 128M  # Shorthand for deploy.resources.reservations.memory
References: pkg/client/compose/service.go:175-228

CPU limits

Limit CPU usage:
services:
  cpu-intensive:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '2.0'  # Use up to 2 CPU cores
Format: String or number representing CPU cores (e.g., '0.5', '2.0')
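Quoted and unquoted values are both accepted; quoting keeps the value a string, which avoids YAML float normalization (e.g. 2.0 parsing as the number 2.0 rather than the literal text):

```yaml
services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '0.5'   # string form, always unambiguous
          # cpus: 0.5   # numeric form is also valid YAML
```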

Memory limits

Limit memory usage:
services:
  memory-intensive:
    image: myapp
    deploy:
      resources:
        limits:
          memory: 2G         # 2 gigabytes maximum
        reservations:
          memory: 1G         # 1 gigabyte guaranteed
Formats: a number with a unit suffix, e.g. 100M, 512m, 1G; upper- and lowercase suffixes are both accepted.
If a container exceeds its memory limit, Docker kills and restarts it.

Device reservations

Reserve GPU and other devices:
services:
  ml-worker:
    image: tensorflow/tensorflow:latest-gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1          # Number of GPUs
              capabilities: [gpu]
            
            - driver: nvidia
              device_ids: ['0', '2']  # Specific GPU IDs
              capabilities: [gpu]
References: pkg/client/compose/service.go:204-226

Ulimits

Set user limits for containers:
services:
  app:
    image: myapp
    ulimits:
      nofile:           # Max open file descriptors (long syntax: soft and hard set separately)
        soft: 20000
        hard: 40000
      nproc: 65535      # Max processes (short syntax sets soft and hard together)
References: pkg/client/compose/service.go:370-395

Restart policy

Uncloud automatically restarts containers with the unless-stopped policy. Custom restart policies are not currently supported.
All containers restart automatically unless you explicitly stop the service.
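Because the policy is fixed, a compose file for Uncloud can simply omit the restart key. A minimal sketch:

```yaml
services:
  app:
    image: myapp
    # No restart key needed: Uncloud applies unless-stopped to every container.
    # restart: always   # custom policies like this are not supported
```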
References: website/docs/8-compose-file-reference/1-support-matrix.md:50

Complete example

services:
  # High-availability web service
  web:
    image: nginx:alpine
    x-ports:
      - example.com:80/https
    deploy:
      mode: replicated
      replicas: 5
      update_config:
        order: start-first  # Minimize downtime
        monitor: 30s
      resources:
        limits:
          cpus: '0.25'
          memory: 128M
        reservations:
          memory: 64M

  # API service with placement constraints
  api:
    image: myapi:latest
    x-machines:
      - api-server-1
      - api-server-2
    deploy:
      mode: replicated
      replicas: 6  # 3 per machine
      update_config:
        order: start-first
        monitor: 60s
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          memory: 512M

  # Stateful database
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=secret
    x-machines: db-server  # Pin to specific machine
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: stop-first  # Prevent data corruption
        monitor: 120s
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          memory: 2G

  # Background worker
  worker:
    image: worker:latest
    deploy:
      mode: replicated
      replicas: 10
      resources:
        limits:
          cpus: '0.5'
          memory: 512M

  # GPU machine learning worker
  ml-worker:
    image: tensorflow/tensorflow:latest-gpu
    x-machines:
      - gpu-machine
    deploy:
      mode: replicated
      replicas: 2
      resources:
        limits:
          cpus: '4.0'
          memory: 8G
        reservations:
          memory: 4G
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  # Monitoring on every machine
  node-exporter:
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    deploy:
      mode: global  # One per machine
      resources:
        limits:
          cpus: '0.1'
          memory: 64M

volumes:
  db-data:
