Uncloud updates your services without downtime by replacing containers one at a time. Each new container must pass health checks before the deployment continues, ensuring your service stays available throughout the update.

How rolling updates work

When you run uc deploy, Uncloud updates containers sequentially:
  1. Start new container: Launch a container with the new configuration or image.
  2. Wait for health: Monitor the container for 5 seconds (or until it becomes healthy).
  3. Stop old container: Once the new container is healthy, stop and remove the old one.
  4. Repeat: Move to the next container and repeat the process.
For a service with 3 replicas using the default start-first order:
  1. Start new container 1, wait until healthy
  2. Stop and remove old container 1
  3. Start new container 2, wait until healthy
  4. Stop and remove old container 2
  5. Start new container 3, wait until healthy
  6. Stop and remove old container 3
At every step, at least 3 containers are serving traffic.
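
The sequence above can be sketched as a small simulation. This is an illustrative model, not Uncloud's implementation: the `rolling_update` function and its step labels are hypothetical, and it assumes every new container passes monitoring.

```python
def rolling_update(replicas: int, order: str = "start-first") -> list[tuple[str, int]]:
    """Return (action, containers running after the action) for each step."""
    if order not in ("start-first", "stop-first"):
        raise ValueError(f"unknown order: {order}")
    running = replicas
    steps = []
    for i in range(1, replicas + 1):
        if order == "start-first":
            running += 1  # new container comes up next to the old one
            steps.append((f"start new container {i}", running))
            running -= 1  # old container is stopped and removed
            steps.append((f"stop old container {i}", running))
        else:
            running -= 1  # old container goes away first
            steps.append((f"stop old container {i}", running))
            running += 1  # replacement starts afterwards
            steps.append((f"start new container {i}", running))
    return steps


for action, running in rolling_update(3):
    print(f"{action}: {running} running")
```

With start-first the running count moves between 3 and 4 and never drops below the replica count. With stop-first it dips to 2 during each replacement.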

Update order

The update order controls whether the new container starts before or after stopping the old one.

start-first (default)

Start the new container before stopping the old one:
  • Pros: Zero downtime, service always available
  • Cons: Briefly runs both containers simultaneously
  • Best for: Stateless services (web apps, APIs, workers)
services:
  web:
    image: myapp:v2
    deploy:
      update_config:
        order: start-first

stop-first

Stop the old container before starting the new one:
  • Pros: Only one container runs at a time, prevents data corruption
  • Cons: Brief downtime during the transition
  • Best for: Stateful services (databases, services with volumes)
services:
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    deploy:
      update_config:
        order: stop-first

volumes:
  db-data:

Automatic order selection

Uncloud automatically chooses the update order based on your service configuration:
Scenario                   Order         Reason
Default                    start-first   Minimize downtime
Host port conflicts        stop-first    Old container must free the port
Single replica + volume    stop-first    Prevent concurrent writes to same volume
Multi-replica + volume     start-first   Concurrent access already happening
The deployment plan shows which order will be used for each container.
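
The selection rules in the table can be expressed as a short decision function. This is a sketch of the table only, not Uncloud's code: the function name and parameters are hypothetical. It also makes two assumptions that the table does not state: an explicit `order` always wins, and the host port rule is checked before the volume rules.

```python
from typing import Optional


def choose_update_order(
    explicit_order: Optional[str],
    replicas: int,
    has_volume: bool,
    has_host_port: bool,
) -> str:
    """Pick an update order following the table above."""
    if explicit_order:
        # Assumption: an order set in update_config overrides automatic selection.
        return explicit_order
    if has_host_port:
        # The old container must free the host port first.
        return "stop-first"
    if has_volume and replicas == 1:
        # Avoid two containers writing to the same volume.
        return "stop-first"
    # Default, including multi-replica services with volumes.
    return "start-first"


print(choose_update_order(None, replicas=1, has_volume=True, has_host_port=False))
```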

Health monitoring

After starting each new container, Uncloud monitors it for failures.

Default monitoring period

By default, Uncloud monitors each new container for up to 5 seconds and checks that the container:
  • Stays running (doesn’t crash)
  • Doesn’t restart repeatedly
  • Becomes healthy if it has a health check
If the container fails during this period, Uncloud rolls back and stops the deployment.

Change monitoring period

Adjust the monitoring period for your service:
services:
  app:
    image: myapp:latest
    deploy:
      update_config:
        # Monitor each new container for up to 10 seconds
        monitor: 10s
Use a longer period if your app takes time to initialize.

Skip monitoring

Set to 0s to skip monitoring entirely:
services:
  app:
    image: myapp:latest
    deploy:
      update_config:
        monitor: 0s  # Skip health monitoring
Or use the --skip-health flag:
uc deploy --skip-health
With monitoring skipped, Uncloud won’t detect containers that crash on startup. Use this only for emergency deployments when you’re confident the new version works.

Global default

Change the default monitoring period for all services:
export UNCLOUD_HEALTH_MONITOR_PERIOD=10s
uc deploy
Per-service monitor settings override the global default.
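
The precedence described here (per-service setting, then the environment variable, then the 5 second default) could be resolved as in the sketch below. The helper names are hypothetical, and the duration parser accepts only simple values such as `10s` or `500ms`. Uncloud's own parsing may accept more formats.

```python
import os
import re
from typing import Mapping, Optional

DEFAULT_MONITOR_SECONDS = 5.0
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Parse a simple duration such as '10s', '500ms' or '2m' into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def effective_monitor_period(
    service_monitor: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> float:
    """Per-service value first, then the global variable, then the default."""
    if service_monitor is not None:
        return parse_duration(service_monitor)  # "0s" means skip monitoring
    global_default = env.get("UNCLOUD_HEALTH_MONITOR_PERIOD")
    if global_default:
        return parse_duration(global_default)
    return DEFAULT_MONITOR_SECONDS
```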

Health checks

Configure health checks to make deployments safer and faster.

Why use health checks

  • Faster deployments: Container marked healthy as soon as the check passes
  • Safer rollouts: Detect broken deployments before they affect traffic
  • Automatic recovery: Unhealthy containers removed from load balancing

Configure in Compose file

services:
  app:
    image: myapp:latest
    healthcheck:
      # Command to check health
      test: curl -f http://localhost:8000/health || exit 1
      # Check every 5 seconds
      interval: 5s
      # Timeout for each check
      timeout: 3s
      # Number of consecutive failures before unhealthy
      retries: 3
      # Grace period after start: failed checks in the first 10s don't count toward retries
      start_period: 10s
      # Check every 1s during start_period
      start_interval: 1s

Configure in Dockerfile

FROM node:20-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .

HEALTHCHECK --interval=10s --timeout=3s --start-period=30s \
  CMD node healthcheck.js || exit 1

CMD ["node", "server.js"]
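
The Dockerfile above calls a healthcheck.js script that this page does not show. As a rough illustration of what such a script does, here is an equivalent sketch in Python for images that ship a Python interpreter but no curl. The file name healthcheck.py and the default URL are assumptions.

```python
import http.client
import sys
import urllib.error
import urllib.request


def check(url: str, timeout: float = 3.0) -> int:
    """Return 0 if the endpoint answers with a 2xx status, 1 otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, 4xx/5xx response, or malformed URL.
        return 1


def main(argv: list[str]) -> int:
    # Hypothetical default endpoint; pass the real URL as the first argument.
    url = argv[1] if len(argv) > 1 else "http://localhost:8000/health"
    return check(url)


# In a standalone healthcheck.py, finish with: sys.exit(main(sys.argv))
```

The exit code follows the Docker convention: 0 marks the check as passed and 1 marks it as failed.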

Health check behavior during deployment

  1. Container starts
  2. Health check runs according to interval (or start_interval during start_period)
  3. If container becomes healthy before monitoring period ends, deployment succeeds early
  4. If container is unhealthy after monitoring period, Uncloud rolls back
  5. Transient unhealthy states during monitoring are tolerated
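
The behavior in this list can be sketched as a polling loop. This is a hypothetical model, not Uncloud's implementation: the state names, the `monitor_container` function, and the poll interval are illustrative. It assumes that a container still in the "starting" state when the period ends counts as a failure.

```python
import time
from typing import Callable


def monitor_container(
    get_state: Callable[[], str],
    period: float,
    poll: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the deployment may continue, False if it should roll back.

    get_state returns one of: "healthy", "unhealthy", "starting",
    "running" (container has no health check) or "exited".
    """
    if period <= 0:
        return True  # monitoring skipped (monitor: 0s)
    deadline = clock() + period
    state = get_state()
    while True:
        if state == "healthy":
            return True  # succeed early, before the period ends
        if state == "exited":
            return False  # container crashed
        if clock() >= deadline:
            break
        sleep(poll)  # an "unhealthy" state seen here is tolerated as transient
        state = get_state()
    # Period over: only a running container without a health check passes.
    return state == "running"
```

Passing `clock` and `sleep` as parameters keeps the loop testable without real waiting.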

Health check formats

Three formats are supported.
Shell command:
healthcheck:
  test: curl -f http://localhost/health || exit 1
Exec array:
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost/health"]
Disable inherited health check:
healthcheck:
  disable: true

Example health check endpoints

Node.js (Express):
app.get('/health', (req, res) => {
  // Check database connection
  if (!db.isConnected()) {
    return res.status(503).send('Database unavailable');
  }
  res.status(200).send('OK');
});
Python (FastAPI):
@app.get("/health")
async def health():
    # Check dependencies
    if not await redis.ping():
        raise HTTPException(status_code=503, detail="Redis unavailable")
    return {"status": "healthy"}
Go:
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
    // Check database connection
    if err := db.Ping(); err != nil {
        w.WriteHeader(http.StatusServiceUnavailable)
        return
    }
    w.WriteHeader(http.StatusOK)
    w.Write([]byte("OK"))
})

Rollback on failure

If a new container fails health monitoring, Uncloud automatically rolls back:
  1. Stop new container: The failed container is stopped but kept for inspection.
  2. Restart old container (stop-first only): For stop-first order, the old container is restarted.
  3. Halt deployment: The deployment stops. Remaining containers keep their current version.
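
The rollback steps can be traced with a small simulation. This is an illustrative model with hypothetical names (`deploy`, `passes_monitoring`), not Uncloud's code. It records the actions taken and the version each container ends up on.

```python
from typing import Callable


def deploy(
    containers: list[str],
    order: str,
    passes_monitoring: Callable[[str], bool],
) -> tuple[dict[str, str], list[str]]:
    """Simulate a rolling update; return (version per container, action log)."""
    versions = {name: "old" for name in containers}
    log: list[str] = []
    for name in containers:
        if order == "stop-first":
            log.append(f"stop old {name}")
        log.append(f"start new {name}")
        if not passes_monitoring(name):
            log.append(f"stop new {name} (kept for inspection)")
            if order == "stop-first":
                log.append(f"restart old {name}")
            log.append("halt deployment")
            break  # remaining containers keep their current version
        if order == "start-first":
            log.append(f"stop and remove old {name}")
        versions[name] = "new"
    return versions, log


versions, log = deploy(["c1", "c2", "c3"], "stop-first", lambda name: name != "c2")
print(versions)
print(log)
```

In this run c1 ends on the new version, while c2 (restarted from its old container) and the untouched c3 stay on the old one.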

Inspect failed containers

After a failed deployment:
# View all containers including stopped ones
uc ps -a

# Check logs of failed container
uc logs <container-id>

# Inspect container state
uc inspect <container-id>

Retry after failure

Fix the issue and run uc deploy again. Uncloud skips successfully updated containers and only redeploys the remaining ones.
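
Conceptually, a retry only touches containers that do not yet match the desired configuration. The sketch below models that with a comparison on the image tag alone, which is a simplifying assumption: the function name is hypothetical, and Uncloud presumably compares more of the service spec than the image.

```python
def containers_to_update(current_images: dict[str, str], desired_image: str) -> list[str]:
    """Return IDs of containers whose image differs from the desired one."""
    return [
        container_id
        for container_id, image in current_images.items()
        if image != desired_image
    ]


# State after a deployment that halted on the second container.
current = {"c1": "myapp:v2", "c2": "myapp:v1", "c3": "myapp:v1"}
print(containers_to_update(current, "myapp:v2"))
```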

Update strategies for different scenarios

Stateless web application

services:
  web:
    image: myapp:v2
    scale: 5
    x-ports:
      - app.example.com:8000/https
    healthcheck:
      test: curl -f http://localhost:8000/health
      interval: 5s
      retries: 3
    # start-first is automatic (default)
Result: Zero downtime, all 5 replicas updated one by one.

Stateful database

services:
  postgres:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data
    deploy:
      update_config:
        order: stop-first  # Prevent data corruption

volumes:
  db-data:
Result: Brief downtime while old container stops and new one starts.

Single replica with volume

services:
  app:
    image: myapp:v2
    scale: 1
    volumes:
      - app-data:/data
    # stop-first is automatic (single replica + volume)

volumes:
  app-data:
Result: Automatic stop-first to prevent concurrent writes to volume.

Multi-replica with volume (safe concurrent access)

services:
  app:
    image: myapp:v2
    scale: 3
    volumes:
      - shared-data:/data
    deploy:
      update_config:
        order: start-first  # Optional: same as the automatic choice for multi-replica + volume

volumes:
  shared-data:
Result: Zero downtime if your app handles concurrent access safely (e.g., SQLite with WAL mode).

Host port binding

services:
  metrics:
    image: prometheus:latest
    x-ports:
      - 9090:9090/tcp@host
    # stop-first is automatic (port conflict)
Result: Automatic stop-first because old container must free port 9090 before new one can bind.

ServiceSpec and UpdateConfig reference

The key structures from the Uncloud source code:

ServiceSpec fields

type ServiceSpec struct {
    Name             string
    Mode             string           // "replicated" or "global"
    Replicas         uint             // For replicated mode
    Container        ContainerSpec
    Ports            []PortSpec
    Volumes          []VolumeSpec
    UpdateConfig     UpdateConfig
    StopGracePeriod  *time.Duration   // Default: 10 seconds
    // ... other fields
}

UpdateConfig fields

type UpdateConfig struct {
    // Order: "start-first" or "stop-first"
    // Empty means automatic selection based on service characteristics
    Order string `json:",omitempty"`

    // MonitorPeriod: How long to wait after starting a container
    // before checking it's still running
    // nil = use default (5s), 0 = skip monitoring
    MonitorPeriod *time.Duration `json:",omitempty"`
}

Stop grace period

Time to wait after SIGTERM before sending SIGKILL:
services:
  app:
    image: myapp:latest
    stop_grace_period: 30s  # Wait 30s for graceful shutdown
Default is 10 seconds if not specified.
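
The grace period only helps if the application reacts to SIGTERM. A minimal sketch of a worker that finishes its current item and then exits when the signal arrives is shown below. The `run_worker` function and its parameters are illustrative names, not part of Uncloud.

```python
import signal
import threading
from typing import Callable

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame) -> None:
    """Record the request; the main loop exits at the next safe point."""
    shutdown_requested.set()


# Must be registered from the main thread.
signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_one_item: Callable[[], None], idle_wait: float = 0.5) -> None:
    """Process items until SIGTERM is received, never stopping mid-item."""
    while not shutdown_requested.is_set():
        process_one_item()
        # Sleep between items, waking immediately when shutdown is requested.
        shutdown_requested.wait(idle_wait)
```

If a single item can take longer than the grace period, the process is killed with SIGKILL before it finishes, so raise stop_grace_period or split the work into smaller items.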

Advanced scenarios

Canary deployments

Deploy to a subset of machines first:
services:
  web-canary:
    image: myapp:v2
    scale: 1
    x-machines:
      - canary-server

  web:
    image: myapp:v1
    scale: 5
    x-machines:
      - server-1
      - server-2
      - server-3
      - server-4
      - server-5
Monitor the canary, then update the main service if successful.

Forced recreation

Force container recreation even if nothing changed:
uc deploy --recreate
Useful for:
  • Picking up external volume changes
  • Resetting container state
  • Testing deployment process

Emergency rollback

Quickly roll back to the previous version:
# Deploy a Compose file that pins the previous image tag
uc deploy -f compose.v1.yaml --skip-health
The --skip-health flag speeds up the rollback by skipping health monitoring, so a container that crashes on startup goes undetected.

Monitoring deployments

Watch deployment progress

uc deploy shows real-time progress:
Deployment plan
- Deploy service [name=web]
  - Replace container [id=a1b2c3, machine=server-1, order=start-first]
  - Replace container [id=d4e5f6, machine=server-2, order=start-first]
  - Replace container [id=g7h8i9, machine=server-3, order=start-first]

Do you want to continue? (y/N): y

Deploying services ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:45

Check service health after deployment

# List all services
uc ls

# Check specific service
uc inspect web

# View container logs
uc logs web

# Monitor in real-time
uc logs -f web

Verify all containers are healthy

uc ps
Look for the “healthy” status in the output.

Best practices

Health checks make deployments safer and faster. Every service should have one:
healthcheck:
  test: curl -f http://localhost/health || exit 1
  interval: 10s
  timeout: 3s
  retries: 3
  start_period: 30s
Let Uncloud choose the update order automatically unless you have specific requirements:
  • Stateless services: start-first (automatic)
  • Stateful services: stop-first (automatic for a single replica with a volume)
  • Host ports: stop-first (automatic)
Always test deployments in a staging environment before production:
# Deploy to staging
uc deploy -f compose.staging.yaml

# Verify everything works

# Deploy to production
uc deploy -f compose.prod.yaml
Watch logs and metrics during deployment:
# Terminal 1: Deploy
uc deploy

# Terminal 2: Watch logs
uc logs -f web
Keep previous versions available for quick rollback:
# Tag images with version numbers
services:
  web:
    image: myapp:v1.2.3

Next steps

  • Health Checks: Deep dive into container health checks
  • Scaling: Scale services horizontally
  • Docker Compose: Learn about Compose file features
  • Deploying Services: Deploy with uc run
