Health Checks

Health checks ensure that your services are running correctly and ready to handle requests. Convox supports multiple types of health checks including readiness probes, liveness probes, startup probes, and gRPC health checks.

Readiness Health Checks

Readiness health checks determine whether a service is ready to receive traffic. Services behind a load balancer require health checks to determine if a process can handle requests.

Processes that fail two consecutive health checks are assumed dead and will be terminated and replaced.

Simple Health Check

services:
  web:
    build: .
    port: 3000
    health: /check

health

string

Specifying health as a string sets the health check path and uses the default values for all other options.

Advanced Health Check

services:
  web:
    build: .
    port: 3000
    health:
      grace: 10
      interval: 5
      path: /health
      timeout: 3

Health Check Parameters

health.path

string

default:"/"

The HTTP endpoint that will be requested for health checks

health.grace

number

default:"5"

The amount of time in seconds to wait for a process to boot before beginning health checks

health.interval

number

default:"5"

The number of seconds between health checks

health.timeout

number

default:"4"

The number of seconds to wait for a valid response (200-399 status code)

health.disable

boolean

default:"false"

Disable health checks for this service

Health checks must return a valid HTTP response code (200-399) within the configured timeout.
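As an illustration of the rule above, here is a minimal sketch of a readiness endpoint in Go. The `ready` flag, the function names, and the choice of 503 for "not ready" are assumptions made for this example, not part of Convox; the only behavior taken from this page is that a status in the 200-399 range passes. The `main` function exercises the handler in-process so the sketch runs without opening a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is flipped to true once the application can serve traffic.
// When it is set is application-specific (hypothetical here).
// atomic.Bool requires Go 1.19 or later.
var ready atomic.Bool

// readinessStatus maps the readiness flag to an HTTP status code.
// 200 falls inside the passing 200-399 range; 503 falls outside it.
func readinessStatus(isReady bool) int {
	if isReady {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// healthHandler would be served at the path configured as health.path.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(readinessStatus(ready.Load()))
}

func main() {
	// Exercise the handler in-process. A real service would instead call
	// http.HandleFunc("/health", healthHandler) and http.ListenAndServe.
	for _, state := range []bool{false, true} {
		ready.Store(state)
		rec := httptest.NewRecorder()
		healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
		fmt.Printf("ready=%v -> HTTP %d\n", state, rec.Code)
	}
}
```

Returning a non-passing status while the process is still initializing keeps traffic away from it until it reports ready.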

Liveness Checks

Liveness checks monitor the ongoing health of running processes. While readiness probes determine when a service is ready to receive traffic, liveness checks determine when a service should be restarted if it becomes unresponsive.

When a liveness check fails, Kubernetes will restart the container, which can help recover from deadlocks, memory leaks, or other issues.

Liveness Check Configuration

services:
  web:
    build: .
    port: 3000
    liveness:
      path: /liveness/check
      grace: 15
      interval: 5
      timeout: 3
      successThreshold: 1
      failureThreshold: 3

Liveness Check Parameters

liveness.path

string

required

The HTTP endpoint that will be requested for liveness checks

liveness.grace

number

default:"10"

The amount of time in seconds to wait before beginning liveness checks

liveness.interval

number

default:"5"

The number of seconds between liveness checks

liveness.timeout

number

default:"5"

The number of seconds to wait for a successful response

liveness.successThreshold

number

default:"1"

The number of consecutive successful checks required

liveness.failureThreshold

number

default:"3"

The number of consecutive failed checks before restarting the container
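The threshold parameters above count consecutive results, so a single success resets the failure count. The sketch below models only that counting rule; `firstRestart` is a hypothetical helper written for this page, not Convox or Kubernetes code.

```go
package main

import "fmt"

// firstRestart returns the index of the check that would trigger a restart,
// or -1 if the failure threshold is never reached.
func firstRestart(results []bool, failureThreshold int) int {
	failures := 0
	for i, ok := range results {
		if ok {
			failures = 0 // a success resets the consecutive-failure count
			continue
		}
		failures++
		if failures >= failureThreshold {
			return i
		}
	}
	return -1
}

func main() {
	// Two failures, a success, then three failures in a row.
	results := []bool{true, false, false, true, false, false, false}
	fmt.Println(firstRestart(results, 3)) // prints 6
}
```

With the default failureThreshold of 3, intermittent single failures never add up to a restart; only three failures in a row do.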

Liveness Check Use Cases

Deadlock Detection

services:
  worker:
    build: .
    liveness:
      path: /worker/health
      grace: 30
      interval: 10
      failureThreshold: 5

Detects and recovers from deadlocked worker processes.

Memory Monitoring

services:
  processor:
    build: .
    liveness:
      path: /memory-check
      grace: 45
      interval: 15
      timeout: 10
      failureThreshold: 3

Monitors memory-intensive applications and restarts if unhealthy.

Liveness checks should be configured conservatively to avoid unnecessary restarts. False positives can cause service disruption.

Startup Probes

Startup probes check if an application has successfully started before allowing readiness and liveness probes to take effect. This is particularly useful for applications with long or variable startup times.

When a startup probe is configured, all other probes are disabled until it succeeds.

TCP Startup Probe

services:
  web:
    build: .
    port: 3000
    startupProbe:
      tcpSocketPort: 3000
      grace: 30
      interval: 10
      timeout: 5
      successThreshold: 1
      failureThreshold: 30

startupProbe.tcpSocketPort

number

required

Required for TCP startup probes. The TCP port to check for startup success

HTTP Startup Probe

services:
  api:
    build: .
    port: 8080
    startupProbe:
      path: /startup
      grace: 10
      interval: 5
      failureThreshold: 40

startupProbe.path

string

required

Required for HTTP startup probes. The HTTP endpoint to check for startup success

Startup Probe Parameters

startupProbe.grace

number

default:"0"

The number of seconds to wait before starting startup checks

startupProbe.interval

number

default:"10"

The number of seconds between startup probe checks

startupProbe.timeout

number

default:"1"

The number of seconds to wait for a successful response

startupProbe.successThreshold

number

default:"1"

The number of consecutive successful checks required

startupProbe.failureThreshold

number

default:"3"

The number of consecutive failed checks before restarting the container

Startup Probe Use Cases

Database Migrations

Applications that run database migrations on startup

Cache Warming

Services that need to populate caches before serving traffic

Large Applications

Applications with significant initialization requirements

Configuration Loading

Services that load extensive configuration on startup

Complete Example with All Probes

services:
  analytics:
    build: .
    port: 5000
    startupProbe:
      tcpSocketPort: 5000
      grace: 60
      interval: 15
      failureThreshold: 20  # Allows up to 5 minutes for startup
    health:
      path: /health
      interval: 5
    liveness:
      path: /live
      interval: 10
      failureThreshold: 3

In this example, the startup probe waits out a 60-second grace period and then allows roughly 5 more minutes (15s × 20 failed checks) for the application to start. Once it succeeds, health and liveness checks begin.

Startup probes require version 3.19.7 or later.

gRPC Health Checks

For services using gRPC, Convox provides support for gRPC health checks through the gRPC health checking protocol.

Basic gRPC Configuration

services:
  api:
    build: .
    port: grpc:50051
    grpcHealthEnabled: true

port

string

Must specify grpc: prefix for gRPC services

grpcHealthEnabled

boolean

default:"false"

Enables gRPC health checking protocol

Advanced gRPC Configuration

services:
  api:
    build: .
    port: grpc:50051
    grpcHealthEnabled: true
    health:
      grace: 20
      interval: 5
      path: /
      timeout: 2

The path attribute specifies the service name to check within your gRPC health implementation.

Implementation Requirements

Services must implement the gRPC Health Checking Protocol.

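package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	// Listen on the port configured as grpc:50051 for the service
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	server := grpc.NewServer()

	// Register your service
	// pb.RegisterYourServiceServer(server, &yourServiceImpl{})

	// Register the health service
	healthServer := health.NewServer()
	healthpb.RegisterHealthServer(server, healthServer)

	// Report the overall server (empty service name) as serving
	healthServer.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)

	// Start serving; this blocks until the server stops
	if err := server.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}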

Probe Behavior

When grpcHealthEnabled is true, Convox configures:

Readiness Probe

Determines whether the service is ready to receive traffic

Liveness Probe

Monitors ongoing health and initiates restarts if necessary

Health Check Best Practices

Lightweight Checks

Keep health check endpoints fast and simple - avoid heavy operations

Dependency Checks

Include critical dependency checks (database, cache) in health endpoints

Separate Endpoints

Use different endpoints for readiness, liveness, and startup checks

Conservative Timeouts

Set appropriate grace periods and failure thresholds to avoid false positives

Health Check Timing

Fast Startup

services:
  web:
    health:
      grace: 5
      interval: 5
      timeout: 3
    liveness:
      grace: 10
      interval: 10

For applications that start quickly (< 10 seconds)

Normal Startup

services:
  web:
    health:
      grace: 30
      interval: 5
      timeout: 5
    liveness:
      grace: 45
      interval: 15

For typical applications (10-60 seconds)

Slow Startup

services:
  web:
    startupProbe:
      tcpSocketPort: 3000
      grace: 60
      interval: 15
      failureThreshold: 30
    health:
      grace: 10
      interval: 5
    liveness:
      grace: 20
      interval: 15

For applications with long initialization (> 60 seconds)

Troubleshooting

Service keeps restarting

Common causes:

Health check timeout too short
Health endpoint returning error status codes
Grace period too short for application startup

To investigate, check the logs: convox logs -s <service>

Service not receiving traffic

Check that:

Health check endpoint returns 200-399 status code
Service is actually listening on configured port
Health check path is correct

Use convox ps to check service status.
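To confirm the first three items by hand, it can help to reproduce the check from a machine that can reach the service. The sketch below issues one GET with a timeout and applies the 200-399 rule; `probe` and `passes` are hypothetical helpers, the stand-in test server exists only so the example runs on its own, and redirect handling in the real checker may differ from Go's default client.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// passes reports whether a status code falls in the 200-399 range.
func passes(code int) bool {
	return code >= 200 && code <= 399
}

// probe issues one GET with the given timeout and reports the outcome.
func probe(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("request failed: %w", err)
	}
	defer resp.Body.Close()
	if !passes(resp.StatusCode) {
		return fmt.Errorf("unhealthy status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	// Stand-in server for the demo; point probe at a real URL instead.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/health" {
			http.NotFound(w, r)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	fmt.Println(probe(srv.URL+"/health", 3*time.Second)) // prints <nil>
	fmt.Println(probe(srv.URL+"/wrong", 3*time.Second))  // prints unhealthy status 404
}
```

A 404 on the configured path, as in the second call, usually points to a mismatch between the health path in the configuration and the route the application actually serves.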

gRPC health checks failing

Verify:

port uses grpc: prefix
gRPC health protocol is implemented
Service is actually listening on the gRPC port
Health service is registered correctly

Version Requirements

Basic health checks

All versions

Liveness checks

All versions

Startup probes

Version 3.19.7+

gRPC health checks

All versions

Related Documentation

Load Balancers

Configure traffic routing to healthy services

Service

Complete service configuration reference

Deployment

Understanding deployment and health checks
