Unhealthy and automatically terminates and replaces it.
How Health Checking Works
Agones uses a streaming health check model where your game server sends periodic health pings to the sidecar:
SDK Establishes Stream
When your game server connects to the SDK, it opens a gRPC health check stream to the sidecar.
Regular Health Pings
Your game server sends Health() calls at regular intervals (typically every 2-5 seconds).
Sidecar Monitors
The sidecar tracks the time since the last health ping. If no ping is received within the configured period, it reports a failure.
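To make the monitoring step concrete, here is a minimal Python sketch of the behavior described above. It is a simplified model for reasoning about timing, not the sidecar's actual implementation: it assumes one check per period, that a check fails when no ping arrived during that period, and that failures must be consecutive. The function name and its parameters are hypothetical.

```python
from typing import Iterable, Optional


def first_unhealthy_time(
    ping_times: Iterable[float],
    period_seconds: float,
    failure_threshold: int,
    initial_delay_seconds: float = 0.0,
    horizon_seconds: float = 300.0,
) -> Optional[float]:
    """Return the time (in seconds) at which the modeled server is marked
    Unhealthy, or None if it stays healthy up to horizon_seconds."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")

    pings = sorted(ping_times)
    consecutive_failures = 0
    check_number = 1
    while True:
        # Checks run once per period after the initial delay has passed.
        check_time = initial_delay_seconds + check_number * period_seconds
        if check_time > horizon_seconds:
            return None
        window_start = check_time - period_seconds
        # A check passes if at least one ping arrived during this period.
        if any(window_start < p <= check_time for p in pings):
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return check_time
        check_number += 1
```

For example, under this model a server that pings every 2 seconds and then stops at t=20 is marked Unhealthy at t=35 when the period is 5 seconds and the threshold is 3.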

The Health() Method
All Agones SDKs provide a Health() method that sends a health ping to the sidecar:
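The SDK-specific Health() calls are not reproduced here. As a language-neutral illustration, the sketch below sends a health ping through the sidecar's local REST interface instead of an SDK. It assumes the sidecar accepts `POST /health` with an empty JSON body on the port given by the `AGONES_SDK_HTTP_PORT` environment variable, with 9358 used as an assumed fallback. The function names are hypothetical; verify the endpoint against the Agones REST documentation for your version.

```python
import os
import urllib.request
from typing import Optional


def build_health_request(port: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one health ping."""
    if port is None:
        # Assumption: the sidecar's HTTP port is exposed via this variable.
        port = int(os.environ.get("AGONES_SDK_HTTP_PORT", "9358"))
    return urllib.request.Request(
        url=f"http://localhost:{port}/health",
        data=b"{}",  # empty JSON object as the request body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_health_ping(port: Optional[int] = None, timeout: float = 2.0) -> bool:
    """Send one health ping; raises on connection errors or HTTP failures."""
    with urllib.request.urlopen(build_health_request(port), timeout=timeout) as resp:
        return resp.status == 200
```

Building the request separately from sending it keeps the ping logic testable without a running sidecar.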
Health Configuration
Configure health checking in your GameServer manifest under spec.health:
gameserver.yaml
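A minimal fragment showing where the health settings live, using the four fields described in this section. The metadata name is a placeholder, the values are illustrative rather than recommended, and the rest of the manifest (ports, container template) is omitted.

```yaml
apiVersion: agones.dev/v1
kind: GameServer
metadata:
  name: example-game-server   # hypothetical name
spec:
  health:
    disabled: false           # keep health checking enabled
    periodSeconds: 5          # check for a ping every 5 seconds
    failureThreshold: 3       # 3 consecutive misses mark the server Unhealthy
    initialDelaySeconds: 5    # wait 5 seconds after container start
  # ports and template omitted for brevity
```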
Configuration Options
disabled
Whether health checking is disabled. If true, the server will never be marked as Unhealthy due to missing health pings.
periodSeconds
How often (in seconds) to check if a health ping was received. If no ping is received within this period, it counts as one failure.
Recommendation: Set this to 2-3x your health ping interval. If you send pings every 2 seconds, use periodSeconds: 5 or periodSeconds: 6.
failureThreshold
Number of consecutive failed health checks before marking the GameServer as Unhealthy.
Total grace period = periodSeconds × failureThreshold
Example: With periodSeconds: 5 and failureThreshold: 3, your server has 15 seconds to send a health ping before being marked Unhealthy.
initialDelaySeconds
How long (in seconds) to wait after the container starts before beginning health checks. This gives your server time to initialize.
Important: If your server doesn’t call Ready() within initialDelaySeconds + (periodSeconds × failureThreshold), it will be marked Unhealthy.
Configuration Examples
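The grace-period arithmetic from the configuration options, and the totals quoted for each example configuration, can be checked with a small helper. This is an illustrative Python sketch; `HealthConfig` and `recommended_period_range` are hypothetical names, not part of any Agones SDK.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HealthConfig:
    period_seconds: int
    failure_threshold: int
    initial_delay_seconds: int = 0

    @property
    def grace_period_seconds(self) -> int:
        # periodSeconds x failureThreshold
        return self.period_seconds * self.failure_threshold

    @property
    def startup_grace_seconds(self) -> int:
        # initialDelaySeconds + (periodSeconds x failureThreshold)
        return self.initial_delay_seconds + self.grace_period_seconds


def recommended_period_range(ping_interval_seconds: float) -> Tuple[float, float]:
    """Return the 2x-3x range suggested for periodSeconds."""
    return (2 * ping_interval_seconds, 3 * ping_interval_seconds)
```

For instance, `HealthConfig(5, 3).grace_period_seconds` gives 15, matching the failureThreshold example above.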
Fast Health Checking
For servers that initialize quickly and need fast failure detection:
- Health pings expected every 3 seconds
- Marked Unhealthy after 2 failures (6 seconds)
- 5 seconds grace period at startup
- Total grace time: 5 + (3 × 2) = 11 seconds
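The values listed above correspond to a spec.health block along these lines (a fragment only; the rest of the GameServer manifest is omitted):

```yaml
spec:
  health:
    periodSeconds: 3
    failureThreshold: 2
    initialDelaySeconds: 5
```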
Slow Initialization
For servers that take time to load assets or initialize:
- Health pings expected every 5 seconds
- Marked Unhealthy after 3 failures (15 seconds)
- 30 seconds grace period at startup
- Total grace time: 30 + (5 × 3) = 45 seconds
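Expressed as a manifest fragment, the slow-initialization settings above would look roughly like this:

```yaml
spec:
  health:
    periodSeconds: 5
    failureThreshold: 3
    initialDelaySeconds: 30
```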
Relaxed Health Checking
For stable servers where occasional delays are acceptable:
- Health pings expected every 10 seconds
- Marked Unhealthy after 5 failures (50 seconds)
- 10 seconds grace period at startup
- Total grace time: 10 + (10 × 5) = 60 seconds
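A fragment matching the relaxed settings above:

```yaml
spec:
  health:
    periodSeconds: 10
    failureThreshold: 5
    initialDelaySeconds: 10
```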
Disabled Health Checking
- Your server never needs to send health pings
- It will never be marked Unhealthy
- You’re responsible for detecting and handling server failures
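Disabling health checking comes down to the single disabled field described under Configuration Options. A fragment, assuming no other health settings are needed:

```yaml
spec:
  health:
    disabled: true
```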
Best Practices
Send Regular Pings
Send health pings at regular intervals (every 2-5 seconds). Don’t wait for the health check period to elapse.
Start Early
Start health checking immediately after SDK connection, even before calling Ready().
Use Background Tasks
Run health checking in a separate thread/goroutine/task so it doesn’t block your game logic.
Handle Errors
Log errors when health pings fail. This can indicate SDK connection issues.
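The practices above can be combined into one small background loop. The sketch below is illustrative Python: `HealthPinger` is a hypothetical helper, and `send_ping` is a placeholder callable standing in for your SDK's Health() call. It pings at a fixed interval on its own thread, logs failures instead of crashing, and can be started as soon as the SDK connection exists.

```python
import logging
import threading
from typing import Callable

log = logging.getLogger("health")


class HealthPinger:
    """Sends health pings on a background thread at a fixed interval."""

    def __init__(self, send_ping: Callable[[], None], interval_seconds: float = 2.0):
        self._send_ping = send_ping
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._run, name="health-pinger", daemon=True
        )

    def start(self) -> None:
        # Call right after the SDK connects, before long initialization work.
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            try:
                self._send_ping()
            except Exception:
                # Log and keep going; a failed ping may indicate SDK
                # connection issues but should not kill the loop.
                log.exception("health ping failed")
            # Sleeps for the interval, but wakes immediately on stop().
            self._stop.wait(self._interval)
```

Using `Event.wait` rather than `time.sleep` lets `stop()` return promptly instead of waiting out the remainder of an interval.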
Recommended Health Ping Interval
Advanced Patterns
Conditional Health Checking
Only send health pings when the server is in a healthy state. This pattern is useful when you want to intentionally fail health checks to trigger server replacement on critical errors.
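A minimal Python sketch of the conditional pattern, assuming a hypothetical `send_ping` callable in place of the SDK's Health() call. Once a critical error is reported, `tick()` stops pinging, so the missed pings eventually cause the server to be marked Unhealthy.

```python
from typing import Callable, Optional


class ConditionalPinger:
    """Pings only while no critical error has been reported."""

    def __init__(self, send_ping: Callable[[], None]):
        self._send_ping = send_ping
        self._critical_error: Optional[str] = None

    def report_critical_error(self, reason: str) -> None:
        # From now on, health pings are withheld on purpose.
        self._critical_error = reason

    def tick(self) -> bool:
        """Call once per ping interval. Returns True if a ping was sent."""
        if self._critical_error is not None:
            return False
        self._send_ping()
        return True
```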
Health Checking with Retries
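One way to sketch a retry wrapper in Python, assuming a hypothetical `send_ping` callable that raises on failure. The attempt count and backoff values are illustrative, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("health")


def ping_with_retries(
    send_ping: Callable[[], None],
    max_attempts: int = 3,
    backoff_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a health ping up to max_attempts times. Returns True on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            send_ping()
            return True
        except Exception as exc:
            log.warning("health ping attempt %d/%d failed: %s",
                        attempt, max_attempts, exc)
            if attempt < max_attempts:
                # Linear backoff: 0.5s, 1.0s, ... with the defaults.
                sleep(backoff_seconds * attempt)
    return False
```

Keep the total retry time well below periodSeconds; otherwise the retries themselves can cause a missed check.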
Retry failed health pings before giving up, so that a transient SDK error does not turn into a missed ping.
Monitoring Health Check Status
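A small sketch of in-process counters for health pings, written in Python with hypothetical names. `send_ping` stands in for the SDK's Health() call and is assumed to raise on failure. Exporting these counters to a metrics system is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HealthPingMetrics:
    successes: int = 0
    failures: int = 0
    consecutive_failures: int = 0

    def record(self, ok: bool) -> None:
        if ok:
            self.successes += 1
            self.consecutive_failures = 0
        else:
            self.failures += 1
            self.consecutive_failures += 1

    @property
    def failure_rate(self) -> float:
        total = self.successes + self.failures
        return self.failures / total if total else 0.0


def tracked_ping(send_ping: Callable[[], None], metrics: HealthPingMetrics) -> bool:
    """Send one ping and record the outcome. Returns True on success."""
    try:
        send_ping()
    except Exception:
        metrics.record(False)
        return False
    metrics.record(True)
    return True
```

A rising `consecutive_failures` value is the signal to watch, since consecutive misses are what the failure threshold counts.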
Track health check success and failure metrics.
Troubleshooting
Server marked as Unhealthy immediately
Symptoms: Server transitions to Unhealthy right after starting.
Causes:
- initialDelaySeconds is too short
- Server initialization takes longer than expected
- Health pings not starting before Ready() call
Solutions:
- Increase initialDelaySeconds to give more startup time
- Start health checking immediately after SDK connection
- Ensure health check loop starts before long initialization tasks
Intermittent Unhealthy status
Symptoms: Server occasionally becomes Unhealthy but recovers.
Causes:
- Health ping interval too slow for configured periodSeconds
- Network delays between container and sidecar
- Game logic blocking health check thread
Solutions:
- Increase periodSeconds to allow more time
- Increase failureThreshold for more tolerance
- Send health pings more frequently
- Ensure health checking runs in separate thread/task
Health pings failing with SDK errors
Symptoms: Health() calls return errors.
Causes:
- SDK not connected to sidecar
- Sidecar container not running
- Network issues in pod
- gRPC stream closed
Solutions:
- Verify SDK connection before starting health checks
- Check that GameServer manifest includes Agones sidecar
- Examine pod logs for sidecar errors
- Implement retry logic for failed health pings
Server never becomes Unhealthy despite crashes
Symptoms: Crashed server stays in Ready/Allocated state.
Causes:
- Health checking disabled in spec
- Health check loop continues after crash (unlikely)
- Container doesn’t exit on crash
Solutions:
- Ensure spec.health.disabled is false or omitted
- Configure container to exit on critical errors
- Add liveness probes if needed
Health Checking vs Kubernetes Probes
Agones health checking is separate from Kubernetes liveness and readiness probes:
Next Steps
Lifecycle Management
Learn about Ready, Shutdown, and state transitions
SDK Overview
Explore other SDK features
Troubleshooting
Debug common integration issues
Metrics
Monitor server health with metrics
