Overview
Carrier provides built-in health check monitoring for webhook endpoints. When enabled, Carrier will:
- Wait for the webhook endpoint to come online before processing messages
- Continuously monitor the webhook’s health status
- Automatically exit if the webhook goes offline, allowing orchestration systems like Kubernetes to restart the container
Configuration
Health checks are configured through environment variables:

| Variable | Required | Default | Description |
|---|---|---|---|
| CARRIER_WEBHOOK_HEALTH_CHECK_ENDPOINT | No | - | Enables health checks when set. Should be a full URL to your health check endpoint |
| CARRIER_WEBHOOK_HEALTH_CHECK_INTERVAL | No | 60s | Time interval between health checks |
| CARRIER_WEBHOOK_HEALTH_CHECK_TIMEOUT | No | 10s | Timeout for each health check request |
| CARRIER_WEBHOOK_OFFLINE_THRESHOLD_COUNT | No | 5 | Number of consecutive failed checks before marking webhook as offline |
All time duration values support Go’s time.ParseDuration() format (e.g., 30s, 2m, 1h30m).
Example Configuration
Docker Compose
Kubernetes
How It Works
Health Check Mechanism
The health checker (transmitter/webhook/checker.go:76-93) performs periodic GET requests to the configured endpoint.
The health checker only considers HTTP 200 responses as healthy. Any other status code or network error counts as a failed check.
State Transitions
Carrier manages two endpoint states, offline and online, defined in transmitter/webhook/checker.go:13-16.
Startup Sequence
- Initial State: Carrier starts with the endpoint in EndpointStateOffline
- Waiting: Health checks run continuously until the endpoint returns HTTP 200
- Online: Once healthy, Carrier logs “webhook online” and begins processing messages from SQS
- Ready: The message “carrier has arrived” indicates the system is fully operational
Runtime Monitoring
Once online, the health checker:
- Resets the offline counter on each successful check
- Increments the counter on each failed check
- Marks the endpoint offline after reaching the threshold
- Signals Carrier to exit, triggering a container restart
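The counter logic in the list above can be sketched as a small state tracker. EndpointStateOffline is the name given in the source; EndpointStateOnline, the failureTracker type, and its record method are assumed names for illustration and may differ from Carrier's code.

```go
package main

import "fmt"

// EndpointState models the two endpoint states. Only EndpointStateOffline is
// named in the documentation; EndpointStateOnline is an assumed counterpart.
type EndpointState int

const (
	EndpointStateOffline EndpointState = iota
	EndpointStateOnline
)

// failureTracker counts consecutive failed checks for an endpoint that is
// currently online.
type failureTracker struct {
	threshold int // e.g. CARRIER_WEBHOOK_OFFLINE_THRESHOLD_COUNT
	failures  int
}

// record applies one check result and returns the resulting state.
// A success resets the counter; a failure increments it, and reaching the
// threshold marks the endpoint offline.
func (t *failureTracker) record(healthy bool) EndpointState {
	if healthy {
		t.failures = 0
		return EndpointStateOnline
	}
	t.failures++
	if t.failures >= t.threshold {
		return EndpointStateOffline
	}
	return EndpointStateOnline
}

func main() {
	tracker := &failureTracker{threshold: 5}
	// Two failures, one success (which resets the counter), then five failures.
	results := []bool{false, false, true, false, false, false, false, false}
	for i, healthy := range results {
		if tracker.record(healthy) == EndpointStateOffline {
			fmt.Printf("offline after check %d; Carrier would exit here\n", i+1)
			return
		}
	}
	fmt.Println("still online")
}
```

Because the success on the third check resets the counter, the endpoint goes offline on the eighth check rather than the fifth. With the default 60s interval and threshold of 5, a hard outage is detected after roughly five minutes.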
Implementing Health Check Endpoints
Best Practices
Keep It Simple
Health checks should be fast and lightweight. Avoid database queries or external API calls.
Check Dependencies
Verify that critical dependencies your webhook needs are available.
Return Quickly
Respond within the configured timeout (default 10s). Faster is better.
Use Standard Codes
Return HTTP 200 for healthy, anything else for unhealthy.
Example Implementations
Monitoring with StatLogger
When health checks are enabled, you can monitor their operation using Carrier’s statistics logging feature, which is enabled through a configuration setting. The StatLogger (main.go:42-78) tracks goroutines and memory usage.
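For readers unfamiliar with how a Go program observes these figures, the sketch below collects a goroutine count and heap usage from the standard runtime package. It is not Carrier's StatLogger; the runtimeStats type, the snapshot function, and the log line format are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeStats is an illustrative snapshot of the two figures mentioned above.
type runtimeStats struct {
	Goroutines     int
	HeapAllocBytes uint64
}

// snapshot reads the current goroutine count and heap allocation from the
// Go runtime.
func snapshot() runtimeStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return runtimeStats{
		Goroutines:     runtime.NumGoroutine(),
		HeapAllocBytes: m.HeapAlloc,
	}
}

func main() {
	s := snapshot()
	fmt.Printf("goroutines=%d heap_alloc_bytes=%d\n", s.Goroutines, s.HeapAllocBytes)
}
```

A goroutine count that grows steadily between log lines is a common sign of a leak, which is why it is worth watching alongside memory.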
Troubleshooting
Carrier exits immediately on startup
- Verify your health check endpoint is accessible from Carrier
- Check that the endpoint returns HTTP 200
- Review logs for connection errors
- Ensure the endpoint URL is correct (protocol, host, port, path)
Health checks are too sensitive
Increase CARRIER_WEBHOOK_OFFLINE_THRESHOLD_COUNT to allow more consecutive failures before marking the endpoint offline. The default is 5 failures.
Health checks are not frequent enough
Decrease the CARRIER_WEBHOOK_HEALTH_CHECK_INTERVAL value. The default is 60 seconds. For faster detection, try 30s or 15s.
Health checks timing out
- Check network latency between Carrier and the webhook
- Optimize your health check endpoint to respond faster
- Increase CARRIER_WEBHOOK_HEALTH_CHECK_TIMEOUT if necessary
Kubernetes Integration
When running Carrier in Kubernetes, health checks tie into pod lifecycle management: if the webhook goes offline, Carrier exits and Kubernetes restarts the container.
Related Topics
Monitoring
Learn about logging and monitoring Carrier in production
Configuration
Complete reference for all Carrier environment variables
