Architecture
Multi-region monitoring works by deploying multiple scheduler instances, one per region. Each scheduler:
- Runs all monitors on schedule
- Tags check results with its region
- Evaluates alert conditions based on region thresholds
Deploying Schedulers in Multiple Regions
PONGO_REGION Environment Variable
The PONGO_REGION variable identifies each scheduler instance. The region value:
- Tags all check results in the database
- Appears in webhook payloads
- Controls alert threshold logic
- Displays in the dashboard UI
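As a rough illustration of the tagging step, the sketch below reads PONGO_REGION from an environment-like object and attaches it to a check result. The `CheckResult` shape, the helper names, and the `"default"` fallback are assumptions made for this example, not the scheduler's actual schema or behavior.

```typescript
// Hypothetical shape of a check result; the real schema may differ.
interface CheckResult {
  monitorId: string;
  ok: boolean;
  checkedAt: string; // ISO 8601 timestamp
}

interface RegionalCheckResult extends CheckResult {
  region: string;
}

// Resolve the region from an environment-like object.
// Falling back to "default" is an assumption made for this sketch.
function resolveRegion(env: Record<string, string | undefined>): string {
  const value = env["PONGO_REGION"]?.trim();
  return value ? value : "default";
}

// Attach the scheduler's region to a result before it is stored or sent.
function tagWithRegion(result: CheckResult, region: string): RegionalCheckResult {
  return { ...result, region };
}

const region = resolveRegion({ PONGO_REGION: "us-east" });
const tagged = tagWithRegion(
  { monitorId: "api", ok: false, checkedAt: "2024-01-01T00:00:00.000Z" },
  region,
);
console.log(tagged.region); // "us-east"
```

Passing the environment in as a parameter keeps the helper easy to test; a real scheduler would more likely read the variable once at startup.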
Region Thresholds on Alerts
Use the regionThreshold setting to configure how many regions must trigger an alert before notifications are sent.
Threshold Options
| Value | Description | Example (3 regions) |
|---|---|---|
| "any" | Fire if any region triggers (default) | 1 region fails → alert fires |
| "majority" | Fire if >50% of regions trigger | 2 regions fail → alert fires |
| "all" | Fire only if all regions trigger | 3 regions fail → alert fires |
| number | Fire if N or more regions trigger | N = 2: 2 regions fail → alert fires |
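The threshold rules in the table can be expressed as a small predicate. This is a minimal sketch of the semantics described above, not the scheduler's actual implementation; the `RegionThreshold` type and `shouldFire` function are names invented for the example.

```typescript
type RegionThreshold = "any" | "majority" | "all" | number;

// Decide whether an alert should notify, given how many regions are firing.
function shouldFire(
  threshold: RegionThreshold,
  firingCount: number,
  totalRegions: number,
): boolean {
  if (totalRegions <= 0) return false;
  switch (threshold) {
    case "any":
      return firingCount >= 1;
    case "majority":
      // Strictly more than 50% of regions.
      return firingCount * 2 > totalRegions;
    case "all":
      return firingCount === totalRegions;
    default:
      // Numeric threshold: N or more regions. Values below 1 are treated
      // as 1 here, which is an assumption of this sketch.
      return firingCount >= Math.max(1, threshold);
  }
}

const total = 3;
console.log(shouldFire("any", 1, total));      // true
console.log(shouldFire("majority", 1, total)); // false
console.log(shouldFire("majority", 2, total)); // true
console.log(shouldFire("all", 2, total));      // false
console.log(shouldFire(2, 2, total));          // true
```

With only two regions, "majority" and "all" behave identically, because more than 50% of two regions means both.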
Examples
Alert on Any Region Failure
Default behavior ("any"): immediate alerting as soon as a single region fails.
Alert on Majority Failure
Use "majority" to reduce false positives from regional network issues.
Alert on All Regions
Use "all" to alert only on complete global outages.
Alert on Specific Count
Use a number to set a custom threshold for fine-grained control.
Alert Payload Region Fields
Webhook payloads include region information.
Region Fields
| Field | Type | Description |
|---|---|---|
| region | string | Region that triggered this webhook |
| firingRegions | string[] | All regions where alert is firing |
| healthyRegions | string[] | Regions where monitor is healthy |
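A webhook receiver could type these fields and turn them into a one-line summary, as in the sketch below. Only the three field names come from the table; the `RegionFields` interface, the `summarizeRegions` helper, and the region names are illustrative. The sketch also assumes that firingRegions and healthyRegions together cover every reporting region.

```typescript
// Only the three region fields are modeled here; a real payload carries
// additional fields that this sketch omits.
interface RegionFields {
  region: string;
  firingRegions: string[];
  healthyRegions: string[];
}

// Build a short, human-readable summary for logs or chat messages.
function summarizeRegions(payload: RegionFields): string {
  const firing = payload.firingRegions.length;
  // Assumption: firing + healthy regions make up all reporting regions.
  const total = firing + payload.healthyRegions.length;
  return (
    `${firing}/${total} regions firing ` +
    `(reported by ${payload.region}): ${payload.firingRegions.join(", ")}`
  );
}

// Example region names are hypothetical.
const summary = summarizeRegions({
  region: "us-east",
  firingRegions: ["us-east", "eu-west"],
  healthyRegions: ["ap-south"],
});
console.log(summary);
// "2/3 regions firing (reported by us-east): us-east, eu-west"
```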
Deployment Examples
Fly.io Multi-Region
Deploy schedulers to multiple Fly.io regions:
Docker Compose Multi-Region
Kubernetes Multi-Region
Best Practices
Region Selection
- Deploy to regions where your users are located
- Include at least 3 regions so that a "majority" threshold is meaningful (with only 2 regions, a majority requires both to fail)
- Consider latency between scheduler and monitored service
- Test from regions with different network paths
Alert Configuration
- Use "any" for critical, user-facing services (fast alerting)
- Use "majority" for internal services (reduce false positives)
- Use "all" for alerts about global infrastructure
- Combine multiple alerts with different thresholds
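One way to picture combining thresholds is a set of alerts on the same monitor that escalate as more regions fail. The sketch below is hypothetical: only regionThreshold and consecutiveFailures are names used on this page, while the `AlertSketch` shape, the `id` and `channel` fields, and the `notifyingAlerts` helper are invented for illustration and do not reflect the real configuration format.

```typescript
type RegionThreshold = "any" | "majority" | "all" | number;

// Hypothetical alert shape; only regionThreshold and consecutiveFailures
// are names taken from this page.
interface AlertSketch {
  id: string;
  channel: string;
  consecutiveFailures: number;
  regionThreshold: RegionThreshold;
}

const alerts: AlertSketch[] = [
  { id: "api-degraded", channel: "chat", consecutiveFailures: 2, regionThreshold: "any" },
  { id: "api-outage", channel: "pager", consecutiveFailures: 2, regionThreshold: "majority" },
  { id: "api-global-outage", channel: "incident", consecutiveFailures: 3, regionThreshold: "all" },
];

// Check a single threshold against the number of firing regions.
function meets(threshold: RegionThreshold, firing: number, total: number): boolean {
  if (threshold === "any") return firing >= 1;
  if (threshold === "majority") return firing * 2 > total;
  if (threshold === "all") return total > 0 && firing === total;
  return firing >= threshold;
}

// List the ids of alerts that would notify for a given failure spread.
function notifyingAlerts(list: AlertSketch[], firing: number, total: number): string[] {
  return list.filter((a) => meets(a.regionThreshold, firing, total)).map((a) => a.id);
}

console.log(notifyingAlerts(alerts, 1, 3)); // ["api-degraded"]
console.log(notifyingAlerts(alerts, 2, 3)); // ["api-degraded", "api-outage"]
```

A single-region failure reaches only the low-urgency channel, while a wider failure also pages someone.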
Database Configuration
- Use PostgreSQL for multi-region deployments (better concurrency)
- Ensure database is accessible from all scheduler regions
- Consider connection pooling (e.g., PgBouncer) for many schedulers
- Monitor database latency from each region
Monitoring Regions
Check scheduler health endpoints to verify region deployment.
Troubleshooting
Regions Not Appearing
If regions aren’t showing up in the dashboard:
- Verify PONGO_REGION is set correctly
- Check scheduler logs for startup messages
- Confirm all schedulers connect to the same database
- Verify schedulers are running and executing monitors
Alert Not Firing
If multi-region alerts aren’t triggering:
- Check that enough regions meet the threshold
- Verify regionThreshold configuration
- Review alert conditions (e.g., consecutiveFailures)
- Check webhook payload for firingRegions and healthyRegions
Inconsistent Results
If different regions show different results:
- This is expected: regions may see different network conditions
- Adjust regionThreshold to account for variability
- Increase consecutiveFailures to smooth out transient issues
- Consider regional CDN or load balancer behavior
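To see why a higher consecutiveFailures value smooths out transient issues, consider the sketch below. It assumes the condition is met only when the most recent N checks in a region all failed. That reading is inferred from the setting's name, and the `conditionMet` helper is hypothetical.

```typescript
// `checks` holds recent results for one region, oldest first (true = passed).
// Returns true once the most recent `consecutiveFailures` checks all failed.
function conditionMet(checks: boolean[], consecutiveFailures: number): boolean {
  if (consecutiveFailures < 1 || checks.length < consecutiveFailures) return false;
  return checks.slice(-consecutiveFailures).every((passed) => !passed);
}

const blip = [true, true, false];            // a single transient failure
const outage = [true, false, false, false];  // a sustained failure

console.log(conditionMet(blip, 1));   // true: fires on the first failure
console.log(conditionMet(blip, 3));   // false: the blip is ignored
console.log(conditionMet(outage, 3)); // true: sustained failure still fires
```

The trade-off is detection time: a threshold of 3 needs three failed check intervals before the region counts as firing.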