Scheduler Setup

The scheduler is a standalone Bun process that runs monitors on a schedule, evaluates alert conditions, and dispatches webhooks with retry logic.

Overview

The scheduler:

Executes monitors based on their interval or cron schedule
Evaluates alert conditions and fires/resolves alerts
Dispatches webhooks to notification channels with exponential backoff retry
Provides an HTTP API for health checks and manual triggers
Supports multi-region deployments

Running the Scheduler

Standalone Process

Start the scheduler in a separate terminal from your main application:

bun scheduler

The scheduler will:

Connect to your database (using DATABASE_URL)
Load all monitor configurations from pongo/monitors/
Start the HTTP API server (default port 3001)
Begin executing monitors according to their schedules

Docker

The Docker entrypoint can auto-start the scheduler:

docker run -e SCHEDULER_ENABLED=true pongo

Or run the scheduler as a dedicated service in docker-compose.yml:

docker-compose.yml

services:
  scheduler:
    build: .
    command: ["bun", "scheduler"]
    environment:
      DATABASE_URL: postgres://postgres:password@db:5432/pongo
      PONGO_REGION: us-east
      SCHEDULER_PORT: 3001
    depends_on: [db]

Fly.io

The included fly.toml and Dockerfile handle scheduler deployment:

fly secrets set SCHEDULER_ENABLED=true
fly deploy

HTTP API

The scheduler exposes an HTTP API for monitoring and control.

Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/health` | GET | Health check with region info and uptime |
| `/monitors` | GET | List all monitors with current state |
| `/monitors/:id` | GET | Get single monitor state and statistics |
| `/monitors/:id/trigger` | POST | Trigger a specific monitor immediately |
| `/monitors/trigger` | POST | Trigger all monitors immediately |

Health Check Example

curl http://localhost:3001/health

{
  "status": "healthy",
  "region": "us-east",
  "uptime": 86400,
  "monitorsLoaded": 12
}
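
A client that polls this endpoint may want to validate the payload before trusting it. The following is a minimal sketch, assuming the response always has the four fields shown in the sample; `parseHealth` and `isReady` are hypothetical helper names, and reading `uptime` as seconds is a guess based on the sample value.

```typescript
// Shape of the /health response, inferred from the sample above.
interface SchedulerHealth {
  status: string;
  region: string;
  uptime: number; // assumed to be seconds, judging by the sample value
  monitorsLoaded: number;
}

// Hypothetical helper: validates an unknown JSON payload against that shape.
function parseHealth(payload: unknown): SchedulerHealth {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("health payload is not an object");
  }
  const p = payload as Record<string, unknown>;
  const status = p["status"];
  const region = p["region"];
  const uptime = p["uptime"];
  const monitorsLoaded = p["monitorsLoaded"];
  if (typeof status !== "string" || typeof region !== "string") {
    throw new Error("health payload is missing status or region");
  }
  if (typeof uptime !== "number" || typeof monitorsLoaded !== "number") {
    throw new Error("health payload is missing uptime or monitorsLoaded");
  }
  return { status, region, uptime, monitorsLoaded };
}

// Hypothetical policy: ready means "healthy" with at least one monitor loaded.
function isReady(health: SchedulerHealth): boolean {
  return health.status === "healthy" && health.monitorsLoaded > 0;
}
```

In practice the payload would come from something like `await (await fetch("http://localhost:3001/health")).json()`, passed straight into `parseHealth`.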

Manual Trigger Example

curl -X POST http://localhost:3001/monitors/api-health/trigger

{
  "success": true,
  "monitorId": "api-health",
  "triggeredAt": "2025-12-08T10:00:00Z"
}
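
The same trigger can be issued from code. This is a rough sketch, assuming the routes listed in the endpoint table and the response shape shown above; `triggerUrl`, `triggerMonitor`, and the `PostFn` type are hypothetical names, and the response of the trigger-all route is not documented here, so only the single-monitor case is typed.

```typescript
// Builds the trigger URL for one monitor, or for all monitors when no id is given.
function triggerUrl(baseUrl: string, monitorId?: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return monitorId === undefined
    ? `${base}/monitors/trigger`
    : `${base}/monitors/${encodeURIComponent(monitorId)}/trigger`;
}

// Shape of the single-monitor trigger response, inferred from the sample above.
interface TriggerResult {
  success: boolean;
  monitorId: string;
  triggeredAt: string; // ISO 8601 timestamp
}

// Minimal fetch-like signature so the helper can be exercised without a network.
type PostFn = (
  url: string,
  init: { method: "POST" },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Sends the POST and returns the parsed body, failing loudly on non-2xx responses.
async function triggerMonitor(
  post: PostFn,
  baseUrl: string,
  monitorId: string,
): Promise<TriggerResult> {
  const res = await post(triggerUrl(baseUrl, monitorId), { method: "POST" });
  if (!res.ok) {
    throw new Error(`trigger failed with HTTP ${res.status}`);
  }
  return (await res.json()) as TriggerResult;
}
```

With a runtime that provides a global `fetch`, a call would look like `await triggerMonitor(fetch, "http://localhost:3001", "api-health")`.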

Environment Variables

Core Configuration

SCHEDULER_PORT (number, default: 3001)

HTTP API server port.

SCHEDULER_PORT=3001

SCHEDULER_ENABLED (boolean, default: false)

Auto-start scheduler in Docker entrypoint.

SCHEDULER_ENABLED=true

SCHEDULER_URL (string)

Scheduler service URL. Set this in your main app to enable manual monitor runs from the dashboard UI.

SCHEDULER_URL=http://scheduler:3001

ENABLE_MANUAL_RUN (boolean, default: false)

Show manual run button in dashboard UI. Requires SCHEDULER_URL to be set.

ENABLE_MANUAL_RUN=true
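
The dependency between ENABLE_MANUAL_RUN and SCHEDULER_URL, and the port default, can be checked at startup. The sketch below is illustrative only: `resolvePort` and `resolveManualRun` are hypothetical helpers, treating only the literal string "true" as enabled is an assumption, and whether Pongo itself throws or simply hides the button when SCHEDULER_URL is missing is not stated here.

```typescript
// Environment passed in explicitly so the helpers are easy to test.
type Env = Record<string, string | undefined>;

// Parses SCHEDULER_PORT, falling back to the documented default of 3001.
function resolvePort(env: Env): number {
  const raw = env["SCHEDULER_PORT"];
  if (raw === undefined || raw === "") {
    return 3001;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid SCHEDULER_PORT: ${raw}`);
  }
  return port;
}

interface ManualRunConfig {
  schedulerUrl: string | undefined;
  manualRunEnabled: boolean;
}

// Enforces the documented rule that ENABLE_MANUAL_RUN requires SCHEDULER_URL.
// Throwing on a missing URL is this sketch's choice, made to surface the
// misconfiguration early.
function resolveManualRun(env: Env): ManualRunConfig {
  const schedulerUrl = env["SCHEDULER_URL"]?.trim() || undefined;
  const requested = env["ENABLE_MANUAL_RUN"] === "true";
  if (requested && schedulerUrl === undefined) {
    throw new Error("ENABLE_MANUAL_RUN=true requires SCHEDULER_URL to be set");
  }
  return { schedulerUrl, manualRunEnabled: requested };
}
```

In a Bun or Node process both helpers would typically be called with `process.env`.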

Concurrency and Performance

SCHEDULER_MAX_CONCURRENCY (number, default: 10)

Maximum number of monitors to execute in parallel. Increase for large monitor fleets:

SCHEDULER_MAX_CONCURRENCY=50
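
To get a feel for when the default is too low, a back-of-the-envelope estimate can help. This is a deliberately simplified model, not Pongo's actual scheduling algorithm: it assumes every check takes the same time and that checks run in full waves of `concurrency`. Both function names are hypothetical.

```typescript
// Estimated time to run every monitor once under the simplified wave model.
function estimateSweepMs(
  monitorCount: number,
  concurrency: number,
  avgCheckMs: number,
): number {
  if (concurrency < 1) {
    throw new Error("concurrency must be at least 1");
  }
  const waves = Math.ceil(monitorCount / concurrency);
  return waves * avgCheckMs;
}

// Smallest concurrency that fits one full sweep into a single interval,
// under the same model.
function minConcurrency(
  monitorCount: number,
  avgCheckMs: number,
  intervalMs: number,
): number {
  const wavesPerInterval = Math.floor(intervalMs / avgCheckMs);
  if (wavesPerInterval < 1) {
    throw new Error("a single check takes longer than the interval");
  }
  return Math.ceil(monitorCount / wavesPerInterval);
}
```

For example, 200 monitors averaging 2000 ms each give an estimated sweep of 40000 ms at a concurrency of 10 and 8000 ms at 50. Under this model, fitting them into a one-minute interval needs a concurrency of at least 7.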

Retry Configuration

SCHEDULER_MAX_RETRIES (number, default: 3)

Number of retry attempts for failed monitor checks.

SCHEDULER_MAX_RETRIES=5

SCHEDULER_RETRY_DELAY_MS (number, default: 5000)

Base retry delay in milliseconds. Uses exponential backoff:

Attempt 1: SCHEDULER_RETRY_DELAY_MS
Attempt 2: SCHEDULER_RETRY_DELAY_MS * 2
Attempt 3: SCHEDULER_RETRY_DELAY_MS * 4

SCHEDULER_RETRY_DELAY_MS=10000  # 10s, 20s, 40s backoff
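
The doubling schedule above can be written out as a short calculation. This sketch only reproduces the arithmetic listed in this section; the function names are hypothetical, and it does not model jitter or any cap on the delay, since neither is described here.

```typescript
// Delay before a given attempt (1-based): base * 2^(attempt - 1).
function retryDelayMs(baseDelayMs: number, attempt: number): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new Error("attempt must be a positive integer");
  }
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Full list of delays for a given SCHEDULER_MAX_RETRIES value.
function backoffSchedule(baseDelayMs: number, maxRetries: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    delays.push(retryDelayMs(baseDelayMs, attempt));
  }
  return delays;
}

// Total time spent waiting if every attempt is used.
function totalBackoffMs(baseDelayMs: number, maxRetries: number): number {
  return backoffSchedule(baseDelayMs, maxRetries).reduce(
    (sum, delay) => sum + delay,
    0,
  );
}
```

With the defaults (5000 ms base, 3 retries) the delays are 5 s, 10 s, and 20 s, so up to 35 s of waiting in total. Keep that total in mind relative to the monitor interval when raising either setting.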

Multi-Region Setup

Deploy multiple scheduler instances across regions for global monitoring:

# PASSWORD and DB_HOST are placeholders; every region uses the same shared database.
# Variables are set inline so they are passed to the scheduler process.

# Region 1 (US East)
DATABASE_URL=postgres://user:PASSWORD@DB_HOST:5432/pongo \
PONGO_REGION=us-east \
bun scheduler

# Region 2 (EU West)
DATABASE_URL=postgres://user:PASSWORD@DB_HOST:5432/pongo \
PONGO_REGION=eu-west \
bun scheduler

# Region 3 (Asia Pacific)
DATABASE_URL=postgres://user:PASSWORD@DB_HOST:5432/pongo \
PONGO_REGION=ap-south \
bun scheduler

Region Identifier

PONGO_REGION (string, default: "default")

Region identifier for multi-region deployments. Used in:

Alert region thresholds (regionThreshold in monitor config)
Health check responses
Check result metadata

PONGO_REGION=us-east

Region-Aware Alerts

Configure alerts to fire based on regional thresholds:

export default monitor({
  name: "API Health",
  interval: "1m",
  alerts: [
    {
      id: "api-down",
      name: "API Down",
      condition: { consecutiveFailures: 3 },
      channels: ["slack"],
      // Fire if ANY region reports failure
      regionThreshold: "any",
    },
    {
      id: "api-degraded",
      name: "API Degraded",
      condition: { status: "degraded", forChecks: 5 },
      channels: ["slack"],
      // Fire only if MAJORITY of regions report degraded
      regionThreshold: "majority",
    },
  ],
  async handler() {
    // ...
  },
});

Region threshold options:

"any" - Alert fires if any region meets condition
"all" - Alert fires only if all regions meet condition
"majority" - Alert fires if majority of regions meet condition
number - Alert fires if at least N regions meet condition
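
The four options can be read as a single predicate over how many regions currently meet the alert condition. The sketch below is one plausible interpretation, not Pongo's implementation: `regionThresholdMet` is a hypothetical name, and treating "majority" as strictly more than half of the regions is an assumption.

```typescript
type RegionThreshold = "any" | "all" | "majority" | number;

// Returns true when enough regions meet the alert condition for it to fire.
function regionThresholdMet(
  threshold: RegionThreshold,
  regionsMeeting: number,
  totalRegions: number,
): boolean {
  if (totalRegions < 1) {
    return false; // no regions reporting, nothing to fire on
  }
  switch (threshold) {
    case "any":
      return regionsMeeting >= 1;
    case "all":
      return regionsMeeting === totalRegions;
    case "majority":
      // Assumption: a strict majority, so 1 of 2 regions is not enough.
      return regionsMeeting > totalRegions / 2;
    default:
      // Numeric threshold: at least N regions.
      return regionsMeeting >= threshold;
  }
}
```

With three regions, a failure seen in two of them satisfies "any", "majority", and a numeric threshold of 2, but not "all".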

Complete Configuration Example

Development (Single Region)

.env

# Database
DATABASE_URL=file:./pongo/pongo.db

# Scheduler
SCHEDULER_PORT=3001
SCHEDULER_MAX_CONCURRENCY=10
SCHEDULER_MAX_RETRIES=3
SCHEDULER_RETRY_DELAY_MS=5000

Start both services:

bun dev              # Terminal 1 - Dashboard
bun scheduler        # Terminal 2 - Scheduler

Production (Multi-Region)

.env.us-east

# Database (shared)
DATABASE_URL=postgres://user:PASSWORD@DB_HOST:5432/pongo

# Region
PONGO_REGION=us-east

# Scheduler
SCHEDULER_PORT=3001
SCHEDULER_MAX_CONCURRENCY=50
SCHEDULER_MAX_RETRIES=5
SCHEDULER_RETRY_DELAY_MS=10000

.env.eu-west

# Database (shared)
DATABASE_URL=postgres://user:PASSWORD@DB_HOST:5432/pongo

# Region
PONGO_REGION=eu-west

# Scheduler
SCHEDULER_PORT=3001
SCHEDULER_MAX_CONCURRENCY=50
SCHEDULER_MAX_RETRIES=5
SCHEDULER_RETRY_DELAY_MS=10000

Docker Compose (Single Region)

docker-compose.yml

services:
  pongo:
    build: .
    ports: ["3000:3000"]
    environment:
      DATABASE_URL: postgres://postgres:password@db:5432/pongo
      SCHEDULER_URL: http://scheduler:3001
      ENABLE_MANUAL_RUN: "true"
      ACCESS_CODE: your-password
    depends_on: [db]

  scheduler:
    build: .
    command: ["bun", "scheduler"]
    ports: ["3001:3001"]
    environment:
      DATABASE_URL: postgres://postgres:password@db:5432/pongo
      SCHEDULER_PORT: "3001"
      SCHEDULER_MAX_CONCURRENCY: "20"
      SCHEDULER_MAX_RETRIES: "5"
      SCHEDULER_RETRY_DELAY_MS: "5000"
      PONGO_REGION: us-east
    depends_on: [db]

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: pongo
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

Monitoring the Scheduler

Health Checks

Regularly check scheduler health:

curl http://scheduler:3001/health

Integrate with your container orchestration health checks:

docker-compose.yml

scheduler:
  # ...
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:3001/health"]
    interval: 30s
    timeout: 10s
    retries: 3

Monitor State

Check individual monitor state:

curl http://scheduler:3001/monitors/api-health

{
  "id": "api-health",
  "name": "API Health",
  "lastCheck": "2025-12-08T10:00:00Z",
  "lastStatus": "up",
  "consecutiveFailures": 0,
  "consecutiveSuccesses": 42,
  "nextCheck": "2025-12-08T10:01:00Z"
}
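
The `nextCheck` field makes it possible to spot a monitor that has stopped running. Below is a small sketch, assuming the state payload has exactly the fields in the sample; `isOverdue` and its 30-second default grace period are hypothetical choices, not part of the scheduler API.

```typescript
// Shape of the /monitors/:id response, inferred from the sample above.
interface MonitorState {
  id: string;
  name: string;
  lastCheck: string; // ISO 8601 timestamp
  lastStatus: string;
  consecutiveFailures: number;
  consecutiveSuccesses: number;
  nextCheck: string; // ISO 8601 timestamp
}

// Flags a monitor whose scheduled nextCheck is more than graceMs in the past.
function isOverdue(
  state: MonitorState,
  nowMs: number,
  graceMs: number = 30000,
): boolean {
  const dueMs = Date.parse(state.nextCheck);
  if (Number.isNaN(dueMs)) {
    throw new Error(`unparseable nextCheck: ${state.nextCheck}`);
  }
  return nowMs - dueMs > graceMs;
}
```

Called with `Date.now()` as `nowMs`, this returns true for the sample state once the clock passes 10:01:30 UTC without a new check being recorded.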

Troubleshooting

Scheduler Won’t Start

Database Connection: Verify DATABASE_URL is correct
```
bun scheduler  # Check error output
```
Port Conflict: Ensure port 3001 (or your custom SCHEDULER_PORT) is available
```
lsof -i :3001
```
Monitor Configuration: Check for syntax errors in pongo/monitors/
```
bun run lint
```

Monitors Not Running

Check Monitor Registration: Ensure monitors are exported in pongo/monitors/index.ts
Verify Schedule: Check monitor interval or cron expression syntax
Review Logs: Look for execution errors in scheduler output

High Latency

Increase Concurrency: Raise SCHEDULER_MAX_CONCURRENCY if you have many monitors
Database Performance: Check database connection pool and query performance
Monitor Timeouts: Ensure monitor timeout values are appropriate

Failed Checks

Network Access: Verify scheduler can reach monitored endpoints
Retry Configuration: Adjust SCHEDULER_MAX_RETRIES and SCHEDULER_RETRY_DELAY_MS
Monitor Handler Errors: Review monitor handler code for bugs

Database Sharing: All scheduler instances must share the same database. This ensures consistent state across regions.

Gradual Rollout: When deploying multi-region schedulers, start with one region and verify it’s working before adding more regions.
