Skip to main content

Overview

Schedulers enable automatic recurring job execution at specified intervals (6, 12, 24, or 48 hours). Each scheduler continuously monitors a domain and creates jobs at regular intervals.

Create Scheduler

POST /v1/schedulers
Authorization: Bearer <token>
Content-Type: application/json

{
  "domain": "example.com",
  "schedule_interval_hours": 24,
  "concurrency": 20,
  "find_links": true,
  "max_pages": 0,
  "include_paths": ["/blog/*", "/products/*"],
  "exclude_paths": ["/admin/*"],
  "required_workers": 1
}

Request Body

domain
string
required
The domain to crawl (e.g., “example.com”)
schedule_interval_hours
integer
required
Interval between job runs. Must be one of: 6, 12, 24, 48
concurrency
integer
default:20
Number of concurrent requests (max 100)
Whether to discover pages by crawling internal links
max_pages
integer
default:0
Maximum number of pages to crawl per job (0 = unlimited)
include_paths
array
Array of path patterns to include (supports wildcards, e.g., “/blog/*”)
exclude_paths
array
Array of path patterns to exclude (supports wildcards, e.g., “/admin/*”)
is_enabled
boolean
default:true
Whether the scheduler is active

Response Fields

id
string
Unique scheduler identifier
domain
string
The domain being monitored
schedule_interval_hours
integer
Hours between job runs
next_run_at
string
ISO 8601 timestamp of next scheduled job
is_enabled
boolean
Whether the scheduler is currently active
concurrency
integer
Number of concurrent requests for scheduled jobs
Whether to discover pages by crawling links
max_pages
integer
Maximum pages per job (0 = unlimited)
include_paths
array
Path patterns to include
exclude_paths
array
Path patterns to exclude
created_at
string
ISO 8601 timestamp of scheduler creation
updated_at
string
ISO 8601 timestamp of last update

List Schedulers

GET /v1/schedulers
Authorization: Bearer <token>

Response Fields

data
array
Array of scheduler objects for the organisation

Get Scheduler

GET /v1/schedulers/{scheduler_id}
Authorization: Bearer <token>

Path Parameters

scheduler_id
string
required
Unique scheduler identifier (UUID format)

Update Scheduler

PUT /v1/schedulers/{scheduler_id}
Authorization: Bearer <token>
Content-Type: application/json

{
  "schedule_interval_hours": 12,
  "is_enabled": false
}

Path Parameters

scheduler_id
string
required
Unique scheduler identifier

Request Body

All fields are optional. Only provided fields will be updated.
schedule_interval_hours
integer
Update interval. Must be one of: 6, 12, 24, 48
concurrency
integer
Number of concurrent requests (max 100)
Whether to discover pages by crawling links
max_pages
integer
Maximum pages per job (0 = unlimited, cannot be negative)
include_paths
array
Path patterns to include (pass null to clear)
exclude_paths
array
Path patterns to exclude (pass null to clear)
is_enabled
boolean
Enable or disable the scheduler
expected_is_enabled
boolean
Optional optimistic concurrency control. If provided, update will fail with 409 Conflict if current is_enabled state doesn’t match this value.
The domain cannot be changed after scheduler creation. To monitor a different domain, delete this scheduler and create a new one.

Delete Scheduler

DELETE /v1/schedulers/{scheduler_id}
Authorization: Bearer <token>

Path Parameters

scheduler_id
string
required
Unique scheduler identifier
Deleting a scheduler does not delete jobs that have already been created. Historical jobs remain accessible.

List Jobs for Scheduler

GET /v1/schedulers/{scheduler_id}/jobs?limit=10&offset=0
Authorization: Bearer <token>

Path Parameters

scheduler_id
string
required
Unique scheduler identifier

Query Parameters

limit
integer
default:10
Results per page (max 100)
offset
integer
default:0
Number of results to skip

Response Fields

jobs
array
Array of job objects created by this scheduler
jobs[].scheduler_id
string
Parent scheduler identifier
jobs[].source_type
string
Will be “scheduler” for all jobs from this endpoint
pagination
object
Pagination metadata

Use Cases

Monitor E-Commerce Site Daily

{
  "domain": "shop.example.com",
  "schedule_interval_hours": 24,
  "include_paths": ["/products/*"],
  "exclude_paths": ["/checkout/*", "/account/*"]
}

High-Frequency Blog Monitoring

{
  "domain": "blog.example.com",
  "schedule_interval_hours": 6,
  "find_links": true,
  "max_pages": 500
}

Weekly Full Site Crawl

{
  "domain": "example.com",
  "schedule_interval_hours": 48,
  "concurrency": 50,
  "max_pages": 0
}

Pause Scheduler Temporarily

PUT /v1/schedulers/{scheduler_id}

{
  "is_enabled": false
}

Build docs developers (and LLMs) love