
Overview

The Schedule API configures automated surveillance scraping using CRON expressions. Each scope (national/external) maintains its own independent schedule.

Base Path: /api/v1/surveillance

Endpoints

Get Schedule

Retrieve the current scraping schedule for a scope.
GET /api/v1/surveillance/schedule?scope={scope}
Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scope | string | No | national or external (default: national) |
Response:
{
  "cron": "0 9 * * MON",
  "next_run_iso": "2024-03-11T14:00:00Z"
}
Schema: ScheduleResponse (surveillance.py:24-26)
| Field | Type | Description |
| --- | --- | --- |
| cron | string \| null | CRON expression (null if disabled) |
| next_run_iso | string \| null | Next scheduled run (ISO 8601 UTC) |
Example:
curl -X GET "https://api.vigia.app/api/v1/surveillance/schedule?scope=external" \
  -H "Authorization: Bearer <token>"

Set Schedule

Configure or update the scraping schedule for a scope.
POST /api/v1/surveillance/schedule?scope={scope}
Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scope | string | No | national or external (default: national) |
Request Body:
{
  "cron": "0 9 * * MON,THU",
  "next_run_iso": "2024-03-07T14:00:00Z"
}
Schema: ScheduleSet (surveillance.py:20-22)
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| cron | string \| null | No | CRON expression (null to disable) |
| next_run_iso | string \| null | No | Override next run time (ISO 8601) |
If next_run_iso is not provided and a valid cron is set, the system automatically calculates the next run time based on the CRON expression.
Response:
{
  "cron": "0 9 * * MON,THU",
  "next_run_iso": "2024-03-07T14:00:00Z"
}
Example:
curl -X POST "https://api.vigia.app/api/v1/surveillance/schedule?scope=external" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "cron": "0 9 * * MON,THU",
    "next_run_iso": "2024-03-07T14:00:00Z"
  }'

CRON Expressions

Format

VIGIA uses standard CRON format with 5 fields:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6) (Sunday=0 or 7)
│ │ │ │ │
│ │ │ │ │
* * * * *

Common Patterns

| CRON Expression | Description | Use Case |
| --- | --- | --- |
| 0 9 * * MON | Every Monday at 9:00 AM | Weekly national monitoring |
| 0 9 * * MON,THU | Monday and Thursday at 9:00 AM | Twice-weekly external monitoring |
| 0 */6 * * * | Every 6 hours | High-frequency monitoring |
| 0 9 1 * * | First day of month at 9:00 AM | Monthly regulatory review |
| 30 8 * * 1-5 | Weekdays at 8:30 AM | Daily business-day monitoring |
| 0 0 * * SUN | Every Sunday at midnight | Weekend batch processing |
| 0 9,15 * * * | Daily at 9:00 AM and 3:00 PM | Twice-daily checks |

Examples

{
  "cron": "0 9 * * MON",
  "next_run_iso": null  // Auto-calculated
}
Execution: Every Monday at 9:00 AM (America/Lima time)
Use Case: Monitor DIGEMID alerts weekly
{
  "cron": "0 9 * * MON,THU",
  "next_run_iso": "2024-03-07T14:00:00Z"
}
Execution: Monday and Thursday at 9:00 AM (America/Lima time)
Use Case: Monitor FDA, EMA twice per week
{
  "cron": "0 */4 * * *",
  "next_run_iso": null
}
Execution: Every 4 hours
Use Case: Critical product surveillance during a safety crisis
{
  "cron": "0 9 1 * *",
  "next_run_iso": "2024-04-01T14:00:00Z"
}
Execution: First day of each month at 9:00 AM
Use Case: Monthly summary for IPS report preparation
{
  "cron": null,
  "next_run_iso": null
}
Execution: Disabled (manual only)
Use Case: Switch to manual-only scraping

Timezone Handling

Critical: All CRON schedules execute in America/Lima timezone (UTC-5).

Execution Time

  • CRON Expression: Evaluated in America/Lima (UTC-5)
  • next_run_iso: Stored and returned in UTC
  • API Response: Always ISO 8601 with UTC timezone
Example:
{
  "cron": "0 9 * * MON",         // 9:00 AM Lima time
  "next_run_iso": "2024-03-11T14:00:00Z"  // 14:00 UTC (9:00 Lima)
}
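The Lima-to-UTC relationship can be verified with the standard library (a sketch, not VIGIA code):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# 9:00 AM Monday in Lima -- the moment "0 9 * * MON" fires.
# America/Lima is UTC-5 year-round (no DST).
lima_run = datetime(2024, 3, 11, 9, 0, tzinfo=ZoneInfo("America/Lima"))

# The API stores and returns the same instant in UTC
utc_run = lima_run.astimezone(timezone.utc)
print(utc_run.isoformat())  # 2024-03-11T14:00:00+00:00
```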

Setting next_run_iso

You can provide next_run_iso in any timezone—it will be converted to UTC:
// Lima time (UTC-5)
{
  "cron": "0 9 * * MON",
  "next_run_iso": "2024-03-11T09:00:00-05:00"
}

// Converted to UTC in response
{
  "cron": "0 9 * * MON",
  "next_run_iso": "2024-03-11T14:00:00Z"
}

Automatic Calculation

If next_run_iso is not provided or is null, the system automatically calculates it using croniter:
# Implementation: surveillance.py:207-213
if (not next_iso) and s.cron:
    now_lima = datetime.now(ZoneInfo("America/Lima"))
    it = croniter(s.cron, now_lima)
    nxt = it.get_next(datetime)
    next_iso = nxt.astimezone(timezone.utc).isoformat()
    s.next_run_iso = next_iso
    db.commit()
When to Override:
  • Delay execution: Push next run to a future date
  • Immediate run: Set to current time for urgent scraping
  • Skip execution: Set to far future date to temporarily disable
Example - Delay by One Week:
from datetime import datetime, timedelta, timezone

next_week = datetime.now(timezone.utc) + timedelta(days=7)

payload = {
    "cron": "0 9 * * MON",
    "next_run_iso": next_week.isoformat()
}

Scheduler Job

Scheduled jobs execute via APScheduler background worker: Implementation: backend/app/jobs/surveillance_scheduler.py

Execution Flow

1. Trigger Time: APScheduler triggers the job when next_run_iso is reached
2. Scrape Sources: the system scrapes all enabled sources for the scope
3. Store Results: new surveillance items are inserted into the database
4. Calculate Next Run: next_run_iso is updated based on the CRON expression
5. Log Execution: the last_run_at timestamp is recorded in the SurveillanceSchedule model

Debug Scheduler Jobs

GET /api/v1/surveillance/scheduler/jobs?scope=external
Response:
{
  "scheduler": "ok",
  "scope": "external",
  "jobs": {
    "surveillance_external": {
      "trigger": "cron[day_of_week='mon,thu', hour='9', minute='0']",
      "next_run_time": "2024-03-07T14:00:00+00:00"
    }
  }
}
Implementation: surveillance_schedule.py:31-45

Manual Execution

Trigger immediate scraping without affecting the schedule:
POST /api/v1/surveillance/run?scope={scope}
Response:
{
  "ok": true,
  "results": {
    "source-uuid-1": {
      "items_found": 15,
      "items_new": 3,
      "source": "FDA Drug Safety"
    },
    "source-uuid-2": {
      "items_found": 8,
      "items_new": 0,
      "source": "EMA Safety Updates"
    }
  }
}
Manual execution via /run does not update next_run_iso or last_run_at. It runs independently of the schedule.
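A small client-side helper can total the per-source counts in a /run response (summarize_run is a hypothetical name, not part of the API):

```python
def summarize_run(response: dict) -> dict:
    """Aggregate totals from a /run response."""
    results = response.get("results", {})
    return {
        "sources": len(results),
        "items_found": sum(r["items_found"] for r in results.values()),
        "items_new": sum(r["items_new"] for r in results.values()),
    }

# Using the sample response above:
run_response = {
    "ok": True,
    "results": {
        "source-uuid-1": {"items_found": 15, "items_new": 3, "source": "FDA Drug Safety"},
        "source-uuid-2": {"items_found": 8, "items_new": 0, "source": "EMA Safety Updates"},
    },
}
print(summarize_run(run_response))  # {'sources': 2, 'items_found': 23, 'items_new': 3}
```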

Best Practices

Frequency Recommendations

National Scope

Recommended: Weekly (Monday)
DIGEMID updates are less frequent. Weekly monitoring is sufficient for most cases.
{"cron": "0 9 * * MON"}

External Scope

Recommended: Twice weekly (Monday, Thursday)
FDA/EMA publish frequent updates. Monitor 2-3 times per week.
{"cron": "0 9 * * MON,THU"}

Avoid Over-Scraping

Do not set high-frequency scraping (e.g., hourly) for external sources:
  • Increases risk of IP blocking by regulatory websites
  • Generates duplicate data (most sources update daily)
  • Wastes server resources
  • May violate terms of service
Exception: During safety crisis or product recall, temporary high-frequency monitoring (every 4-6 hours) may be justified.

Schedule Coordination

Coordinate schedules with downstream processes:
// Scrape Monday at 9 AM
{"cron": "0 9 * * MON"}

// Send reports Monday at 10 AM (after scraping completes)
// Configure in /send-reports recipient list

Testing New Schedules

  1. Test CRON Expression:
    from croniter import croniter
    from datetime import datetime
    from zoneinfo import ZoneInfo
    
    now = datetime.now(ZoneInfo("America/Lima"))
    cron = croniter("0 9 * * MON", now)
    
    # Get next 5 executions
    for _ in range(5):
        print(cron.get_next(datetime))
    
  2. Set future next_run_iso:
    {
      "cron": "0 9 * * MON",
      "next_run_iso": "2024-04-01T14:00:00Z"  // First run in future
    }
    
  3. Monitor with /scheduler/jobs:
    curl -X GET "https://api.vigia.app/api/v1/surveillance/scheduler/jobs?scope=external" \
      -H "Authorization: Bearer <token>"
    

Error Handling

Invalid CRON Expression

// Request
{
  "cron": "0 25 * * MON"  // Invalid hour (25)
}

// Response: 400 Bad Request
{
  "detail": "Invalid CRON expression"
}
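Obviously out-of-range values can be caught before calling the API with a rough client-side check. This is a sketch, not the server's validator; month names (JAN, FEB, ...) are not handled, and the server remains the source of truth:

```python
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]  # minute, hour, dom, month, dow
DOW_NAMES = {"SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"}

def quick_cron_check(expr: str) -> bool:
    """Rough sanity check for a 5-field CRON expression (not a full parser)."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    for (lo, hi), field in zip(FIELD_RANGES, fields):
        for part in field.split(","):
            part = part.split("/")[0]        # drop step: */6 -> *
            for token in part.split("-"):    # expand ranges: 1-5 -> 1, 5
                if token == "*" or token.upper() in DOW_NAMES:
                    continue
                if not token.isdigit() or not lo <= int(token) <= hi:
                    return False
    return True

print(quick_cron_check("0 25 * * MON"))  # False -- hour 25 is out of range
```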

Missing Schedule

If no schedule exists for a scope, GET /schedule creates a default disabled schedule:
{
  "cron": null,
  "next_run_iso": null
}
Implementation: surveillance.py:202-205

Integration Examples

Scrape and Email Weekly Report

Combine scheduled scraping with automated reporting:
import requests
from datetime import datetime, timedelta, timezone

token = "<your_jwt_token>"
base_url = "https://api.vigia.app/api/v1/surveillance"

# 1. Configure weekly scraping (Monday 9 AM)
requests.post(
    f"{base_url}/schedule",
    params={"scope": "external"},
    headers={"Authorization": f"Bearer {token}"},
    json={"cron": "0 9 * * MON"}
)

# 2. After scraping completes, send report
# (Schedule this 1 hour after scraping)
last_monday = datetime.now(timezone.utc) - timedelta(days=7)
this_monday = datetime.now(timezone.utc)

requests.post(
    f"{base_url}/send-reports",
    params={"scope": "external"},
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    },
    json={
        "recipients": ["[email protected]"],
        "formats": {"inline": True, "pdf": True},
        "filters": {
            "date_from": last_monday.isoformat(),
            "date_to": this_monday.isoformat()
        }
    }
)

Dynamic Schedule Adjustment

Adjust frequency based on surveillance volume:
import requests

def adjust_schedule_based_on_alerts(token, scope="external"):
    """Increase scraping frequency if high-severity alerts detected."""
    base_url = "https://api.vigia.app/api/v1/surveillance"
    
    # Check recent high-severity alerts
    response = requests.get(
        f"{base_url}/results",
        params={
            "scope": scope,
            "severity": "Alta",
            "limit": 10
        },
        headers={"Authorization": f"Bearer {token}"}
    )
    
    results = response.json()
    alert_count = results["total"]
    
    if alert_count > 5:
        # High alert volume: increase to daily
        new_cron = "0 9 * * *"
        print("High alert volume detected. Increasing to daily scraping.")
    else:
        # Normal volume: bi-weekly
        new_cron = "0 9 * * MON,THU"
        print("Normal alert volume. Setting bi-weekly scraping.")
    
    # Update schedule
    requests.post(
        f"{base_url}/schedule",
        params={"scope": scope},
        headers={"Authorization": f"Bearer {token}"},
        json={"cron": new_cron}
    )

Related Pages

  • Data Sources: configure which sources to scrape
  • Results: query scraped surveillance items
  • Reports: generate and email reports

Code References

| Component | File Location |
| --- | --- |
| Schedule Endpoints | backend/app/routers/surveillance.py:197-226 |
| Schedule Schema | backend/app/schemas/surveillance.py:20-26 |
| Scheduler Job | backend/app/jobs/surveillance_scheduler.py |
| CRON Parser | croniter library |
| Alternate Scheduler Router | backend/app/routers/surveillance_schedule.py:47-80 |
