
Overview

The ingestion API triggers TechCal’s automated event collection pipeline. It fetches events from configured sources (RSS feeds, APIs, ICS calendars, HTML scrapers), normalizes data, and queues events for moderation.
Admin-only endpoint. Requires is_admin=true in user profile.

Authentication

Requires both:
  1. Valid Supabase authentication token
  2. Admin user privileges (profiles.is_admin = true)
Authorization: Bearer <supabase_access_token>

Request

HTTP Method

POST /api/admin/ingestion/run

Headers

Content-Type
string
required
Must be application/json

Body Parameters

sourceId
string
UUID of specific ingestion source to run. If omitted, runs all active sources.
limit
number
default: 100
Maximum number of events to normalize after ingestion

Response

Success Response (200)

success
boolean
required
Always true for successful ingestion runs
message
string
required
Human-readable status message
summary
object
required
timestamp
string
required
ISO 8601 timestamp of ingestion completion
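
The success fields above can be captured as a type with a small runtime check. This is a minimal sketch, not the service's own code: the names `IngestionRunResponse` and `isIngestionRunResponse` are hypothetical, and `summary` is left opaque because its inner fields are not documented here.

```typescript
// Hypothetical type mirroring the documented success response fields.
interface IngestionRunResponse {
  success: boolean;
  message: string;
  summary: Record<string, unknown>; // inner shape not documented; kept opaque
  timestamp: string; // ISO 8601 timestamp of ingestion completion
}

// Runtime guard: returns true only if every documented field is present
// with the documented type and the timestamp is a parseable date.
function isIngestionRunResponse(value: unknown): value is IngestionRunResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const timestamp = v["timestamp"];
  return (
    typeof v["success"] === "boolean" &&
    typeof v["message"] === "string" &&
    typeof v["summary"] === "object" &&
    v["summary"] !== null &&
    typeof timestamp === "string" &&
    !Number.isNaN(Date.parse(timestamp))
  );
}
```

A guard like this lets a client tell a success body apart from an error body such as `{ "error": "Unauthorized" }` before reading `summary`.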

Error Responses

{
  "error": "Unauthorized"
}

Examples

Run All Sources

curl -X POST https://kurecal.app/api/admin/ingestion/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <admin_token>" \
  -d '{}'

Run Specific Source

curl -X POST https://kurecal.app/api/admin/ingestion/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <admin_token>" \
  -d '{
    "sourceId": "550e8400-e29b-41d4-a716-446655440000",
    "limit": 50
  }'
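
The same two calls can be made from TypeScript. The sketch below only builds the request so it can be inspected or tested without network access; `buildIngestionRequest` and its option type are hypothetical names, while the path, headers, and body fields follow the curl examples above.

```typescript
// Hypothetical option type; field names follow the documented body parameters.
interface RunIngestionOptions {
  sourceId?: string; // omit to run all active sources
  limit?: number; // the server is documented to default this to 100
}

interface IngestionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildIngestionRequest(
  baseUrl: string,
  adminToken: string,
  options: RunIngestionOptions = {},
): IngestionRequest {
  // Include only the parameters that were set, so the default body is "{}".
  const payload: Record<string, string | number> = {};
  if (options.sourceId !== undefined) payload["sourceId"] = options.sourceId;
  if (options.limit !== undefined) payload["limit"] = options.limit;

  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/admin/ingestion/run`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${adminToken}`,
    },
    body: JSON.stringify(payload),
  };
}
```

To send it, destructure the result and pass it to `fetch`: `const { url, ...init } = buildIngestionRequest(base, token, { limit: 50 }); const res = await fetch(url, init);`.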

Ingestion Pipeline

Pipeline Stages

  1. Source Selection
    • If sourceId provided: Fetch single source
    • Otherwise: Fetch all sources where is_active=true
  2. Interval Check
    • Skip sources fetched within their fetch_interval_minutes
    • Manual triggers bypass this check (force fetch)
  3. Data Collection
    • RSS: Parse XML feed
    • API: Call REST endpoint
    • ICS: Parse iCalendar file
    • HTML: Scrape event listings
  4. Deduplication
    • Check for existing events by URL
    • Use fuzzy matching for title/date similarity
    • Skip duplicates, queue new events
  5. Normalization
    • Parse dates to ISO 8601
    • Extract location and format
    • Normalize tags and categories
    • Calculate quality score
  6. Quality Control
    • Auto-publish: Score ≥ 75% and verified organizer → Publish immediately
    • Moderation queue: 50% ≤ Score < 75% (or Score ≥ 75% without a verified organizer) → Requires review
    • Auto-reject: Score < 50% → Rejected as low quality
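
Stages 1 and 2 can be sketched as a pure function. This is an illustration under assumptions rather than the pipeline's actual code: `last_fetched_at` is a hypothetical column name, and `force` stands in for the manual-trigger bypass described in stage 2.

```typescript
interface IngestionSource {
  id: string;
  is_active: boolean;
  fetch_interval_minutes: number;
  last_fetched_at: string | null; // hypothetical column; null means never fetched
}

// Stage 2: a source is due if forced, never fetched, or its interval has elapsed.
function isDueForFetch(source: IngestionSource, now: Date, force = false): boolean {
  if (force || source.last_fetched_at === null) return true;
  const elapsedMs = now.getTime() - new Date(source.last_fetched_at).getTime();
  return elapsedMs >= source.fetch_interval_minutes * 60 * 1000;
}

// Stage 1 + 2: pick a single source by ID, or all active sources,
// then drop the ones that were fetched too recently.
function selectSources(
  sources: IngestionSource[],
  now: Date,
  opts: { sourceId?: string; force?: boolean } = {},
): IngestionSource[] {
  const candidates =
    opts.sourceId !== undefined
      ? sources.filter((s) => s.id === opts.sourceId)
      : sources.filter((s) => s.is_active);
  return candidates.filter((s) => isDueForFetch(s, now, opts.force ?? false));
}
```

Keeping the clock as a parameter (`now`) makes the interval logic easy to test without waiting for real time to pass.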

Quality Scoring

Events receive a quality score (0-100) based on:
| Factor | Weight | Criteria |
| --- | --- | --- |
| Title Quality | 20% | Length, clarity, no spam patterns |
| Description Quality | 20% | Length, formatting, detail level |
| Date Validity | 15% | Valid dates, reasonable duration |
| Location Data | 15% | Complete location or virtual indicator |
| Speaker Info | 10% | Speaker names and URLs present |
| Organizer Info | 10% | Reputable organizer |
| Image Quality | 5% | High-res image present |
| Tag Relevance | 5% | Relevant tech tags |
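
The weights above sum to 100, so the score can be read as a weighted sum of per-factor ratings. The sketch below assumes each factor is rated between 0 and 1; how the real pipeline rates individual factors is not documented here, and the names `QUALITY_WEIGHTS` and `qualityScore` are hypothetical.

```typescript
// Weights copied from the factor table; they add up to 100.
const QUALITY_WEIGHTS = {
  title: 20,
  description: 20,
  date: 15,
  location: 15,
  speaker: 10,
  organizer: 10,
  image: 5,
  tags: 5,
} as const;

type QualityFactor = keyof typeof QUALITY_WEIGHTS;

// Assumption: each factor is rated from 0 (absent/poor) to 1 (fully satisfied).
type FactorRatings = Record<QualityFactor, number>;

function qualityScore(ratings: FactorRatings): number {
  let total = 0;
  for (const factor of Object.keys(QUALITY_WEIGHTS) as QualityFactor[]) {
    // Clamp so an out-of-range rating cannot push the score outside 0-100.
    const rating = Math.min(1, Math.max(0, ratings[factor]));
    total += QUALITY_WEIGHTS[factor] * rating;
  }
  return Math.round(total);
}
```

Under this reading, an event that is perfect except for missing speaker info and a missing image scores 85, which is still above the 75 threshold.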

Supported Source Types

Configuration (RSS source):
{
  "type": "rss",
  "url": "https://example.com/events.rss",
  "fetch_interval_minutes": 60
}
Extraction:
  • Title: <title>
  • Description: <description>
  • URL: <link>
  • Date: <pubDate> or <dc:date>
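
The field mapping above can be illustrated with a small extractor for a single `<item>` element. This is a deliberately simplified sketch that uses regular expressions instead of a real XML parser, so it ignores namespaces, entities, and nested elements of the same name; `extractTag` and `parseRssItem` are hypothetical names.

```typescript
interface RawRssEvent {
  title: string | null;
  description: string | null;
  url: string | null;
  date: string | null;
}

// Return the text inside the first <tag>...</tag> pair, or null if absent.
// A surrounding CDATA wrapper is unwrapped; markup inside it is kept as-is.
function extractTag(xml: string, tag: string): string | null {
  const pattern = new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`, "i");
  const match = pattern.exec(xml);
  if (!match) return null;
  const inner = match[1] ?? "";
  return inner.replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, "$1").trim();
}

// Map one RSS <item> to the fields listed above.
function parseRssItem(itemXml: string): RawRssEvent {
  return {
    title: extractTag(itemXml, "title"),
    description: extractTag(itemXml, "description"),
    url: extractTag(itemXml, "link"),
    // Prefer <pubDate>; fall back to <dc:date> when it is missing.
    date: extractTag(itemXml, "pubDate") ?? extractTag(itemXml, "dc:date"),
  };
}
```

For production feeds a proper XML parser is the safer choice; the point here is only the tag-to-field mapping and the `<pubDate>` to `<dc:date>` fallback.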

Cron Setup

Vercel Cron Configuration

Configured in vercel.json:
{
  "crons": [
    {
      "path": "/api/admin/ingestion/cron",
      "schedule": "0 * * * *"
    }
  ]
}

Cron Authentication

The cron endpoint requires a secret header:
Authorization: Bearer <CRON_SECRET>
Generate secret:
openssl rand -hex 32
Set in environment:
CRON_SECRET=your_generated_secret
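
A handler for the cron route might verify the header along these lines. This is a sketch under assumptions: the function names are hypothetical, the secret is passed in rather than read from the environment so the check stays testable, and the comparison loop is a simple way to avoid an early exit on the first mismatching character.

```typescript
// Compare two strings without returning early on the first difference.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Returns true only when the Authorization header is exactly "Bearer <secret>".
function isAuthorizedCronRequest(
  authorizationHeader: string | null | undefined,
  cronSecret: string | undefined,
): boolean {
  // Fail closed: an unset or empty secret must never authorize anything.
  if (!cronSecret || !authorizationHeader) return false;
  return constantTimeEqual(authorizationHeader, `Bearer ${cronSecret}`);
}
```

The fail-closed branch matters: without it, a deployment with a missing `CRON_SECRET` would compare against the literal string `Bearer undefined`.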

Manual vs Cron Execution

| Aspect | Manual (/run) | Cron (/cron) |
| --- | --- | --- |
| Auth | Admin user token | CRON_SECRET header |
| Interval check | Bypassed (force) | Enforced |
| Error handling | Returns 500 | Logs to Sentry |
| Timeout | 5 minutes | 60 seconds |

Moderation Workflow

Moderation Queue

Events requiring review:
SELECT * FROM ingestion_queue
WHERE status = 'pending'
  AND quality_score < 75
ORDER BY quality_score DESC, created_at DESC;

Admin Dashboard

Manage queued events at /admin/ingestion/moderation:
  • Approve: Publish event immediately
  • Reject: Mark as spam/duplicate
  • Edit & Approve: Fix issues then publish
  • Bulk Actions: Approve/reject multiple events

Auto-Moderation Rules

Certain events are auto-moderated:
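// High score from a verified organizer: publish without review.
if (event.quality_score >= 75 && event.organizer_reputation === 'verified') {
  await publishEvent(event);
// Very low score: reject outright.
} else if (event.quality_score < 50) {
  await rejectEvent(event, 'Low quality score');
// Everything else, including high scores from unverified organizers, needs review.
} else {
  await queueForModeration(event);
}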

Troubleshooting

Common Issues

“Service role credentials not configured”
  • Set SUPABASE_SERVICE_ROLE_KEY environment variable
  • Restart application after setting
“Admin access required”
  • Run: UPDATE profiles SET is_admin = TRUE WHERE id = '<your_user_id>'
  • Get user ID from Supabase Dashboard → Authentication → Users
“Source fetch interval not elapsed”
  • Manual triggers bypass this automatically
  • Reduce fetch_interval_minutes in source config
“Normalization failed: Invalid date format”
  • Check source date format matches expected pattern
  • Add custom date parser in normalization config
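
A custom date parser for the last issue could look like the sketch below. How parsers are registered in the normalization config is not documented here, so this only shows the parsing side: a hypothetical chain of parsers tried in order, with an example parser for `DD.MM.YYYY` strings (interpreted as UTC midnight) placed ahead of the built-in `Date.parse` fallback.

```typescript
// A parser returns a Date on success or null if it does not recognize the input.
type DateParser = (raw: string) => Date | null;

// Fallback: whatever the JavaScript runtime can parse (ISO 8601, RFC 2822 style).
const builtInParser: DateParser = (raw) => {
  const ms = Date.parse(raw);
  return Number.isNaN(ms) ? null : new Date(ms);
};

// Example custom parser for "DD.MM.YYYY", interpreted as UTC midnight.
const dayMonthYearParser: DateParser = (raw) => {
  const m = /^(\d{2})\.(\d{2})\.(\d{4})$/.exec(raw);
  if (!m) return null;
  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = Number(m[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid days into the next month; treat that as a failure.
  return date.getUTCMonth() === month - 1 && date.getUTCDate() === day ? date : null;
};

// Try each parser in order and return the first result as ISO 8601, else null.
function normalizeDate(
  raw: string,
  parsers: DateParser[] = [dayMonthYearParser, builtInParser],
): string | null {
  const trimmed = raw.trim();
  for (const parse of parsers) {
    const date = parse(trimmed);
    if (date) return date.toISOString();
  }
  return null;
}
```

Putting source-specific parsers first keeps ambiguous formats from being guessed by the runtime's more lenient fallback.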

Debug Mode

Enable detailed logging:
NEXT_PUBLIC_LOG_LEVEL=debug
Logs include:
  • Source fetch URLs and responses
  • Deduplication match details
  • Normalization field mappings
  • Quality score calculations
