System Overview
Adapt is a web cache warming service built in Go, designed for Webflow sites and other web applications. It uses a worker pool architecture for efficient URL crawling and cache warming, with a focus on reliability, performance, and observability.
Core Components
Worker Pool System
The worker pool is the heart of Adapt’s concurrent processing system:
- Concurrent Processing - Multiple workers process tasks simultaneously using PostgreSQL’s FOR UPDATE SKIP LOCKED
- Job Management - Jobs are broken down into individual URL tasks and distributed across workers
- Recovery System - Automatic recovery of stalled or failed tasks with exponential backoff
- Task Monitoring - Real-time monitoring of task progress and status
Database Layer (PostgreSQL)
Adapt uses PostgreSQL with Supabase:
- Normalised Schema - Separate tables for domains, pages, jobs, and tasks to reduce redundancy
- Row-Level Locking - Uses FOR UPDATE SKIP LOCKED for efficient concurrent task acquisition
- Connection Pooling - Optimised pool settings (45 max open, 18 max idle connections)
- Data Integrity - Maintains job history, statistics, and task relationships
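The documented pool sizing maps directly onto Go's database/sql knobs. A sketch (the DSN and connection lifetime are placeholders; a real build would also import a Postgres driver such as lib/pq or pgx for side-effect registration):

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// openPool applies the documented pool sizing to a database handle.
func openPool(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(45)                  // 45 max open connections
	db.SetMaxIdleConns(18)                  // 18 max idle connections
	db.SetConnMaxLifetime(30 * time.Minute) // illustrative; tune per deployment
	return db, nil
}

func main() {
	if _, err := openPool("postgres://user:pass@localhost/adapt"); err != nil {
		// Without a registered driver import this sketch errors here.
		fmt.Println("open failed:", err)
	}
}
```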
API Layer
RESTful API with standard practices:
- RESTful Design - /v1/* endpoints with standardised responses and error handling
- Authentication - JWT-based auth with Supabase Auth integration
- Middleware Stack - CORS, logging, rate limiting, request tracking
- Request IDs - Every request tracked with unique identifier
Crawler System
Efficient URL crawling and cache validation:
- Concurrent URL Processing - Configurable concurrency with rate limiting
- Cache Validation - Monitors cache status and performance metrics
- Response Tracking - Records response times, status codes, and cache hits
- Link Discovery - Optional extraction of additional URLs from crawled pages
Technical Concepts
Jobs and Tasks
Job
A collection of URLs from a single domain to be crawled
- Contains metadata: domain, user/organisation, concurrency settings
- Tracks progress: total/completed/failed task counts
- Has lifecycle: pending → running → completed/cancelled
Task
Individual URL processing unit within a job
- References a specific page within the job’s domain
- Tracks execution: status, timing, response metrics, errors
- Can be: pending → running → completed/failed/skipped
Worker
Process that executes tasks concurrently
- Claims tasks atomically using database locking
- Handles retries and error reporting
- Updates task and job progress
Job Lifecycle
Job Creation
- Validate domain and create domain/page records
- Insert job with pending status
- Optionally process sitemap or create root task
Job Start
- Update status to running
- Reset any stalled tasks from previous runs
- Add job to worker pool for processing
Task Processing
- Workers claim pending tasks atomically
- Crawl URLs with retry logic and rate limiting
- Store results and update task status
- Update job progress counters
Job Completion
- Automatic detection when all tasks finished
- Calculate final statistics
- Mark job as completed with timestamp
Codebase Structure
Architectural Principles
Adapt follows focused, testable function design:
- Function Size - Functions kept under 50 lines where possible
- Single Responsibility - Each function has one clear purpose
- Testing - Strategic test coverage for critical paths and complex logic
- Extract + Test + Commit - Proven methodology for safe refactoring
Package Organisation
System Monitoring
Sentry Integration Strategy
Adapt uses Sentry for both error tracking and performance monitoring:
- Job creation, start, and cancellation failures
- Worker startup failures and task status update failures
- Transaction failures and stuck job cleanup failures
- Database connection and server startup/shutdown failures
Monitored transactions:
- manager.create_job, manager.start_job, manager.cancel_job - Job operations
- manager.get_job, manager.get_job_status - Job queries
- manager.process_sitemap - Sitemap processing
- db.cleanup_stuck_jobs, db.create_page_records - Database operations
Health Monitoring
- Database Health - Connection status and query performance
- Worker Status - Active worker count and task processing rates
- Job Progress - Real-time completion tracking and statistics
- API Performance - Request timing and error rates
Frontend Integration
Template + Data Binding System
Adapt uses a template-based approach with attribute-based event handling:
- Users control all HTML structure and CSS styling
- No CSS conflicts with existing designs
- Works with any frontend framework (Webflow, custom sites)
- Lightweight JavaScript library (~50KB)
- Complete form handling with validation and authentication
- Real-time data binding with template engine
Security & Authentication
JWT Authentication
- Supabase Auth Integration - Validates JWT tokens from Supabase
- User Context - Extracts user and organisation IDs from tokens
- Protected Endpoints - Requires authentication for job operations
- Row Level Security - PostgreSQL RLS policies for data isolation
Rate Limiting
- IP-Based Limiting - Token bucket algorithm (5 requests/second default)
- Client IP Detection - Supports X-Forwarded-For headers for proxies
- Crawler Rate Limiting - Configurable delays between URL requests
- Concurrency Controls - Per-job worker limits
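A token bucket, as used for the IP-based limiting above, refills at a fixed rate and spends one token per request. A minimal sketch (Adapt's actual implementation may differ; the 5/second figure is the documented default):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a minimal token-bucket limiter: capacity tokens, refilled at
// rate tokens per second.
type bucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	rate     float64
	last     time.Time
}

func newBucket(rate, capacity float64) *bucket {
	return &bucket{tokens: capacity, capacity: capacity, rate: rate, last: time.Now()}
}

// allow refills based on elapsed time, then spends one token if available.
func (b *bucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	// 5 requests/second with a burst of 5 — the documented default.
	limiters := map[string]*bucket{} // keyed by client IP in a real server
	ip := "203.0.113.7"
	limiters[ip] = newBucket(5, 5)
	for i := 0; i < 7; i++ {
		fmt.Println(i, limiters[ip].allow())
	}
}
```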
Request Security
- Input Validation - URL and parameter sanitisation
- Error Sanitisation - Prevents information leakage
- CORS Configuration - Controlled cross-origin access
- Request Tracking - Unique request IDs for audit trails
Deployment Architecture
Infrastructure
- Hosting - Fly.io with auto-scaling
- Database - PostgreSQL with connection pooling (Supabase)
- CDN - Cloudflare for caching and protection
- Monitoring - Sentry (errors), Grafana Cloud (traces), Codecov (coverage)
- Authentication - Supabase Auth with custom domain
- Real-time - Supabase Realtime for live job progress updates
- Storage - Supabase Storage (hot) + Cloudflare R2 (cold archive)
Data Storage Strategy
Hot Storage
Supabase Storage for recent and frequently accessed files:
- Temporary assets
- Crawler logs for active jobs
- Recent HTML page captures for debugging
- Fast access for day-to-day operations
Cold Storage
Cloudflare R2 for long-term archival:
- Historical data older than 30-90 days
- Automated Go background job moves data from hot to cold
- Significantly lower storage costs with no egress fees
- Ideal for large volumes of infrequently accessed data
Performance Optimisation
Database Optimisations
- Connection Pooling - 45 max open, 18 max idle connections
- Query Optimisation - Indexed queries and efficient joins
- Batch Operations - Reduce individual database calls
- Lock-Free Task Claiming - FOR UPDATE SKIP LOCKED prevents contention
Crawler Optimisations
- Concurrent Processing - Multiple workers process URLs simultaneously
- Connection Reuse - HTTP client connection pooling
- Rate Limiting - Prevents overwhelming target servers
- Response Streaming - Efficient memory usage for large responses
Memory Management
- Resource Cleanup - Proper goroutine and connection cleanup
- Buffer Management - Controlled memory allocation
- Garbage Collection - Optimised for low-latency operations
Supabase Integration
Real-time Features
Uses Postgres Changes subscriptions via Supabase Realtime.
Implemented:
- ✅ Notification Badge - Real-time updates when jobs complete (v0.20.0)
  - Postgres Changes subscription on notifications table
  - WebSocket CSP configured for wss://adapt.auth.goodnative.co
  - 200ms query delay to avoid transaction visibility race condition
Planned:
- Live Job Progress - Postgres Changes on jobs table for instant updates
- Dashboard Stats - Real-time totals without page refresh
- Team Presence - Live indicators for multi-user organisations
Future Enhancements
- Database Functions (Stage 5) - Move CPU-intensive queries to PostgreSQL functions
- Edge Functions (Stage 6+) - Handle webhooks and scheduled jobs
- File Storage (Stage 5) - Store crawler logs and screenshots
- Enhanced RLS (Stage 6) - Replace Go auth middleware with database-level policies
Recent Refactoring Success
5 monster functions eliminated:
- getJobTasks: 216 → 56 lines (74% reduction)
- CreateJob: 232 → 42 lines (82% reduction)
- setupJobURLDiscovery: 108 → 17 lines (84% reduction)
- setupSchema: 216 → 27 lines (87% reduction)
- WarmURL: 377 → 68 lines (82% reduction)