Adapt uses PostgreSQL as its primary database, with a normalised schema designed for efficiency and data integrity. The system leverages PostgreSQL-specific features such as `FOR UPDATE SKIP LOCKED`, which lets concurrent workers claim tasks without blocking one another.
As of July 26, 2025, the database schema is managed via migrations. Never run `supabase db push` manually; let the GitHub integration handle it.
```go
// Located in internal/db/db.go
client.SetMaxOpenConns(45)                 // Maximum open connections
client.SetMaxIdleConns(18)                 // Maximum idle connections
client.SetConnMaxLifetime(5 * time.Minute) // Connection lifetime
client.SetConnMaxIdleTime(2 * time.Minute) // Idle connection timeout
```
Stores unique domain names with integer primary keys for normalisation.
```sql
CREATE TABLE domains (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_domains_name ON domains(name);
```
Stores page paths with domain references to reduce redundancy.
```sql
CREATE TABLE pages (
    id SERIAL PRIMARY KEY,
    domain_id INTEGER REFERENCES domains(id),
    path TEXT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(domain_id, path)
);

CREATE INDEX idx_pages_domain_path ON pages(domain_id, path);
```
```sql
CREATE TABLE jobs (
    id TEXT PRIMARY KEY,
    domain_id INTEGER REFERENCES domains(id),
    user_id TEXT,
    organisation_id TEXT,
    status TEXT NOT NULL,
    progress REAL DEFAULT 0.0,
    total_tasks INTEGER DEFAULT 0,
    completed_tasks INTEGER DEFAULT 0,
    failed_tasks INTEGER DEFAULT 0,
    skipped_tasks INTEGER DEFAULT 0,
    found_tasks INTEGER DEFAULT 0,
    sitemap_tasks INTEGER DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    started_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    concurrency INTEGER DEFAULT 1,
    find_links BOOLEAN DEFAULT FALSE,
    max_pages INTEGER DEFAULT 100,
    include_paths TEXT,
    exclude_paths TEXT,
    required_workers INTEGER DEFAULT 1
);

-- Indexes for performance
CREATE INDEX idx_jobs_status ON jobs(status);
CREATE INDEX idx_jobs_user_org ON jobs(user_id, organisation_id);
CREATE INDEX idx_jobs_created_at ON jobs(created_at);
```
Status Values:
`pending` - Job created but not started
`running` - Job actively processing tasks
`completed` - All tasks finished successfully
`cancelled` - Job manually cancelled
`failed` - Job failed due to system error
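The transitions between these statuses can be sketched as a small state machine. This is a hypothetical reconstruction from the descriptions above, not the actual enforcement code, and the allowed-transition map (e.g. whether a pending job can be cancelled) is an assumption:

```go
package main

import "fmt"

// validTransitions is an assumed reading of the status list:
// pending and running are live states; completed, cancelled, and
// failed are terminal and have no outgoing transitions.
var validTransitions = map[string][]string{
	"pending": {"running", "cancelled"},
	"running": {"completed", "cancelled", "failed"},
}

// canTransition reports whether moving a job from one status to
// another is allowed under the sketch above.
func canTransition(from, to string) bool {
	for _, next := range validTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("pending", "running"))   // true
	fmt.Println(canTransition("completed", "running")) // false: terminal state
}
```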
Task Counters:
`total_tasks` = `sitemap_tasks` + `found_tasks`
`sitemap_tasks` - URLs from sitemap processing
`found_tasks` - URLs discovered through link crawling
`completed_tasks` - Successfully processed URLs
`failed_tasks` - URLs that failed processing
`skipped_tasks` - URLs skipped due to limits or filters
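The relationship between these counters and the `progress` column can be sketched as follows. The function mirrors the percentage formula in the job progress update query, with `total_tasks` as the sitemap/found sum; the sample numbers are illustrative:

```go
package main

import "fmt"

// progress mirrors the jobs-table formula:
// (completed + failed) / total * 100, guarding against division by zero.
func progress(completed, failed, total int) float64 {
	if total <= 0 {
		return 0.0 // no tasks registered yet
	}
	return float64(completed+failed) / float64(total) * 100.0
}

func main() {
	sitemapTasks, foundTasks := 120, 44
	totalTasks := sitemapTasks + foundTasks // total_tasks invariant
	fmt.Printf("%.2f%%\n", progress(150, 10, totalTasks))
}
```

Note that skipped tasks do not advance the percentage under this formula; only completed and failed tasks count toward progress.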
```sql
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,
    job_id TEXT REFERENCES jobs(id),
    domain_id INTEGER REFERENCES domains(id),
    page_id INTEGER REFERENCES pages(id),
    status TEXT NOT NULL,
    source_type TEXT,
    source_url TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    started_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    status_code INTEGER,
    response_time INTEGER,
    cache_status TEXT,
    content_type TEXT,
    error TEXT,
    retry_count INTEGER DEFAULT 0
);

-- Indexes for task processing
CREATE INDEX idx_tasks_status ON tasks(status);
CREATE INDEX idx_tasks_job_id ON tasks(job_id);
CREATE INDEX idx_tasks_job_status ON tasks(job_id, status);
CREATE INDEX idx_tasks_pending ON tasks(created_at) WHERE status = 'pending';
```
Extends Supabase auth.users with application-specific data.
```sql
CREATE TABLE users (
    id TEXT PRIMARY KEY, -- Matches Supabase auth.users(id)
    email TEXT NOT NULL UNIQUE,
    full_name TEXT,
    organisation_id TEXT REFERENCES organisations(id),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_organisation ON users(organisation_id);
```
```sql
CREATE TABLE organisations (
    id TEXT PRIMARY KEY DEFAULT gen_random_uuid()::text,
    name TEXT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
```
Atomic progress updates with conditional status changes:
```sql
-- Job progress update (internal/db/queue.go)
-- progress recomputes the counts in a subquery rather than referencing the
-- completed_tasks/failed_tasks columns, because SET expressions evaluate
-- against the row's pre-update values.
UPDATE jobs
SET completed_tasks = (
        SELECT COUNT(*) FROM tasks
        WHERE job_id = $1 AND status = 'completed'
    ),
    failed_tasks = (
        SELECT COUNT(*) FROM tasks
        WHERE job_id = $1 AND status = 'failed'
    ),
    progress = CASE
        WHEN total_tasks > 0 THEN (
            (SELECT COUNT(*) FROM tasks
             WHERE job_id = $1 AND status IN ('completed', 'failed'))::REAL
            / total_tasks::REAL
        ) * 100.0
        ELSE 0.0
    END
WHERE id = $1;
```
Indexes: `idx_jobs_org_status_created` and `idx_jobs_org_created`
```sql
-- For queries WITH status filter
CREATE INDEX CONCURRENTLY idx_jobs_org_status_created
ON jobs(organisation_id, status, created_at DESC);

-- For queries WITHOUT status filter
CREATE INDEX CONCURRENTLY idx_jobs_org_created
ON jobs(organisation_id, created_at DESC);
```
Why Two Indexes:
PostgreSQL cannot efficiently use the trailing columns of a multi-column index when a middle column (here, `status`) is missing from the query's filter
Each index is only ~100KB but provides 90%+ improvement (11ms → <1ms)
Eliminated sequential scans that were reading 5899 buffers for 164 rows
1. Create migration file

Run `supabase migration new <name>`. This creates a timestamped file in `supabase/migrations/`.

2. Write migration SQL

```sql
-- Add new columns safely
ALTER TABLE jobs
ADD COLUMN IF NOT EXISTS new_field TEXT DEFAULT '';

-- Create indexes concurrently (non-blocking)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_new_field
ON jobs(new_field);

-- Remove old columns after confirming unused
ALTER TABLE jobs DROP COLUMN IF EXISTS old_field;
```

3. Test locally (optional)

```shell
supabase start
supabase db reset # Applies all migrations
```

4. Deploy via GitHub

Push to feature branch
Create PR to test-branch (migrations auto-apply)
After testing, merge to main (migrations auto-apply)
```sql
-- Enable RLS on user tables
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE jobs ENABLE ROW LEVEL SECURITY;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;

-- Users can only access their own data
CREATE POLICY "users_own_data" ON users
FOR ALL USING (auth.uid()::text = id);

-- Organisation members can access shared jobs
CREATE POLICY "org_jobs_access" ON jobs
FOR ALL USING (
    organisation_id IN (
        SELECT organisation_id FROM users
        WHERE id = auth.uid()::text
    )
);

-- Tasks inherit job access permissions
CREATE POLICY "job_tasks_access" ON tasks
FOR ALL USING (
    job_id IN (
        SELECT id FROM jobs
        WHERE organisation_id IN (
            SELECT organisation_id FROM users
            WHERE id = auth.uid()::text
        )
    )
);
```
Adapt ships with the `pg_stat_statements` extension enabled:

The migration `20251012070000_enable_pg_stat_statements.sql` enables the extension
The view `observability.pg_stat_statements_top_total_time` lists the top 50 statements by total execution time
Query it via the Supabase SQL Editor using a service key
Recommended workflow:
```sql
-- Review top queries
SELECT * FROM observability.pg_stat_statements_top_total_time LIMIT 20;

-- Optional: Reset statistics after exporting for analysis
SELECT pg_stat_statements_reset();
```
```shell
# Restore from backup
psql -h host -U user -d new_database < backup_file.sql
```

```sql
-- Verify data integrity
SELECT COUNT(*) FROM jobs;
SELECT COUNT(*) FROM tasks;
SELECT COUNT(*) FROM users;
```