Supabase Database Setup
Supabase provides a managed PostgreSQL database for storing crawl site metadata and scheduling information.
Create Supabase Project
Navigate to Supabase Dashboard
Go to app.supabase.com and sign in.
Create New Project
- Click “New Project”
- Select your organization (or create one)
- Configure project settings:
  - Name: llmstxt-generator
  - Database Password: Generate a strong password (save it securely)
  - Region: Choose the one closest to your AWS region
  - Pricing Plan: Free tier (sufficient for most use cases)
- Click “Create new project”
Retrieve Supabase Credentials
Once your project is active:
Copy Project URL
Under “Project URL”:
This is your SUPABASE_URL. Save it for the Terraform configuration.
Run Database Migrations
Create the crawl_sites table for tracking crawled websites.
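The migration itself is not reproduced in this section. A minimal sketch of what it might contain, reconstructed from the schema table and index described in this guide, is shown here as a Python constant holding PostgreSQL DDL. The NOT NULL constraint, the gen_random_uuid() and now() defaults, and the exact index definition are assumptions, not the project's actual migration.

```python
# Hypothetical reconstruction of the crawl_sites migration (PostgreSQL).
# Column names, types, and the two documented defaults (50 and 500) come
# from the schema table in this guide; everything else is an assumption.
CRAWL_SITES_MIGRATION = """
CREATE TABLE IF NOT EXISTS crawl_sites (
    id                       UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    base_url                 TEXT NOT NULL UNIQUE,
    recrawl_interval_minutes INTEGER,
    max_pages                INTEGER DEFAULT 50,
    desc_length              INTEGER DEFAULT 500,
    last_crawled_at          TIMESTAMPTZ,
    latest_llms_hash         TEXT,
    latest_llms_url          TEXT,
    created_at               TIMESTAMPTZ DEFAULT now(),
    updated_at               TIMESTAMPTZ DEFAULT now()
);

-- Assumed index shape: the guide only says it is based on these two columns.
CREATE INDEX IF NOT EXISTS idx_crawl_sites_due
    ON crawl_sites (last_crawled_at, recrawl_interval_minutes);
"""

# Print the SQL so it can be pasted into the Supabase SQL Editor.
print(CRAWL_SITES_MIGRATION.strip())
```

If the real migration file is available, prefer it over this sketch; the constant is only meant to show how the documented columns fit together.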
Successfully created the crawl_sites table!
Database Schema Overview
The crawl_sites table structure:
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key (auto-generated) |
| base_url | TEXT | Website URL (unique) |
| recrawl_interval_minutes | INTEGER | Recrawl frequency in minutes |
| max_pages | INTEGER | Maximum pages to crawl (default: 50) |
| desc_length | INTEGER | Max description length (default: 500) |
| last_crawled_at | TIMESTAMPTZ | Last crawl timestamp |
| latest_llms_hash | TEXT | SHA-256 hash of llms.txt content |
| latest_llms_url | TEXT | Public URL of generated llms.txt |
| created_at | TIMESTAMPTZ | Record creation time |
| updated_at | TIMESTAMPTZ | Last update time |
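To make the roles of last_crawled_at, recrawl_interval_minutes, and latest_llms_hash concrete, here is a small sketch of how a scheduler might use them. The function names is_due and content_hash are hypothetical, and treating a never-crawled site as due is an assumption about the intended behavior.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_due(last_crawled_at: Optional[datetime],
           recrawl_interval_minutes: int,
           now: datetime) -> bool:
    """Return True when a site should be recrawled."""
    if last_crawled_at is None:
        # Assumption: a site that was never crawled is always due.
        return True
    next_crawl = last_crawled_at + timedelta(minutes=recrawl_interval_minutes)
    return now >= next_crawl


def content_hash(llms_txt: str) -> str:
    """SHA-256 hex digest of the llms.txt content, as stored in latest_llms_hash."""
    return hashlib.sha256(llms_txt.encode("utf-8")).hexdigest()


# Example: a site crawled 90 minutes ago with a 60-minute interval is due.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_due(now - timedelta(minutes=90), 60, now))
```

Comparing content_hash of a freshly generated file against the stored latest_llms_hash would let a crawler skip re-uploading unchanged content, which is one plausible reason for keeping the hash column.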
The idx_crawl_sites_due index optimizes queries that find sites due for recrawling, based on last_crawled_at and recrawl_interval_minutes.
Cloudflare R2 Storage Setup
Cloudflare R2 provides S3-compatible object storage for hosting generated llms.txt files.
Create R2 Bucket
Navigate to R2 Dashboard
Log in to dash.cloudflare.com → Select R2 from sidebar.
Purchase R2 (if needed)
If this is your first time using R2:
- Click “Purchase R2”
- Review pricing (free tier: 10GB storage, 1M Class A ops/month)
- Click “Get Started”
Create Bucket
- Click “Create bucket”
- Bucket name: llmstxt (or your preferred name)
- Location: Automatic (recommended)
- Click “Create bucket”
Generate R2 API Token
Create API Token
- Click “Create API token”
- Configure token:
  - Token name: llmstxt-backend
  - Permissions: Read & Write
  - Bucket: Select your bucket (llmstxt) or “Apply to all buckets”
  - TTL: Leave empty (no expiration)
- Click “Create API Token”
Get Public R2 Domain
Optional: Configure Custom Domain
Use your own domain for R2 bucket
Add Custom Domain
In bucket settings → Public access → Connect Custom Domain:
- Enter your domain: files.yourdomain.com
- Cloudflare will provide DNS records to add
Update DNS Records
If domain is on Cloudflare:
- Records are added automatically
If domain is managed elsewhere:
- Add CNAME record: files.yourdomain.com → pub-xxxxx.r2.dev
Test Storage Configuration
Upload Test File to R2
Verify that your R2 credentials work by uploading a test file. If you can access the file via the public URL, R2 is configured correctly!
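One way to run such an upload test is through R2's S3-compatible API with boto3. The sketch below is illustrative only: the environment variable names R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, and R2_SECRET_ACCESS_KEY and the helper names are assumptions, and boto3 must be installed separately.

```python
import os


def r2_endpoint(account_id: str) -> str:
    """Build the S3-compatible endpoint URL for a Cloudflare account."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def upload_test_file(bucket: str, key: str = "test.txt") -> None:
    """Upload a small text file to the bucket using the R2 API token."""
    import boto3  # third-party dependency: pip install boto3

    # Hypothetical variable names; use whatever names your setup defines.
    client = boto3.client(
        "s3",
        endpoint_url=r2_endpoint(os.environ["R2_ACCOUNT_ID"]),
        aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
        region_name="auto",  # R2 does not use AWS regions
    )
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=b"Hello from R2!\n",
        ContentType="text/plain",
    )
```

Calling upload_test_file("llmstxt") and then opening test.txt under the bucket's public domain exercises both the write credentials and the public read path.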
Test Supabase Connection
Verify your Supabase credentials with a simple query.
Credentials Summary
Before proceeding to Terraform, ensure you have all these values:
Supabase Configuration
Cloudflare R2 Configuration
Data Privacy Considerations
- Generated llms.txt files are intentionally public (that’s their purpose)
- Database records in Supabase are private (protected by row-level security)
- URLs in R2 are not guessable (contain random hashes)
- Consider implementing URL signing for additional security
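If URL signing is adopted, one option is an S3-style presigned URL generated against R2's S3-compatible endpoint. This is a sketch under assumptions, not part of the documented setup: presign_get_url is a hypothetical helper, client is expected to be a boto3 S3 client configured for R2, and the 7-day ceiling reflects the usual limit for this kind of signed URL.

```python
MAX_EXPIRY_SECONDS = 7 * 24 * 60 * 60  # 604800 seconds (7 days)


def presign_get_url(client, bucket: str, key: str, expires_in: int = 3600) -> str:
    """Return a time-limited download URL for one object.

    `client` is assumed to be a boto3 S3 client pointed at the R2 endpoint.
    """
    if not 1 <= expires_in <= MAX_EXPIRY_SECONDS:
        raise ValueError("expires_in must be between 1 second and 7 days")
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Signed URLs only add protection for objects that are not also reachable through a public bucket domain, so this would apply to private artifacts rather than to the published llms.txt files.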
Next Steps
Terraform Configuration
Configure Terraform variables and deploy AWS infrastructure