Overview

This guide walks you through creating your first crawl job in Adapt. You’ll learn how to:
  1. Create an account and authenticate
  2. Set up your first domain
  3. Create and run a crawl job
  4. View results and identify issues
Adapt integrates with Supabase Auth for authentication, supporting email/password and social login (Google, GitHub).

Prerequisites

Before you begin, ensure you have:
  • A website URL you want to monitor
  • An email address for account creation
  • (Optional) A sitemap.xml file for faster discovery

Step 1: Create Your Account

1. Visit the Adapt Application

Navigate to https://adapt.app.goodnative.co in your browser.
2. Sign Up

Click “Sign Up” and choose your authentication method:
  • Email/Password: Enter your email and create a password
  • Social Login: Use Google or GitHub for quick signup
3. Verify Your Email

Check your inbox for a verification email from Adapt and click the confirmation link.
4. Complete Your Profile

After verification, you’ll be redirected to the welcome page where you can set up your organisation.

Step 2: Create Your First Job

Once logged in, you’ll see the dashboard. Let’s create your first crawl job.
1. Navigate to Jobs

From the dashboard, click “New Job” or navigate to the jobs section.
2. Enter Domain Details

{
  "domain": "example.com",
  "options": {
    "use_sitemap": true,
    "find_links": true,
    "max_pages": 100,
    "concurrency": 20
  }
}
Configuration Options:
  • use_sitemap: Automatically discover URLs from sitemap.xml
  • find_links: Crawl discovered pages for additional links
  • max_pages: Maximum number of pages to crawl (0 = unlimited)
  • concurrency: Number of parallel requests (default: 20)
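
As an illustration of how these options fit together, here is a minimal Python sketch that assembles the same configuration shape and rejects obviously invalid values. The function name `build_job_payload` and the validation rules are assumptions made for this example; only the field names and defaults come from the configuration shown above.

```python
import json


def build_job_payload(domain, use_sitemap=True, find_links=True,
                      max_pages=100, concurrency=20):
    """Return a job configuration dict in the shape shown above.

    The checks below are illustrative assumptions, not documented
    server-side rules.
    """
    if not domain or "/" in domain:
        raise ValueError("domain should be a bare hostname such as 'example.com'")
    if max_pages < 0:
        raise ValueError("max_pages must be 0 (unlimited) or a positive integer")
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return {
        "domain": domain,
        "options": {
            "use_sitemap": use_sitemap,
            "find_links": find_links,
            "max_pages": max_pages,
            "concurrency": concurrency,
        },
    }


if __name__ == "__main__":
    # Print the payload as JSON, matching the example above.
    print(json.dumps(build_job_payload("example.com"), indent=2))
```

Keeping the defaults in one place makes it harder to send a job with, say, a negative page limit by accident.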
3. Optional: Configure Path Filters

Add include/exclude patterns to control which paths are crawled:
{
  "include_paths": "/blog/*,/products/*",
  "exclude_paths": "/admin/*,/draft/*"
}
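
To build intuition for how comma-separated patterns like these could select paths, the following Python sketch applies them with shell-style globbing. The matching rules here are assumptions for illustration (exclude patterns win over include patterns, and an empty include list allows everything); Adapt's actual matching semantics may differ, and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_patterns(raw):
    """Split a comma-separated pattern string into a clean list."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def is_path_allowed(path, include_paths="", exclude_paths=""):
    """Decide whether a URL path passes the filters.

    Assumed semantics (not confirmed by the guide):
    - any matching exclude pattern rejects the path;
    - with no include patterns, every other path is accepted;
    - otherwise the path must match at least one include pattern.
    """
    includes = parse_patterns(include_paths)
    excludes = parse_patterns(exclude_paths)
    if any(fnmatchcase(path, pattern) for pattern in excludes):
        return False
    if not includes:
        return True
    return any(fnmatchcase(path, pattern) for pattern in includes)


if __name__ == "__main__":
    include = "/blog/*,/products/*"
    exclude = "/admin/*,/draft/*"
    for candidate in ("/blog/hello-world", "/admin/users", "/about"):
        print(candidate, is_path_allowed(candidate, include, exclude))
```

Note that `fnmatchcase` lets `*` match across `/`, so `/blog/*` also covers nested paths such as `/blog/2024/post`.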
4. Start the Job

Click “Create Job” to start the crawl. The job will move from created → running → completed.

Step 3: Monitor Job Progress

Adapt provides real-time updates on your crawl job via WebSockets.
The dashboard updates automatically as tasks are processed. No need to refresh!

Using the API

You can also monitor progress programmatically:
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Response:
{
  "status": "success",
  "data": {
    "id": "job_123abc",
    "domain": "example.com",
    "status": "running",
    "progress": {
      "total_tasks": 150,
      "completed_tasks": 45,
      "failed_tasks": 2,
      "skipped_tasks": 0,
      "percentage": 31.33
    },
    "stats": {
      "avg_response_time": 234,
      "cache_hit_ratio": 0.85,
      "total_bytes": 2048576
    }
  }
}
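
The sample response above can be read programmatically. The sketch below, in Python, recomputes the progress percentage and polls until the job finishes. Two things are assumptions rather than documented behaviour: that `percentage` counts completed, failed, and skipped tasks as processed (this reproduces the 31.33 in the sample, since 47 of 150 tasks are accounted for), and that `completed` and `failed` are the terminal job statuses. `fetch_job` is a hypothetical callable that returns the decoded JSON response.

```python
import time

# Assumption: jobs stop changing once they reach one of these statuses.
TERMINAL_STATUSES = {"completed", "failed"}


def processed_percentage(progress):
    """Share of tasks already accounted for, rounded to two decimals."""
    total = progress["total_tasks"]
    if total == 0:
        return 0.0
    processed = (progress["completed_tasks"]
                 + progress["failed_tasks"]
                 + progress["skipped_tasks"])
    return round(100 * processed / total, 2)


def wait_for_job(fetch_job, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch_job() until the job reaches a terminal status.

    fetch_job must return a dict shaped like the sample response,
    i.e. with the job under the "data" key.
    """
    for attempt in range(max_polls):
        job = fetch_job()["data"]
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")
```

Passing the fetch function in as a parameter keeps the polling logic independent of any particular HTTP client.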

Step 4: View Results

Once the job completes, you can view detailed results and identify issues.

Results Summary

Get a high-level summary of your crawl:
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/results" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Key Metrics:
  • Total pages crawled and success rate
  • Average response time and performance breakdown
  • Cache hit ratio and warming effectiveness
  • Issues detected: 404s, slow pages, server errors, redirects

Filtering Tasks

View specific subsets of tasks using query parameters:
# View all failed tasks
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/tasks?status=failed" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# View 404 errors
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/tasks?status_code=404" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# View slow pages (over 5 seconds; the value is given in milliseconds)
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/tasks?min_response_time=5000" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
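
If you build these URLs in code, a small helper avoids hand-assembling query strings. This Python sketch only knows the three filter parameters shown above. Combining several filters in one request is an assumption, since the examples pass one at a time, and `build_tasks_url` is a hypothetical name.

```python
from urllib.parse import urlencode

BASE_URL = "https://adapt.app.goodnative.co/v1"

# Filters that appear in the examples above, in a fixed output order.
KNOWN_FILTERS = ("status", "status_code", "min_response_time")


def build_tasks_url(job_id, **filters):
    """Return the tasks URL for a job with the given query filters."""
    unknown = set(filters) - set(KNOWN_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    pairs = [(name, filters[name]) for name in KNOWN_FILTERS if name in filters]
    url = f"{BASE_URL}/jobs/{job_id}/tasks"
    query = urlencode(pairs)
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(build_tasks_url("job_123abc", status="failed"))
    print(build_tasks_url("job_123abc", min_response_time=5000))
```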

Export Results

Export full results for analysis:
# Export as CSV
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/export?format=csv" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -o results.csv

# Export as JSON
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/export?format=json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -o results.json
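
The same export can be scripted without curl. The following Python sketch uses only the standard library. The helper names are hypothetical, and the token is a placeholder just as in the curl commands above.

```python
import shutil
import urllib.request

BASE_URL = "https://adapt.app.goodnative.co/v1"
EXPORT_FORMATS = ("csv", "json")  # the two formats shown above


def build_export_request(job_id, fmt, token):
    """Build (but do not send) an authenticated export request."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}")
    url = f"{BASE_URL}/jobs/{job_id}/export?format={fmt}"
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def download_export(job_id, fmt, token, dest_path):
    """Stream the export to dest_path, like curl's -o flag."""
    request = build_export_request(job_id, fmt, token)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, `download_export("job_123abc", "csv", "YOUR_JWT_TOKEN", "results.csv")` would mirror the first curl command.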

Step 5: Set Up Scheduled Crawls

Automate recurring crawls with schedulers:
curl -X POST "https://adapt.app.goodnative.co/v1/schedulers" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "schedule_interval_hours": 24,
    "concurrency": 20,
    "find_links": true,
    "max_pages": 0,
    "include_paths": "/blog/*,/products/*",
    "exclude_paths": "/admin/*"
  }'
Interval Options:
  • 6 hours - Frequent monitoring
  • 12 hours - Twice daily
  • 24 hours - Daily (recommended)
  • 48 hours - Every other day
Set max_pages: 0 for unlimited crawling of your entire site.
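
A scheduler payload can also be assembled in code before it is posted. The Python sketch below mirrors the fields of the curl example. Restricting the interval to the four listed options is an assumption made for this example (the API may accept other values), and the function names are hypothetical.

```python
# Intervals listed in this guide; treating them as the only valid
# values is an assumption for illustration.
LISTED_INTERVALS = (6, 12, 24, 48)


def build_scheduler_payload(domain, schedule_interval_hours=24, concurrency=20,
                            find_links=True, max_pages=0,
                            include_paths="", exclude_paths=""):
    """Return a scheduler payload with the fields used in the curl example."""
    if schedule_interval_hours not in LISTED_INTERVALS:
        raise ValueError(f"interval must be one of {LISTED_INTERVALS}")
    payload = {
        "domain": domain,
        "schedule_interval_hours": schedule_interval_hours,
        "concurrency": concurrency,
        "find_links": find_links,
        "max_pages": max_pages,  # 0 means unlimited
    }
    # Only send path filters that were actually provided.
    if include_paths:
        payload["include_paths"] = include_paths
    if exclude_paths:
        payload["exclude_paths"] = exclude_paths
    return payload


def runs_per_week(schedule_interval_hours):
    """How many crawls a given interval produces in seven days."""
    return 7 * 24 / schedule_interval_hours
```

As a rough guide to crawl volume, a 6-hour interval yields 28 runs per week, while the recommended 24-hour interval yields 7.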

Common Issues and Solutions

Job Stuck in “Created” Status

Ensure at least one worker is running. Jobs require workers to process tasks.
Check worker status in your organisation settings. If no workers are available, contact support.

High Failed Task Count

Common causes:
  • 404 errors: Pages no longer exist or URLs are incorrect
  • Timeout issues: Pages taking too long to respond (>30s)
  • Rate limiting: Your server is blocking requests (try lowering the concurrency setting)
  • SSL errors: Certificate issues with HTTPS
Review failed tasks:
curl -X GET "https://adapt.app.goodnative.co/v1/jobs/{job_id}/tasks?status=failed" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
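
Once you have the failed tasks, grouping them by likely cause shows which of the problems above dominates. This Python sketch is illustrative only: the task fields `status_code`, `response_time` (in milliseconds), and `error` are assumed names inferred from the query parameters in this guide, not a documented schema. Treating HTTP 429 as the rate-limiting signal is also an assumption about how your server responds.

```python
from collections import Counter

TIMEOUT_MS = 30_000  # the >30s threshold mentioned above


def classify_failure(task):
    """Map one failed task (a dict) to a coarse cause label."""
    code = task.get("status_code")
    error = (task.get("error") or "").lower()
    if code == 404:
        return "not_found"
    if code == 429:  # HTTP 429 Too Many Requests
        return "rate_limited"
    if "ssl" in error or "certificate" in error:
        return "ssl"
    if task.get("response_time", 0) >= TIMEOUT_MS:
        return "timeout"
    return "other"


def summarize_failures(tasks):
    """Count failed tasks per cause."""
    return Counter(classify_failure(task) for task in tasks)


if __name__ == "__main__":
    sample = [
        {"status_code": 404},
        {"status_code": 404},
        {"status_code": 429},
        {"response_time": 31_000},
        {"error": "SSL certificate verify failed"},
    ]
    print(summarize_failures(sample).most_common())
```

If `rate_limited` dominates the summary, a lower `concurrency` value is the first thing to try.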

Cache MISS on All Pages

If cache warming isn’t working:
  1. Verify your CDN configuration supports cache warming
  2. Check if cache headers are set correctly on your origin server
  3. Ensure cache keys are consistent (no random query parameters)
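
To illustrate the third point, the Python sketch below normalizes URLs so that tracking parameters and parameter order do not produce distinct cache keys. The list of volatile parameters is a hypothetical example, and whether your CDN keys on the query string at all depends on its configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical examples of parameters that vary per visitor or campaign.
VOLATILE_PARAMS = frozenset(
    {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}
)


def normalize_cache_key(url, volatile=VOLATILE_PARAMS):
    """Drop volatile query parameters and sort the rest.

    Two URLs that differ only in tracking parameters or parameter
    order normalize to the same string.
    """
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in volatile
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


if __name__ == "__main__":
    print(normalize_cache_key("https://example.com/blog?utm_source=mail&b=2&a=1"))
```

If two variants of the same page normalize to different strings, they are likely to be cached separately, so warming one does not help requests for the other.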

Next Steps

API Reference

Explore all available API endpoints

Webflow Integration

Connect Adapt to your Webflow sites

Slack Notifications

Get notified when jobs complete or fail

Local Development

Set up Adapt for local development