Overview

This guide covers proven strategies for monitoring website health, optimising performance, and maintaining high availability with Adapt.

Scheduling Strategy

Choose the Right Interval

Adapt supports recurring crawls at 6, 12, 24, or 48-hour intervals.

Every 6 hours
Best for:
  • High-traffic production sites
  • E-commerce platforms
  • Sites with frequent content updates
Considerations:
  • Uses more daily page quota
  • Provides rapid issue detection
  • Keeps cache consistently warm

Every 12 hours
Best for:
  • Business websites
  • Marketing sites with regular updates
  • SaaS application landing pages
Considerations:
  • Balanced quota usage
  • Twice-daily health checks
  • Good cache coverage

Every 24 hours
Best for:
  • Corporate websites
  • Portfolio sites
  • Documentation sites
  • Blogs
Considerations:
  • Efficient quota usage
  • Daily health monitoring
  • Standard recommendation

Every 48 hours
Best for:
  • Low-traffic sites
  • Archive sites
  • Development environments
Considerations:
  • Minimal quota usage
  • Less frequent updates
  • Lower cache coverage

Start with 24-hour intervals and adjust based on your site’s update frequency and traffic patterns.
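To compare intervals against your quota, a rough estimate can help. The sketch below assumes every crawl visits the same number of pages and that each visited page counts once against the daily quota; that accounting model is an assumption, not something this guide specifies, and the function names are illustrative.

```python
def crawls_per_day(interval_hours: int) -> float:
    """Average number of recurring crawls that start per 24 hours."""
    if interval_hours not in (6, 12, 24, 48):
        raise ValueError("Adapt intervals are 6, 12, 24, or 48 hours")
    return 24 / interval_hours


def estimated_daily_pages(pages_per_crawl: int, interval_hours: int) -> float:
    """Estimated pages crawled per day (assumes one quota unit per page)."""
    return pages_per_crawl * crawls_per_day(interval_hours)


# Example: a 500-page site at each supported interval
for interval in (6, 12, 24, 48):
    pages = estimated_daily_pages(500, interval)
    print(f"{interval}h interval: ~{pages:.0f} pages/day")
```

For a 500-page site this gives roughly 2000, 1000, 500, and 250 pages per day, which makes the quota trade-off between intervals concrete.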

Cache Warming Strategy

When to Warm Cache

  1. After Publishing: Run a crawl immediately after deploying new content
  2. Before Traffic Spikes: Warm cache before expected high-traffic events
  3. After Cache Purges: Re-warm after manual cache clearing
  4. Regular Maintenance: Schedule recurring crawls to keep cache fresh

Priority-Based Warming

Connect Google Analytics to enable priority-based cache warming:
  1. Connect Analytics: Link your Google Analytics property in Organisation Settings.
  2. Automatic Prioritisation: Adapt automatically prioritises high-traffic pages when warming cache.
  3. Verify Coverage: Check job results to confirm your most-visited pages show cache HITs.

Without Analytics, Adapt warms the homepage first, then processes pages in discovery order.
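The two orderings described here can be illustrated with a small sketch. This is not Adapt's internal algorithm; `warm_order`, the `pageviews` mapping, and the `"/"` homepage path are hypothetical names used only to show the idea.

```python
def warm_order(pages, pageviews=None, homepage="/"):
    """Return pages in the order they would be warmed.

    pages: paths in discovery order.
    pageviews: optional mapping of path -> visit count (hypothetical analytics data).
    """
    if pageviews:
        # sorted() is stable, so pages with equal traffic keep discovery order.
        return sorted(pages, key=lambda p: -pageviews.get(p, 0))
    # No analytics: homepage first, then everything else in discovery order.
    rest = [p for p in pages if p != homepage]
    return ([homepage] if homepage in pages else []) + rest


discovered = ["/about", "/", "/blog"]
print(warm_order(discovered))
print(warm_order(discovered, {"/blog": 900, "/": 500}))
```

With traffic data the busiest page (`/blog` here) is warmed first; without it the homepage leads and the rest follow in discovery order.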

Crawl Configuration

Use both methods for comprehensive coverage:
{
  "domain": "example.com",
  "options": {
    "use_sitemap": true,
    "find_links": true,
    "max_pages": 0,
    "concurrency": 20
  }
}
Method | Pros | Cons
Sitemap | Fast, comprehensive, respects your structure | Requires sitemap.xml
Link Crawling | Finds unlisted pages, validates internal links | Slower, may miss isolated pages
Enable both sitemap and link crawling to find all pages and validate all internal links.
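If you generate job requests from scripts, a small helper can keep the options consistent. The sketch below builds the same payload shape shown above; the validation rules (at least one discovery method, non-negative `max_pages`, positive `concurrency`) are assumptions rather than documented API constraints, and `build_job_payload` is a hypothetical name.

```python
import json


def build_job_payload(domain, use_sitemap=True, find_links=True,
                      max_pages=0, concurrency=20):
    """Build a crawl job body using the option names from this guide."""
    if not (use_sitemap or find_links):
        # Assumption: a crawl needs at least one way to discover pages.
        raise ValueError("enable use_sitemap, find_links, or both")
    if max_pages < 0:
        raise ValueError("max_pages must be 0 (unlimited) or a positive number")
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return {
        "domain": domain,
        "options": {
            "use_sitemap": use_sitemap,
            "find_links": find_links,
            "max_pages": max_pages,
            "concurrency": concurrency,
        },
    }


print(json.dumps(build_job_payload("example.com"), indent=2))
```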

Setting Max Pages

Set max_pages: 0 (unlimited) for complete site coverage:
{
  "max_pages": 0
}
Use a limit for initial testing:
{
  "max_pages": 50
}
Consider multiple focused crawls:
// Crawl 1: Homepage and main sections
{
  "domain": "example.com",
  "max_pages": 500
}

// Crawl 2: Blog archive
{
  "domain": "example.com",
  "max_pages": 500
}

Concurrency Settings

Adjust concurrency based on your hosting:
Hosting Type | Recommended Concurrency
Shared hosting | 5-10
VPS / Cloud | 20-30
Dedicated server | 30-50
CDN (Cloudflare, etc.) | 50-100
Higher concurrency speeds up crawls but increases server load. Start conservative and increase if your server handles it well.
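One way to apply "start conservative and increase" is to begin at the low end of the recommended range and step up only after a healthy crawl. The sketch below encodes the ranges from the table; the step size, the hosting keys, and the `healthy` flag (however you choose to judge server health) are illustrative assumptions.

```python
# Recommended ranges from the table above, as (low, high)
RECOMMENDED = {
    "shared": (5, 10),
    "vps_cloud": (20, 30),
    "dedicated": (30, 50),
    "cdn": (50, 100),
}


def next_concurrency(hosting, current=None, healthy=True, step=5):
    """Suggest the concurrency for the next crawl."""
    low, high = RECOMMENDED[hosting]
    if current is None:
        return low  # first crawl: start at the conservative end
    if not healthy:
        return max(1, current - step)  # back off, but never below 1
    return min(high, current + step)  # ramp up, capped at the range maximum


print(next_concurrency("vps_cloud"))                     # first crawl
print(next_concurrency("vps_cloud", 20))                 # previous crawl went well
print(next_concurrency("vps_cloud", 25, healthy=False))  # server struggled
```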

Monitoring & Alerts

Set Up Slack Notifications

  1. Install Slack App: Install the Adapt Slack app from your workspace settings.
  2. Authorise Notifications: Grant permission for Adapt to send you direct messages.
  3. Automatic Alerts: Receive notifications when:
       • Jobs complete
       • Jobs fail
       • Broken links are detected
       • Performance degrades

Monitor Usage Limits

Check usage regularly to avoid hitting limits:
curl https://adapt.app.goodnative.co/v1/usage \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Set a calendar reminder to review usage weekly, or build automated alerts using the API.
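An automated alert can be as simple as a threshold check. The response format of the usage endpoint is not shown in this guide, so the sketch below takes plain numbers; `used`, `limit`, and the 80% warning threshold are hypothetical and would need to be mapped to whatever fields the API actually returns.

```python
def usage_level(used, limit, warn_at=0.8):
    """Classify quota usage as 'ok', 'warning', or 'exceeded'."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = used / limit
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= warn_at:
        return "warning"
    return "ok"


# Example values only; substitute numbers taken from your own usage data
print(usage_level(used=850, limit=1000))
```

A scheduled script could call this after fetching usage and post to Slack or email whenever the result is not `ok`.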

Multi-Organisation Workflows

Organise by Client or Project

Create one organisation per client:
  • Client A → Organisation “Client A”
  • Client B → Organisation “Client B”
  • Internal → Organisation “Agency Internal”
Benefits:
  • Isolated data and limits
  • Easy client handoff
  • Clear billing separation
Create organisations by environment or brand:
  • Production sites → Organisation “Production”
  • Staging sites → Organisation “Staging”
  • Partner sites → Organisation “Partners”
Benefits:
  • Environment isolation
  • Separate quota pools
  • Different team access

Team Management

Always keep at least two admins per organisation. For example, to promote an existing member (here user_456) to admin:
curl -X PATCH https://adapt.app.goodnative.co/v1/organisations/members/user_456 \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"role": "admin"}'
Grant the “member” role to team members who need to view results but not manage settings.

Performance Optimisation

Identify Bottlenecks

  1. Run Baseline Crawl: Create a job to establish baseline performance metrics.
  2. Review Slow Pages: Export slow pages report and analyse common patterns:
       • Large images
       • Slow database queries
       • Third-party scripts
       • Cache misses
  3. Implement Fixes: Address issues starting with highest-traffic pages.
  4. Verify Improvements: Run another crawl and compare response times.
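The final comparison can be scripted once you have per-page response times from both crawls. The sketch below assumes you have already reduced each crawl to a mapping of URL path to response time in milliseconds; that data shape and the function name are assumptions, not an Adapt export format.

```python
def compare_response_times(baseline, latest):
    """Return (url, before_ms, after_ms, change_pct) rows, worst regression first."""
    rows = []
    for url in baseline.keys() & latest.keys():  # only pages present in both crawls
        before, after = baseline[url], latest[url]
        if before <= 0:
            continue  # skip unusable baseline values
        change_pct = round((after - before) / before * 100, 1)
        rows.append((url, before, after, change_pct))
    return sorted(rows, key=lambda row: row[3], reverse=True)


baseline = {"/": 200, "/blog": 800}
latest = {"/": 220, "/blog": 400}
for url, before, after, change in compare_response_times(baseline, latest):
    print(f"{url}: {before}ms -> {after}ms ({change:+.1f}%)")
```

Sorting by percentage change puts regressions at the top, so pages that got slower after a fix are visible first.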

Cache Optimisation

Maximise cache hit ratio:
  1. Enable Cache Warming: Run crawls after publishing
  2. Monitor Hit Ratio: Aim for 80%+ cache hits
  3. Fix Cache Misses: Investigate pages with consistent MISS status
  4. Warm High-Traffic Pages: Prioritise pages with most visitors
A DYNAMIC cache status is expected on some pages, typically authenticated pages, search results, or personalised content.
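To track the 80% target, the hit ratio can be computed from the cache statuses in a job's results. The sketch below counts only HIT and MISS and leaves DYNAMIC (and any other status) out of the denominator; that exclusion is an assumption based on DYNAMIC pages being expected, and the input is a plain list of status strings rather than a documented export format.

```python
from collections import Counter


def cache_hit_ratio(statuses):
    """Return HIT / (HIT + MISS), or None if no cacheable pages were seen."""
    counts = Counter(status.upper() for status in statuses)
    cacheable = counts["HIT"] + counts["MISS"]
    if cacheable == 0:
        return None
    return counts["HIT"] / cacheable


statuses = ["HIT", "HIT", "HIT", "HIT", "MISS", "DYNAMIC"]
ratio = cache_hit_ratio(statuses)
print(f"Cache hit ratio: {ratio:.0%}")
```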

Weekly Review Process

  1. Export Broken Links: Download the broken links report from your latest crawl.
  2. Categorise Issues: Group broken links by:
       • Internal vs external
       • High-traffic vs low-traffic
       • Critical vs non-critical
  3. Fix High-Priority Issues: Address broken links on high-traffic pages first.
  4. Update Links: Fix broken internal links by updating references.
  5. Verify Fixes: Run another crawl to confirm all links resolve.
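The internal versus external grouping in the categorisation step is easy to automate. The sketch below works on a plain list of broken URLs; treating relative URLs and subdomains of your site as internal is an assumption you may want to change, and the function names are illustrative.

```python
from urllib.parse import urlparse


def is_internal(url, site_domain):
    """True if the URL is relative or points at site_domain or one of its subdomains."""
    host = urlparse(url).hostname
    if host is None:
        return True  # relative link such as /old-page
    host = host.lower()
    return host == site_domain or host.endswith("." + site_domain)


def group_broken_links(urls, site_domain):
    """Split broken URLs into internal and external lists, preserving order."""
    groups = {"internal": [], "external": []}
    for url in urls:
        key = "internal" if is_internal(url, site_domain) else "external"
        groups[key].append(url)
    return groups


broken = [
    "https://example.com/old-pricing",
    "/team/former-employee",
    "https://partner.example.org/docs",
]
print(group_broken_links(broken, "example.com"))
```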

Proactive Prevention

Before deleting pages, search your content for internal links to that page and update or remove them.
Schedule quarterly reviews of external links, as third-party sites change frequently.
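The pre-deletion search can be scripted if you have your pages' HTML available. The sketch below scans a mapping of page path to HTML for anchors pointing at the page you plan to delete; it only compares root-relative hrefs, and both the input shape and the function names are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalise(path):
    """Drop fragment, query string, and trailing slash for comparison."""
    return path.split("#")[0].split("?")[0].rstrip("/")


def pages_linking_to(target_path, html_by_page):
    """Return the pages whose HTML links to target_path."""
    target = normalise(target_path)
    referrers = []
    for page, html in html_by_page.items():
        collector = HrefCollector()
        collector.feed(html)
        if any(normalise(href) == target for href in collector.hrefs):
            referrers.append(page)
    return referrers


site = {
    "/": '<a href="/old-pricing/">Pricing</a>',
    "/about": '<a href="/team">Team</a>',
}
print(pages_linking_to("/old-pricing", site))
```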

API Integration Patterns

CI/CD Integration

Trigger crawls automatically after deployments:
# GitHub Actions example
- name: Trigger Adapt Crawl
  run: |
    curl -X POST https://adapt.app.goodnative.co/v1/jobs \
      -H "Authorization: Bearer ${{ secrets.ADAPT_API_KEY }}" \
      -H "Content-Type: application/json" \
      -d '{
        "domain": "example.com",
        "options": {"use_sitemap": true, "find_links": true}
      }'

Automated Reporting

Build custom reports using the API:
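#!/bin/bash
# Weekly broken links report
# Assumes TOKEN is set in the environment before this script runs.
set -euo pipefail

# Look up the ID of the most recent job (URL quoted so the shell leaves "?" alone)
JOB_ID=$(curl -s "https://adapt.app.goodnative.co/v1/jobs?limit=1" \
  -H "Authorization: Bearer $TOKEN" | jq -r '.data.jobs[0].id')

# Download that job's broken links export
curl "https://adapt.app.goodnative.co/v1/jobs/$JOB_ID/export?type=broken-links" \
  -H "Authorization: Bearer $TOKEN" > broken-links-report.json

# Send to Slack, email, or dashboard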

Monitoring Scripts

Track performance trends:
import requests

def get_job_metrics(api_key):
    """Print progress and average response time for the 10 most recent jobs."""
    headers = {"Authorization": f"Bearer {api_key}"}
    response = requests.get(
        "https://adapt.app.goodnative.co/v1/jobs?limit=10",
        headers=headers,
        timeout=30,
    )
    # Fail loudly on 4xx/5xx instead of trying to parse an error body
    response.raise_for_status()

    jobs = response.json()["data"]["jobs"]
    for job in jobs:
        print(f"Job {job['id']}: {job['progress']['percentage']}% complete")
        # 'stats' may be absent, so fall back to None rather than raising KeyError
        print(f"Avg response time: {job.get('stats', {}).get('avg_response_time')}ms")

get_job_metrics("your_api_key_here")
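
To turn those per-job numbers into a trend, one simple approach is to compare the most recent jobs with the ones before them. The sketch below takes a list of average response times ordered oldest to newest (you would need to order the API results yourself); the window size, the 10% tolerance, and the function name are illustrative assumptions.

```python
from statistics import mean


def response_time_trend(times_ms, window=3, tolerance=0.10):
    """Compare the mean of the latest `window` jobs with the mean of earlier jobs."""
    if len(times_ms) < window + 1:
        return "insufficient data"
    recent = mean(times_ms[-window:])
    earlier = mean(times_ms[:-window])
    if earlier <= 0:
        return "insufficient data"
    change = (recent - earlier) / earlier
    if change > tolerance:
        return "degrading"
    if change < -tolerance:
        return "improving"
    return "stable"


# Average response times in ms, oldest job first (example values)
print(response_time_trend([200, 210, 205, 300, 320, 310]))
```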

Security Best Practices

Create scoped API keys instead of using JWT tokens in automation:
curl -X POST https://adapt.app.goodnative.co/v1/auth/api-keys \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "CI/CD Pipeline", "scopes": ["jobs:read", "jobs:create"]}'
Rotate API keys quarterly and immediately after team member departures.
Use the “member” role for users who only need to view results, reserving “admin” for users who manage settings and billing.
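A quarterly rotation policy is easier to follow with a reminder script. The sketch below checks a key inventory that you maintain yourself; the record shape (`name`, `created`) and the 90-day reading of "quarterly" are assumptions, and nothing here calls the Adapt API.

```python
from datetime import date, timedelta


def keys_due_for_rotation(keys, today, max_age_days=90):
    """Return the names of keys created max_age_days or more before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [key["name"] for key in keys if key["created"] <= cutoff]


# Hypothetical inventory of API keys and their creation dates
inventory = [
    {"name": "CI/CD Pipeline", "created": date(2024, 1, 15)},
    {"name": "Reporting script", "created": date(2024, 6, 1)},
]
print(keys_due_for_rotation(inventory, today=date(2024, 7, 1)))
```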

Troubleshooting Common Issues

Slow or failing crawls
Solutions:
  • Reduce concurrency to respect server rate limits
  • Set a max_pages limit for testing
  • Check if robots.txt specifies a high crawl-delay
  • Verify your hosting can handle the load

Low cache hit ratio
Solutions:
  • Run crawls more frequently to keep cache warm
  • Check CDN settings for cache TTL
  • Verify pages are cacheable (not authenticated)
  • Review cache-control headers

Many broken links
Solutions:
  • Check sitemap.xml for outdated URLs
  • Review recent content deletions
  • Validate internal link updates
  • Check for broken external links

Usage limits reached
Solutions:
  • Reduce crawl frequency
  • Set max_pages limits
  • Upgrade to a higher plan
  • Stagger crawls across multiple days
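
The robots.txt crawl-delay check mentioned above can be done with Python's standard library. The sketch below parses robots.txt content you have already fetched; the `*` user agent and the 5-second threshold for "high" are assumptions, since this guide does not state which user agent Adapt sends or what delay it considers excessive.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_seconds(robots_txt, user_agent="*"):
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def is_high_crawl_delay(robots_txt, threshold_seconds=5, user_agent="*"):
    """True if robots.txt sets a crawl delay above the chosen threshold."""
    delay = crawl_delay_seconds(robots_txt, user_agent)
    return delay is not None and delay > threshold_seconds


robots = """User-agent: *
Crawl-delay: 10
Disallow: /admin
"""
print(crawl_delay_seconds(robots))
print(is_high_crawl_delay(robots))
```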
