Requirements

Crawlith requires Node.js 20.0.0 or higher. Check your version:
node --version
If you need to upgrade, visit nodejs.org to download the latest version.
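If you script your setup, you can extract the major version from the `node --version` output and compare it against the requirement. A minimal POSIX-shell sketch — the hard-coded sample string stands in for the real command output:

```shell
# Replace the hard-coded sample with: current="$(node --version)"
current="v20.11.1"
current="${current#v}"        # strip the leading "v" -> "20.11.1"
major="${current%%.*}"        # keep everything before the first "." -> "20"
if [ "$major" -lt 20 ]; then
  echo "Crawlith needs Node.js 20+, found v${current}" >&2
  exit 1
fi
echo "Node.js v${current} satisfies the requirement"
```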

Package Manager Installation

Install Crawlith globally using your preferred package manager:
npm install -g @crawlith/cli
The @crawlith/cli package provides the full CLI with all commands: crawl, page, ui, probe, sites, and clean.
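If you use pnpm or Yarn instead of npm, the equivalent global installs are (assuming the package is published to the npm registry under the same name):

```shell
pnpm add -g @crawlith/cli
yarn global add @crawlith/cli   # Yarn 1.x only; Yarn 2+ removed `global`
```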
Verify the installation:
crawlith --version
You should see output like:
0.1.1
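If the version command fails, a quick way to diagnose whether the problem is the install or your PATH is `command -v`, which prints where (if anywhere) the shell resolves the binary:

```shell
# Check whether the installed binary is reachable on PATH
if command -v crawlith >/dev/null 2>&1; then
  echo "crawlith found at: $(command -v crawlith)"
else
  echo "crawlith is not on PATH; check your npm global bin directory" >&2
fi
```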

Installing from Source

For development, or to contribute to Crawlith, build from source.
1. Clone the repository

git clone https://github.com/your-org/crawlith.git
cd crawlith
2. Install dependencies

Crawlith uses npm workspaces for its monorepo structure:
npm install
3. Build all packages

npm run build
This compiles TypeScript for @crawlith/core, @crawlith/cli, and @crawlith/web.
4. Link the CLI globally

npm link ./plugins/cli
Now you can run crawlith from anywhere on your system.
During development, you can also skip the link step and run the CLI directly:
npm run crawlith -- <command>
For example: npm run crawlith -- crawl https://example.com

Data Storage

Crawlith automatically creates a local SQLite database on first run:
~/.crawlith/crawlith.db
This database stores:
  • Sites: Tracked domains and their metadata
  • Snapshots: Timestamped crawl sessions with configuration
  • Pages: Normalized URLs, HTML content, HTTP status, and SEO metadata
  • Edges: Internal link relationships between pages
  • Metrics: Calculated scores, PageRank, HITS, and health indicators
Do not delete ~/.crawlith/crawlith.db unless you want to lose all crawl history. Crawlith uses this database for incremental crawling and snapshot comparisons.
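Because all crawl history lives in this single file, backing it up before a risky operation (such as pruning snapshots with clean) is just a file copy. A minimal sketch — the dated `.bak` suffix is an arbitrary choice, not a Crawlith convention:

```shell
# Copy the SQLite file to a dated backup before pruning or upgrading
db="$HOME/.crawlith/crawlith.db"
if [ -f "$db" ]; then
  cp "$db" "$db.$(date +%Y%m%d).bak"
  echo "backup written: $db.$(date +%Y%m%d).bak"
else
  echo "no database yet at $db" >&2
fi
```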

Verifying Installation

Run Crawlith without arguments to see the welcome banner and available commands:
crawlith
You should see:
  ██████╗██████╗  █████╗ ██╗    ██╗██╗     ██╗████████╗██╗  ██╗ 0.1.1
 ██╔════╝██╔══██╗██╔══██╗██║    ██║██║     ██║╚══██╔══╝██║  ██║
 ██║     ██████╔╝███████║██║ █╗ ██║██║     ██║   ██║   ███████║
 ██║     ██╔══██╗██╔══██║██║███╗██║██║     ██║   ██║   ██╔══██║
 ╚██████╗██║  ██║██║  ██║╚███╔███╔╝███████╗██║   ██║   ██║  ██║
  ╚═════╝╚═╝  ╚═╝╚═╝  ╚═╝ ╚══╝╚══╝ ╚══════╝╚═╝   ╚═╝   ╚═╝  ╚═╝

Crawlith — Deterministic crawl intelligence.

Usage: crawlith [options] [command]

Options:
  -V, --version      output the version number
  -h, --help         display help for command

Commands:
  crawl [options] [url]   Crawl an entire website and build its internal link graph
  page [options] [url]    Analyze a single URL for on-page SEO signals
  ui                      Launch the interactive web dashboard
  probe [options] [url]   Run infrastructure audit (TLS, DNS, Security)
  sites                   List all tracked sites in the database
  clean                   Clean up old snapshots and optimize database
  help [command]          display help for command
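For unattended environments (CI, provisioning scripts), the banner is less useful than a scripted check that `--version` printed something version-shaped. A sketch — the hard-coded sample stands in for the real command output:

```shell
# Replace the hard-coded sample with: v="$(crawlith --version)"
v="0.1.1"
case "$v" in
  [0-9]*.[0-9]*.[0-9]*) echo "version $v looks like semver, install OK" ;;
  *) echo "unexpected --version output: $v" >&2; exit 1 ;;
esac
```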

Next Steps

Ready to crawl your first site? Follow the quickstart guide.
