Introduction
The @crawlith/core package provides a programmatic API for crawling websites and analyzing their structure. Use it to build custom crawlers, integrate SEO analysis into your workflow, or perform automated audits.
Installation
Install the core library using your preferred package manager (for example, npm install @crawlith/core).
Quick Start
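A minimal sketch of what a first crawl can look like. The Crawler class below is a self-contained stub standing in for the real @crawlith/core API, which may differ; it exists only to make the crawl-then-inspect flow concrete.

```typescript
// Sketch only: this Crawler is a hypothetical stand-in, not the real API.
// It crawls a stubbed in-memory site instead of fetching pages over HTTP.
type Page = { url: string; links: string[] };

class Crawler {
  constructor(private site: Record<string, string[]>) {}

  // Breadth-first discovery starting from a seed URL.
  crawl(seed: string): Page[] {
    const seen = new Set<string>([seed]);
    const queue = [seed];
    const pages: Page[] = [];
    while (queue.length > 0) {
      const url = queue.shift()!;
      const links = this.site[url] ?? [];
      pages.push({ url, links });
      for (const link of links) {
        if (!seen.has(link)) {
          seen.add(link);
          queue.push(link);
        }
      }
    }
    return pages;
  }
}

const crawler = new Crawler({
  "https://example.com/": ["https://example.com/about"],
  "https://example.com/about": ["https://example.com/"],
});
const pages = crawler.crawl("https://example.com/");
console.log(`crawled ${pages.length} pages`); // crawled 2 pages
```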
This basic example crawls a website and analyzes its structure; the sections below explain the concepts involved.
Core Concepts
Crawling
The crawler discovers pages by following links, respecting robots.txt, and building a graph of your site’s structure. Each crawl creates a snapshot stored in a SQLite database.
Graph Model
Crawlith represents your website as a directed graph:
- Nodes represent pages (URLs)
- Edges represent links between pages
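This node-and-edge model can be sketched as a plain adjacency map. The SiteGraph type and helpers below are illustrative, not the library's actual graph API:

```typescript
// Directed graph sketch: nodes are page URLs, edges are links.
// These shapes are hypothetical stand-ins for the real graph model.
type SiteGraph = Map<string, Set<string>>; // url -> outgoing link targets

function addEdge(graph: SiteGraph, from: string, to: string): void {
  if (!graph.has(from)) graph.set(from, new Set());
  if (!graph.has(to)) graph.set(to, new Set());
  graph.get(from)!.add(to);
}

// In-degree counts how many pages link to a URL; pages with in-degree 0
// (other than the start page) are orphan candidates.
function inDegree(graph: SiteGraph, url: string): number {
  let n = 0;
  for (const targets of graph.values()) {
    if (targets.has(url)) n++;
  }
  return n;
}

const graph: SiteGraph = new Map();
addEdge(graph, "/", "/about");
addEdge(graph, "/", "/contact");
addEdge(graph, "/about", "/contact");
console.log(inDegree(graph, "/contact")); // 2
```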
Metrics
After crawling, run post-crawl metrics to calculate:
- PageRank scores
- HITS algorithm (authority/hub scores)
- Orphan pages and near-orphans
- Deep pages and crawl efficiency
- Duplicate detection
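As a concrete illustration of the first metric, here is a minimal power-iteration PageRank over a toy link graph. This is the textbook algorithm, not necessarily how the library computes it:

```typescript
// Minimal PageRank by power iteration over an adjacency map.
// Standard algorithm shown for illustration; the library's own
// implementation and tuning may differ.
function pageRank(
  links: Record<string, string[]>,
  damping = 0.85,
  iterations = 50,
): Record<string, number> {
  const urls = Object.keys(links);
  const n = urls.length;
  let rank: Record<string, number> = {};
  for (const u of urls) rank[u] = 1 / n;

  for (let i = 0; i < iterations; i++) {
    const next: Record<string, number> = {};
    for (const u of urls) next[u] = (1 - damping) / n;
    for (const u of urls) {
      const out = links[u];
      if (out.length === 0) {
        // Dangling page: spread its rank evenly over all pages.
        for (const v of urls) next[v] += (damping * rank[u]) / n;
      } else {
        for (const v of out) next[v] += (damping * rank[u]) / out.length;
      }
    }
    rank = next;
  }
  return rank;
}

const ranks = pageRank({
  "/": ["/about", "/blog"],
  "/about": ["/"],
  "/blog": ["/", "/about"],
});
// "/" receives links from every other page, so it ranks highest.
```

Scores form a probability distribution (they sum to 1), and 0.85 is the conventional damping factor.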
Basic Usage Example
The complete workflow combines a crawl with the post-crawl metrics described above.
Event-Driven Crawling
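Event-driven monitoring can be pictured with a tiny emitter. The event names and the CrawlEmitter class below are hypothetical stand-ins for the library's event context, used only to show the subscribe-and-react pattern:

```typescript
// Hypothetical event emitter sketch; the real event context and its
// event names may differ.
class CrawlEmitter {
  private handlers = new Map<string, Array<(payload: unknown) => void>>();

  on(event: string, handler: (payload: unknown) => void): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  emit(event: string, payload: unknown): void {
    for (const h of this.handlers.get(event) ?? []) h(payload);
  }
}

const crawl = new CrawlEmitter();
const visited: string[] = [];
crawl.on("page", (url) => visited.push(url as string));
crawl.on("done", (total) => console.log(`finished: ${total} pages`));

// Simulate a crawl that reports each page as it is processed.
for (const url of ["/", "/about", "/contact"]) crawl.emit("page", url);
crawl.emit("done", visited.length); // finished: 3 pages
```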
Crawl progress can be monitored in real time through the event context.
TypeScript Support
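Typed usage might look like the following; CrawlOptions and CrawlResult are illustrative shapes, not the package's actual exports:

```typescript
// Illustrative only: these interfaces are stand-ins for the kinds of
// types a crawler library exports; real names and fields may differ.
interface CrawlOptions {
  startUrl: string;
  maxDepth: number;
  respectRobotsTxt: boolean;
}

interface CrawlResult {
  pagesVisited: number;
  startedAt: Date;
}

// With full type definitions, option objects are checked at compile time:
// a typo like `maxDeph`, or a string where a number is expected, is an error.
function describeCrawl(options: CrawlOptions, result: CrawlResult): string {
  return `${options.startUrl}: ${result.pagesVisited} pages (depth <= ${options.maxDepth})`;
}

const summary = describeCrawl(
  { startUrl: "https://example.com", maxDepth: 3, respectRobotsTxt: true },
  { pagesVisited: 42, startedAt: new Date() },
);
```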
The library is written in TypeScript and includes full type definitions. All interfaces and types are exported for your use.
Next Steps
- Crawler API: learn about crawl options and the Crawler class
- Graph API: work with the graph structure and analysis
- Metrics API: calculate and analyze site metrics
- Audit API: perform security and performance audits