Generators in @nodejs/doc-kit transform API documentation through a pipeline. Each generator takes input from a previous generator (or raw files), processes it, and yields output for the next generator or as the final output.
Generator Pipeline
The generator system works as a pipeline where each generator depends on the output of a previous generator. In the pipeline diagram, blue generators are internal (used as dependencies only) and green generators are public (can be invoked via the CLI).
Pipeline Stages
The pipeline consists of the following main stages:

Parse to AST

The `ast` generator parses raw Markdown files into Abstract Syntax Trees (MDAST). This is the first stage, converting unstructured text into a structured format.

- Generator: `ast`
- Depends on: None (processes raw files)
- Parallel processing: Yes
Extract Metadata

The `metadata` generator extracts structured metadata from the AST, creating a flattened list of API documentation entries with type information, headings, and relationships.

- Generator: `metadata`
- Depends on: `ast`
- Parallel processing: Yes
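To make the "flattened list of entries" idea concrete, here is a minimal sketch of metadata extraction: walk an MDAST-like tree and collect heading nodes into a flat list. The node and entry field names (`text`, `heading`, `depth`) are illustrative, not doc-kit's actual schema.

```javascript
// Illustrative only: flatten heading nodes from an MDAST-like tree.
function extractEntries(tree) {
  const entries = [];
  (function walk(node) {
    if (node.type === 'heading') {
      entries.push({ heading: node.text, depth: node.depth });
    }
    for (const child of node.children ?? []) walk(child);
  })(tree);
  return entries;
}

const ast = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, text: 'fs.readFile()' },
    { type: 'paragraph', text: 'Reads a file.' },
    { type: 'heading', depth: 2, text: 'fs.readFileSync()' },
  ],
};

console.log(extractEntries(ast));
// → [ { heading: 'fs.readFile()', depth: 1 }, { heading: 'fs.readFileSync()', depth: 2 } ]
```

The real generator also captures type information and relationships between entries; the point here is only the tree-to-flat-list shape of the transformation.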
Convert to JSX AST

The `jsx-ast` generator converts the metadata and MDAST into JSX Abstract Syntax Trees, preparing the content for web rendering.

- Generator: `jsx-ast`
- Depends on: `metadata`
- Parallel processing: Yes
How Dependencies Work
Each generator declares its dependency using the `dependsOn` field. The framework automatically:
- Constructs the pipeline based on dependencies
- Executes generators in order to satisfy dependencies
- Caches output so each generator runs only once
- Enables parallel consumption when multiple generators depend on the same output
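As a sketch of how `dependsOn` drives ordering, the resolver below walks the dependency links back to the root to produce an execution order. The generator names match the pipeline described above, but the resolver itself (and the assumption that `web` depends on `jsx-ast`) is illustrative, not doc-kit's actual implementation.

```javascript
// Illustrative dependsOn registry; the web → jsx-ast edge is an assumption.
const generators = {
  'ast':      { dependsOn: null },
  'metadata': { dependsOn: 'ast' },
  'jsx-ast':  { dependsOn: 'metadata' },
  'web':      { dependsOn: 'jsx-ast' },
};

// Follow dependsOn links upstream, then reverse into execution order.
function resolveOrder(target) {
  const order = [];
  for (let name = target; name; name = generators[name].dependsOn) {
    order.unshift(name);
  }
  return order;
}

console.log(resolveOrder('web')); // → [ 'ast', 'metadata', 'jsx-ast', 'web' ]
```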
Multiple Consumers
Multiple generators can depend on the same generator. For example, the `metadata` generator is consumed by:

- `jsx-ast` - For web rendering
- `json-simple` - For JSON output
- `man-page` - For man page generation
- `orama-db` - For search indexing
- `llms-txt` - For LLM consumption
The `metadata` generator runs only once, and its output is cached for all consumers.
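The run-once-and-cache behavior can be sketched in a few lines: store the promise for a generator's output under its name, and hand the same promise to every consumer. This is a minimal illustration of the idea, not doc-kit's actual caching API.

```javascript
// Illustrative cache: each generator runs at most once per pipeline run.
const cache = new Map();

function runOnce(name, fn) {
  if (!cache.has(name)) cache.set(name, fn()); // store the promise itself
  return cache.get(name);
}

let runs = 0;
const produceMetadata = () => runOnce('metadata', async () => {
  runs += 1;
  return ['entry-a', 'entry-b']; // stand-in for the flattened entry list
});

// jsx-ast, json-simple, man-page, etc. can all await the same call:
Promise.all([produceMetadata(), produceMetadata()]).then(([a, b]) => {
  console.log(runs, a === b); // → 1 true (one run, shared output)
});
```

Caching the promise rather than the resolved value means even concurrent consumers that ask before the first run finishes still share a single execution.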
Generator Types
Internal Generators
Internal generators are used only as dependencies and are not exposed via the CLI:

- `ast` - Parses Markdown to MDAST
- `metadata` - Extracts structured metadata
- `jsx-ast` - Converts to JSX AST
- `ast-js` - Parses JavaScript files
Public Generators
Public generators can be invoked directly via the CLI:

- `web` - Generates HTML/CSS/JS bundles
- `json-simple` - Generates simple JSON
- `man-page` - Generates Unix man pages
- `orama-db` - Generates the search database
- `legacy-html` - Legacy HTML format
- `legacy-json` - Legacy JSON format
- `addon-verify` - Verifies addon documentation
- `api-links` - Generates the API link database
- `llms-txt` - Generates LLM-optimized text
- `sitemap` - Generates the sitemap
Streaming vs. Batch Processing
Streaming Generators
Streaming generators yield results as they're produced, using async generators. This enables:

- Reduced memory usage - Data is processed in chunks
- Earlier downstream starts - The next generator can begin before this one finishes
- Better parallelism - Multiple generators work simultaneously

Streaming generators declare `hasParallelProcessor: true`.
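A minimal sketch of the streaming pattern: each stage is an async generator that yields one item at a time, so the downstream stage starts working on the first item before the upstream stage has produced the last. The function and field names here are illustrative, not doc-kit's actual generators.

```javascript
// Illustrative streaming stages: parse, then extract, one item at a time.
async function* parseFiles(files) {
  for (const file of files) {
    yield { file, ast: `ast-of-${file}` }; // yield each parse result immediately
  }
}

async function* extractMetadata(upstream) {
  for await (const { file, ast } of upstream) {
    yield { file, entries: [ast] };        // transform each chunk as it arrives
  }
}

(async () => {
  const results = [];
  for await (const item of extractMetadata(parseFiles(['fs.md', 'http.md']))) {
    results.push(item.file);
  }
  console.log(results); // → [ 'fs.md', 'http.md' ]
})();
```

Because only one item is in flight per stage, memory usage stays proportional to the item size rather than the whole corpus.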
Batch Generators
Some generators must collect all input before processing. For example, the `web` generator needs all entries together to generate code-split bundles.
Use batch processing when:
- You need all data to make decisions (e.g., code splitting, global analysis)
- The output format requires the complete dataset
- Cross-references between items need resolution
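In code, the batch pattern simply drains the upstream stream before doing any work, so global decisions can see the full dataset. The bundling logic below is a stand-in for the kind of code-splitting decision the `web` generator makes, not its actual implementation.

```javascript
// Illustrative batch generator: collect everything, then decide globally.
async function batchGenerate(upstream) {
  const all = [];
  for await (const entry of upstream) all.push(entry); // drain the stream first
  // Only now, with the full dataset, can a global split be chosen:
  const bundles = [];
  for (let i = 0; i < all.length; i += 2) bundles.push(all.slice(i, i + 2));
  return bundles;
}

async function* entries() { yield 'fs'; yield 'http'; yield 'path'; }

batchGenerate(entries()).then((bundles) => {
  console.log(bundles); // → [ [ 'fs', 'http' ], [ 'path' ] ]
});
```

The trade-off versus streaming is memory: the whole dataset is held at once, and downstream work cannot begin until the upstream stream is exhausted.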
Next Steps
- Built-in Generators - Explore all available generators and their purposes
- Creating Custom Generators - Learn how to create your own generators
- Parallel Processing - Implement worker-based parallel processing