createWorkerPool
Creates a Piscina worker pool for parallel processing.
Parameters:
- threads: Maximum number of worker threads to create
Returns: Configured Piscina instance
Configuration
The worker pool is configured with:
- filename: threading/chunk-worker.mjs - The worker script that processes chunks
- minThreads: Set to the threads parameter - Pool maintains this many threads
- maxThreads: Set to the threads parameter - Pool won’t exceed this limit
- idleTimeout: Infinity - Workers stay alive for the entire generation lifecycle
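The options listed above can be sketched as a plain object. This is a minimal illustration, not the doc-kit source: buildPoolOptions is a hypothetical helper, and the result is assumed to be what would be handed to the Piscina constructor.

```javascript
// Hypothetical helper (not part of doc-kit): collects the four options
// described above. In a real setup the object would be passed to
// `new Piscina(options)`.
function buildPoolOptions(threads, workerFilename) {
  return {
    filename: workerFilename, // worker script that processes chunks
    minThreads: threads, // pool keeps this many threads alive
    maxThreads: threads, // pool never grows beyond this
    idleTimeout: Infinity, // idle workers are never shut down
  };
}

const options = buildPoolOptions(4, 'threading/chunk-worker.mjs');
console.log(options);
```

Setting minThreads and maxThreads to the same value gives a fixed-size pool, so no threads are started or stopped while generation is running.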
Worker Script Location
The worker script is resolved at import time using import.meta.resolve('./chunk-worker.mjs'), ensuring the correct path regardless of where doc-kit is installed.
createParallelWorker
Creates a parallel worker that distributes work across a Piscina thread pool.
Parameters:
- Name of the generator (e.g., 'metadata', 'ast-js', 'headings')
- Piscina worker pool instance from createWorkerPool
- Configuration object containing:
  - threads - Number of threads (used for optimization decisions)
  - chunkSize - Maximum items per chunk
  - [generatorName] - Generator-specific configuration
Returns: Worker object with a stream method for parallel processing
stream Method
Processes items in parallel, yielding results as chunks complete.
Parameters:
- Items to process (subset of fullInput)
- Full input array for context (generators may need to reference other items)
- Extra options passed to the generator’s processChunk method
Yields: Chunk results as they complete (not in input order)
How Parallel Processing Works
1. Chunk Creation
Items are split into chunks based on chunkSize (parallel.mjs:15-25). For example, with 100 items and chunkSize: 25, this creates 4 chunks: [0-24], [25-49], [50-74], [75-99].
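A minimal sketch of index-based chunking that reproduces the ranges above. The createChunks name and its signature are assumptions for illustration and may differ from the code in parallel.mjs.

```javascript
// Hypothetical helper: splits the indices 0..count-1 into consecutive
// chunks holding at most `chunkSize` indices each.
function createChunks(count, chunkSize) {
  const chunks = [];
  for (let start = 0; start < count; start += chunkSize) {
    const end = Math.min(start + chunkSize, count);
    // Each chunk lists the original indices it covers.
    chunks.push(Array.from({ length: end - start }, (_, i) => start + i));
  }
  return chunks;
}

const chunks = createChunks(100, 25);
console.log(chunks.length); // 4
console.log(chunks.map(c => `[${c[0]}-${c[c.length - 1]}]`).join(', '));
// [0-24], [25-49], [50-74], [75-99]
```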
2. Task Distribution
Each chunk becomes a task sent to the worker pool (parallel.mjs:101-124).
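The dispatch step might look like the sketch below. It uses a stand-in pool that models only a run(task) method returning a promise, which is how Piscina accepts tasks; the task shape and the dispatchChunks helper are assumptions, not the real ones.

```javascript
// Stand-in for a Piscina pool: only `run(task)` is modelled, and the
// "work" is doubling each number on the calling thread.
const fakePool = {
  run: async task => task.items.map(item => item * 2),
};

// Hypothetical helper: submits one task per chunk without awaiting,
// so every chunk is queued before any result is collected.
function dispatchChunks(pool, items, chunkSize) {
  const pending = [];
  for (let start = 0; start < items.length; start += chunkSize) {
    const task = { items: items.slice(start, start + chunkSize) };
    pending.push(pool.run(task));
  }
  return pending;
}

const pending = dispatchChunks(fakePool, [1, 2, 3, 4, 5], 2);
Promise.all(pending).then(results => console.log(results));
// [ [ 2, 4 ], [ 6, 8 ], [ 10 ] ]
```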
3. Result Collection
Results are yielded as they complete using Promise.race (parallel.mjs:126-141).
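One common way to yield results in completion order with Promise.race is sketched below. It is an illustration of the technique under assumed names (yieldAsCompleted, delay), not the code from parallel.mjs.

```javascript
// Yields each promise's value as soon as it resolves, regardless of the
// order in which the promises were passed in.
async function* yieldAsCompleted(promises) {
  // Tag every promise with its own key so the winner can be removed
  // from the pending set after each race.
  const pending = new Map(
    promises.map((promise, key) => [key, promise.then(value => ({ key, value }))])
  );
  while (pending.size > 0) {
    const { key, value } = await Promise.race(pending.values());
    pending.delete(key);
    yield value;
  }
}

// Helper for the demo: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) => new Promise(resolve => setTimeout(resolve, ms, value));

async function demo() {
  const seen = [];
  const tasks = [delay(80, 'slow'), delay(10, 'fast'), delay(40, 'medium')];
  for await (const result of yieldAsCompleted(tasks)) {
    seen.push(result);
  }
  return seen;
}

demo().then(seen => console.log(seen)); // [ 'fast', 'medium', 'slow' ]
```

Because the consumer receives each chunk as soon as it is ready, a slow chunk does not hold back faster ones, at the cost of losing input order.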
4. Optimization for Small Workloads
The worker avoids thread overhead when:
- threads <= 1 (single-threaded mode)
- items.length <= 2 (too few items to benefit from parallelism)
In these cases, it calls the generator’s processChunk method directly.
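The two conditions can be expressed as a single predicate. shouldRunInline is a hypothetical name used here for illustration only.

```javascript
// Hypothetical predicate mirroring the two conditions above: returns true
// when the work should run on the calling thread instead of the pool.
function shouldRunInline(threads, items) {
  return threads <= 1 || items.length <= 2;
}

console.log(shouldRunInline(1, new Array(100))); // true (single-threaded mode)
console.log(shouldRunInline(4, ['a', 'b'])); // true (too few items)
console.log(shouldRunInline(4, ['a', 'b', 'c'])); // false (use the pool)
```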
Usage in Generator Pipeline
The parallel worker is used by the main generator orchestration (generators.mjs:70-108).
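A minimal sketch of how a caller might consume a worker's stream. stubWorker stands in for the object returned by createParallelWorker, and the prefix option is invented for the example; neither is taken from generators.mjs.

```javascript
// Stub standing in for the object returned by createParallelWorker.
// It pretends that the second chunk finishes before the first.
const stubWorker = {
  async *stream(items, fullInput, extra) {
    yield items.slice(2).map(item => `${extra.prefix}${item}`);
    yield items.slice(0, 2).map(item => `${extra.prefix}${item}`);
  },
};

// Collects every chunk result into one flat array, in arrival order.
async function collect(worker, items) {
  const results = [];
  for await (const chunkResult of worker.stream(items, items, { prefix: '#' })) {
    results.push(...chunkResult);
  }
  return results;
}

collect(stubWorker, ['a', 'b', 'c']).then(results => console.log(results));
// [ '#c', '#a', '#b' ]
```

Since chunks arrive out of input order, a caller that needs ordered output has to sort or index the results itself.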
Performance Considerations
Thread Count
More threads ≠ always faster. Optimal thread count depends on:
- CPU cores: Match or slightly exceed physical core count
- Chunk size: Larger chunks reduce overhead but limit parallelism
- Generator complexity: CPU-intensive generators benefit more from parallelism
Serialization Overhead
Each task sent to a worker is serialized (structured clone). To minimize overhead:
- Only the relevant chunk items are sent (not the full input)
- Only the generator’s specific config is included
- Indices are remapped to 0-based for the chunk
This is handled in createTask (parallel.mjs:37-56).
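The three points above could be combined as in the sketch below. buildTask and every field name in the returned object are assumptions chosen for illustration; the real task shape is defined in parallel.mjs.

```javascript
// Hypothetical task builder; field names are illustrative, not the real ones.
function buildTask(generatorName, fullInput, indices, config) {
  return {
    generatorName,
    // Send only the items this chunk needs, not the whole input.
    input: indices.map(i => fullInput[i]),
    // Positions are rewritten relative to the trimmed `input` array.
    itemIndices: indices.map((_, position) => position),
    // Forward only this generator's slice of the configuration.
    config: { [generatorName]: config[generatorName] },
  };
}

const task = buildTask(
  'metadata',
  ['a', 'b', 'c', 'd'],
  [2, 3],
  { threads: 4, chunkSize: 25, metadata: { strict: true }, headings: {} }
);
console.log(task.input); // [ 'c', 'd' ]
console.log(task.itemIndices); // [ 0, 1 ]
console.log(Object.keys(task.config)); // [ 'metadata' ]
```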
Chunk Size Selection
- Smaller chunks = more parallelism but more overhead
- Larger chunks = less overhead but less parallelism
The default chunkSize: 25 balances these tradeoffs for typical documentation workloads.
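A quick way to see the tradeoff is to count how many tasks a given chunkSize produces. The chunkCount helper and the 100-item input are assumptions for the example.

```javascript
// Number of tasks produced for a given input size and chunk size.
function chunkCount(itemCount, chunkSize) {
  return Math.ceil(itemCount / chunkSize);
}

for (const size of [5, 25, 100]) {
  console.log(`chunkSize ${size}: ${chunkCount(100, size)} tasks`);
}
// chunkSize 5: 20 tasks
// chunkSize 25: 4 tasks
// chunkSize 100: 1 tasks
```

With 100 items, a chunkSize of 100 yields a single task, so additional threads have nothing to do, while a chunkSize of 5 yields 20 tasks that each pay the serialization cost.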