How It Works
The generation pipeline consists of three core stages: repository cloning, code ingestion with token optimization, and LLM generation.
1. Repository Cloning (Shallow Clone)
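A shallow clone fetches only the most recent commit instead of the full history. As a rough sketch, assuming git's standard `--depth 1` flag and a hypothetical `shallow_clone_command` helper (WhatDoc's actual invocation is not shown on this page), the command for this stage could be assembled like this:

```python
import subprocess


def shallow_clone_command(repo_url: str, dest_dir: str) -> list[str]:
    """Build a git command that fetches only the latest commit."""
    # --depth 1 truncates history to a single commit;
    # --single-branch avoids fetching other branches.
    return ["git", "clone", "--depth", "1", "--single-branch", repo_url, dest_dir]


def shallow_clone(repo_url: str, dest_dir: str) -> None:
    """Run the shallow clone, raising CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(repo_url, dest_dir), check=True)
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect without touching the network.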
WhatDoc uses a shallow clone, which skips the repository's commit history, to minimize disk usage and clone time.
2. Code Ingestion & Token Optimization
The engine walks through your repository and extracts source files while applying three optimization layers: a file blacklist, code minification, and context window management.
Fat-Trimmer Blacklist
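As an illustration of this kind of filter, here is a minimal sketch of a suffix-based blacklist that uses only the file patterns named on this page. The `is_blacklisted` and `filter_files` helpers and the suffix tuple are assumptions for illustration, not WhatDoc's actual filter list:

```python
# Suffixes named on this page; the real blacklist may contain more entries.
BLACKLISTED_SUFFIXES = (".test.js", ".spec.ts", ".min.js", ".bundle.js")


def is_blacklisted(path: str) -> bool:
    """Return True if the file should be skipped before ingestion."""
    # str.endswith accepts a tuple and matches any of its suffixes.
    return path.lower().endswith(BLACKLISTED_SUFFIXES)


def filter_files(paths: list[str]) -> list[str]:
    """Keep only the files that are not blacklisted, preserving order."""
    return [p for p in paths if not is_blacklisted(p)]
```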
WhatDoc automatically filters out files that waste tokens without providing documentation value. Test files (.test.js, .spec.ts) and minified bundles (.min.js, .bundle.js) are also automatically excluded.
Regex Guillotine: Code Minification
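A minimal sketch of regex-based minification. Exactly what WhatDoc strips is not listed on this page, so the rules here (comment-only `//` lines, trailing whitespace, blank lines) and the `strip_noise` name are assumptions for illustration:

```python
import re

# Comment-only // lines (optionally indented). Inline comments are left alone
# so that strings such as "http://example.com" are never damaged.
LINE_COMMENT = re.compile(r"^[ \t]*//.*$", re.MULTILINE)
TRAILING_SPACE = re.compile(r"[ \t]+$", re.MULTILINE)
BLANK_RUNS = re.compile(r"\n{2,}")


def strip_noise(source: str) -> str:
    """Remove comment-only lines, trailing whitespace, and blank lines."""
    source = LINE_COMMENT.sub("", source)
    source = TRAILING_SPACE.sub("", source)
    source = BLANK_RUNS.sub("\n", source)
    return source.strip("\n")
```

This is a heuristic rather than a parser: a line inside a multi-line string that happens to start with `//` would also be removed.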
Before sending code to the LLM, WhatDoc strips noise that doesn’t contribute to understanding.
Context Window Management
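One way to picture the payload is a single string in which each file is preceded by a marker line. The `--- FILE: path ---` marker, the `build_payload` helper, and the four-characters-per-token estimate below are all hypothetical; this page does not show WhatDoc's real boundary format or token accounting:

```python
def build_payload(files: dict[str, str]) -> str:
    """Join file contents into one string, each preceded by a boundary line."""
    sections = []
    for path in sorted(files):  # sorted so the payload is deterministic
        sections.append(f"--- FILE: {path} ---\n{files[path].rstrip()}\n")
    return "\n".join(sections)


def estimate_tokens(payload: str) -> int:
    """Very rough size check, assuming about 4 characters per token."""
    return len(payload) // 4
```

An explicit boundary line lets the model attribute each snippet to its source file, and a cheap size estimate makes it possible to check a payload against a context budget before sending it.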
WhatDoc concatenates all whitelisted files into a single payload with clear file boundaries.
3. LLM Generation with Paradigm-Aware Prompting
The concatenated codebase is sent to Google Gemini 2.5 Flash with a specialized system prompt that:
- Detects the repository paradigm (REST API, Frontend App, CLI Tool, SDK/Library)
- Adapts documentation style based on the detected type
- Generates two documents: a README and a TECHNICAL_REFERENCE
- Enforces strict markdown quality rules (syntax highlighting, proper heading hierarchy, GitHub-flavored alerts)
Adaptive Documentation Strategy
The AI automatically adjusts its output based on what it finds.
For Backend/API Projects
Focuses on:
- Endpoint documentation (HTTP method, path, auth)
- Database models and schemas
- Authentication flows
- Request/response examples with real schemas
- Interactive API playground blocks (see API Playground)
For Frontend Projects
Focuses on:
- Component architecture
- State management patterns
- Routing structure
- Props/hooks documentation
- UI component trees
For Libraries/SDKs
Focuses on:
- Exported functions and classes
- Method signatures
- Usage examples
- Installation instructions
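The three strategies above can be summarized as a lookup from detected paradigm to focus areas. This is only an illustrative data structure built from the lists on this page: the `FOCUS_AREAS` name, the paradigm keys, and `focus_for` are hypothetical, and no focus list is given here for CLI tools.

```python
# Focus areas copied from the lists on this page; the keys are made-up identifiers.
FOCUS_AREAS: dict[str, list[str]] = {
    "backend_api": [
        "Endpoint documentation (HTTP method, path, auth)",
        "Database models and schemas",
        "Authentication flows",
        "Request/response examples with real schemas",
        "Interactive API playground blocks",
    ],
    "frontend": [
        "Component architecture",
        "State management patterns",
        "Routing structure",
        "Props/hooks documentation",
        "UI component trees",
    ],
    "library_sdk": [
        "Exported functions and classes",
        "Method signatures",
        "Usage examples",
        "Installation instructions",
    ],
}


def focus_for(paradigm: str) -> list[str]:
    """Return the focus areas for a paradigm, or an empty list if unknown."""
    return FOCUS_AREAS.get(paradigm, [])
```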
Retry Logic & Rate Limiting
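Exponential backoff means waiting longer after each failed attempt, typically doubling the delay. A minimal sketch, assuming a hypothetical `RateLimitError` and illustrative defaults (five attempts, one-second base delay, up to 10% jitter); WhatDoc's actual retry parameters are not stated on this page:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when the API reports a rate limit."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delays grow as base, 2*base, 4*base, ... plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be exercised without actually waiting.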
WhatDoc retries rate-limited requests with exponential backoff.
Bring Your Own Key (BYOK)
Pro users can connect their own Google Gemini API key to bypass rate limits and access higher-tier models.
Supported Languages
WhatDoc can analyze and document projects in:
- JavaScript, TypeScript, JSX, TSX
- Python
- Java, Kotlin, Scala
- C, C++, C#
- Go, Rust, Ruby, PHP, Swift
- Configuration files (JSON, YAML, Dockerfile, Makefile)
Real-Time Progress Streaming
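A sketch of what stage-level progress events might look like, modeled as a Python generator. The event fields (`stage`, `status`) and the stage identifiers are hypothetical, chosen to mirror the three pipeline stages; the real event schema and transport are not described on this page:

```python
from typing import Callable, Iterator

# Identifiers mirror the three pipeline stages; the names themselves are made up.
STAGES = ("clone", "ingest", "generate")


def run_with_progress(
    steps: dict[str, Callable[[], None]],
) -> Iterator[dict[str, str]]:
    """Run each stage in order, yielding an event before and after it."""
    for stage in STAGES:
        yield {"stage": stage, "status": "started"}
        steps[stage]()  # do the actual work for this stage
        yield {"stage": stage, "status": "completed"}
```

A caller would iterate over `run_with_progress(steps)` and forward each event to the client as it arrives, so progress is visible while later stages are still running.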
The engine emits real-time events during generation.
Performance Benchmarks
| Repository Size | Files Analyzed | Generation Time | Token Count |
|---|---|---|---|
| Small (< 50 files) | 42 | 12s | ~45k tokens |
| Medium (50-200 files) | 156 | 28s | ~120k tokens |
| Large (200+ files) | 287 | 45s | ~200k tokens |
Times are averages using Gemini 2.5 Flash. Pro models (Gemini 2.5 Pro) may take longer but produce higher-quality output.
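The table can be turned into per-file and per-second figures with simple arithmetic. The sketch below only derives ratios from the numbers above, treating the approximate token counts as exact; the `BENCHMARKS` structure and `derived_metrics` helper are illustrative assumptions:

```python
# (files analyzed, generation time in seconds, approximate tokens) from the table.
BENCHMARKS = {
    "small": (42, 12, 45_000),
    "medium": (156, 28, 120_000),
    "large": (287, 45, 200_000),
}


def derived_metrics(files: int, seconds: int, tokens: int) -> tuple[float, float]:
    """Return (tokens per file, tokens per second)."""
    return tokens / files, tokens / seconds


for size, row in BENCHMARKS.items():
    per_file, per_second = derived_metrics(*row)
    print(f"{size}: {per_file:.0f} tokens/file, {per_second:.0f} tokens/s")
```

By these ratios, tokens per file fall from roughly 1,070 to roughly 700 as repositories grow, while throughput rises from 3,750 to about 4,440 tokens per second.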
Next Steps
- Templates: Explore 14+ professional documentation templates
- Live Editor: Edit generated docs with the rich markdown editor
