docbot index

The docbot index command scans your documentation and codebase, creates embeddings, and stores them in Qdrant for fast semantic search. It tracks file changes and re-indexes only files that are new or modified.

Basic usage

docbot index --docs ./docs

This command will:

Connect to your Qdrant instance
Scan documentation files for changes
Create embeddings for new or modified files
Update the embedding manifest with file hashes
Index codebase files if --codebase is provided or paths.codebase is set in your config file

The AI_GATEWAY_API_KEY environment variable is required for indexing. Docbot uses this to generate embeddings through your AI gateway.

Options

--docs

string

Path to the documentation directory. Can also be set via paths.docs in your config file. If both are omitted, the command fails with an error.

--codebase

string

Comma-separated paths or globs to codebase directories. Falls back to paths.codebase in your config file. If neither is set, only documentation is indexed.

Examples: apps/helm,packages/* or src/**

--config

string

Path to the Docbot config file. Defaults to docbot.config.jsonc in your project root. Alias: -c

--qdrant-url

string

Qdrant server URL. Overrides the URL in your config file. Default: http://127.0.0.1:6333

--force

boolean

default: false

Force a full re-index, ignoring the manifest. This re-creates embeddings for all files even if they haven’t changed. Alias: -f
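
To make the --codebase format concrete, here is a minimal Python sketch of how a comma-separated list of paths and globs could be expanded into directories. The helper name expand_codebase_paths and its matching rules are assumptions for illustration; Docbot's own resolution logic is not described on this page and may differ.

```python
import glob
import os


def expand_codebase_paths(value, root="."):
    """Expand a comma-separated list of paths/globs into matching directories.

    Hypothetical helper: it illustrates the flag format, not Docbot's real logic.
    """
    matches = set()
    for pattern in value.split(","):
        pattern = pattern.strip()
        if not pattern:
            continue
        # recursive=True lets "**" match nested directories
        for hit in glob.glob(os.path.join(root, pattern), recursive=True):
            if os.path.isdir(hit):
                matches.add(os.path.normpath(hit))
    return sorted(matches)
```

With a layout containing apps/helm, packages/a, and packages/b, the value apps/helm,packages/* would resolve to those three directories under this sketch; plain files matched by a glob are ignored.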

Examples

docbot index --docs ./docs

How indexing works

Incremental indexing

Docbot maintains an embedding manifest at .docbot/manifest.json that tracks file hashes. On each run:

Scans files - Walks the documentation and codebase directories
Computes hashes - Calculates content hashes for each file
Detects changes - Compares hashes against the manifest
Syncs embeddings - Only processes files that are new, changed, or removed

Because unchanged files are skipped, subsequent indexing runs are much faster than the first.
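
The hashing step can be sketched roughly as follows. This is an illustration only: the choice of SHA-256 and the flat path-to-hash mapping are assumptions, since this page does not specify the hash algorithm or the layout of .docbot/manifest.json.

```python
import hashlib
from pathlib import Path


def hash_files(root):
    """Map each file under root (relative POSIX path) to a content hash.

    Assumption: SHA-256 over the raw bytes; Docbot may use a different hash.
    """
    base = Path(root)
    hashes = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[path.relative_to(base).as_posix()] = digest
    return hashes
```

Because the hash depends only on file content, touching a file without editing it would not trigger a re-index under this scheme.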

File categorization

During scanning, files are categorized as:

Added - New files not in the manifest
Changed - Files with different content hashes
Removed - Files in manifest but no longer on disk
Unchanged - Files with matching hashes (skipped)
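
The four categories amount to a comparison between two path-to-hash mappings. The sketch below assumes both the stored manifest and the fresh scan are plain dictionaries of that shape; the function name categorize is hypothetical.

```python
def categorize(manifest, current):
    """Compare stored hashes (manifest) with freshly computed ones (current)."""
    return {
        "added": sorted(p for p in current if p not in manifest),
        "changed": sorted(
            p for p in current if p in manifest and current[p] != manifest[p]
        ),
        "removed": sorted(p for p in manifest if p not in current),
        "unchanged": sorted(
            p for p in current if p in manifest and current[p] == manifest[p]
        ),
    }


manifest = {"guide.md": "aaa", "api.md": "bbb", "old.md": "ccc"}
current = {"guide.md": "aaa", "api.md": "bbb2", "new.md": "ddd"}
print(categorize(manifest, current))
# {'added': ['new.md'], 'changed': ['api.md'], 'removed': ['old.md'], 'unchanged': ['guide.md']}
```

Only the added, changed, and removed lists would need any work in the sync phase; the unchanged list is skipped.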

Chunking strategy

Docbot splits files into chunks for better embedding quality:

Documentation - Split by headings and semantic boundaries
Code - Split by function/class definitions and logical blocks

Each chunk is embedded separately and stored with metadata (path, section, line numbers).
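
As a rough illustration of heading-based chunking for documentation, the sketch below splits Markdown text at ATX headings and records the path, section, and line range for each chunk. It is a simplified stand-in, not Docbot's chunker: the Chunk fields and the chunk_by_headings name are assumptions, and it does not handle semantic boundaries or "#" lines inside fenced code.

```python
import re
from dataclasses import dataclass

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


@dataclass
class Chunk:
    path: str
    section: str     # heading text, or "" for text before the first heading
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


def chunk_by_headings(path, content):
    lines = content.splitlines()
    # 0-based indexes of lines that start a new section
    starts = [i for i, line in enumerate(lines) if HEADING.match(line)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # text before the first heading forms its own chunk
    chunks = []
    for begin, end in zip(starts, starts[1:] + [len(lines)]):
        text = "\n".join(lines[begin:end]).strip()
        if not text:
            continue
        match = HEADING.match(lines[begin])
        section = match.group(1).strip() if match else ""
        chunks.append(Chunk(path, section, begin + 1, end, text))
    return chunks
```

Storing the line range alongside each chunk is what lets a search result point back to an exact location in the source file.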

Expected output

First-time indexing

initializing docbot...
  docs: /path/to/docs
  project: my-project
  codebase paths:
    - /path/to/src
connecting to qdrant...
scanning documentation...
  docs: 45 new, 0 changed, 0 removed, 0 unchanged (scanned in 0.8s)
syncing documentation embeddings...
  ✓ synced 342 chunks
scanning codebase...
  code: 128 new, 0 changed, 0 removed, 0 unchanged (scanned in 1.2s)
syncing codebase embeddings...
  ✓ synced 867 chunks
done

Incremental update

initializing docbot...
  docs: /path/to/docs
  project: my-project
connecting to qdrant...
scanning documentation...
  docs: 2 new, 3 changed, 1 removed, 39 unchanged (scanned in 0.3s)
syncing documentation embeddings...
  ✓ synced 18 chunks
scanning codebase...
  code: 127 files unchanged (scanned in 0.4s)
done

No changes detected

initializing docbot...
  docs: /path/to/docs
  project: my-project
connecting to qdrant...
scanning documentation...
  docs: 45 files unchanged (scanned in 0.2s)
scanning codebase...
  code: 128 files unchanged (scanned in 0.3s)
done
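
If you want to read these scan summaries from a script (for example in CI), the two line shapes shown in the samples above can be parsed with a pair of regular expressions. This is a sketch based only on the sample output on this page; the function name parse_scan_line is hypothetical, and the real output format may change between versions.

```python
import re
from typing import Optional

# Patterns mirror the sample output on this page; treat them as assumptions.
FULL = re.compile(
    r"^\s*(docs|code): (\d+) new, (\d+) changed, (\d+) removed, "
    r"(\d+) unchanged \(scanned in ([\d.]+)s\)$"
)
SHORT = re.compile(r"^\s*(docs|code): (\d+) files unchanged \(scanned in ([\d.]+)s\)$")


def parse_scan_line(line: str) -> Optional[dict]:
    """Return counts from a scan summary line, or None for any other line."""
    match = FULL.match(line)
    if match:
        kind, new, changed, removed, unchanged, seconds = match.groups()
        return {
            "kind": kind,
            "new": int(new),
            "changed": int(changed),
            "removed": int(removed),
            "unchanged": int(unchanged),
            "seconds": float(seconds),
        }
    match = SHORT.match(line)
    if match:
        kind, unchanged, seconds = match.groups()
        return {
            "kind": kind,
            "new": 0,
            "changed": 0,
            "removed": 0,
            "unchanged": int(unchanged),
            "seconds": float(seconds),
        }
    return None
```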

Run docbot index regularly to keep embeddings up to date. It’s fast on subsequent runs thanks to incremental indexing.

Configuration

You can configure paths in docbot.config.jsonc to avoid passing flags:

{
  "projectSlug": "my-project",
  "paths": {
    "docs": "./docs",
    "codebase": ["./src", "./apps/*"]
  },
  "qdrant": {
    "url": "http://127.0.0.1:6333",
    "collections": {
      "docs": "my-project-docs",
      "code": "my-project-code"
    }
  }
}

Then simply run:

docbot index
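
The precedence between flags and config described on this page (flags first, then the config file, then the default Qdrant URL) can be sketched as follows. The resolve_settings helper and the dictionary shapes are hypothetical; the sketch only mirrors the documented behavior.

```python
DEFAULT_QDRANT_URL = "http://127.0.0.1:6333"


def resolve_settings(flags, config):
    """Merge CLI flags over config values (illustrative, not Docbot's code)."""
    paths = config.get("paths", {})

    docs = flags.get("docs") or paths.get("docs")
    if not docs:
        # Same message the CLI reports when neither source provides a docs path
        raise ValueError(
            "docs path is required (provide --docs or set paths.docs in config)"
        )

    if flags.get("codebase"):
        # --codebase is a comma-separated string on the command line
        codebase = [p.strip() for p in flags["codebase"].split(",") if p.strip()]
    else:
        codebase = list(paths.get("codebase", []))

    qdrant_url = (
        flags.get("qdrant_url")
        or config.get("qdrant", {}).get("url")
        or DEFAULT_QDRANT_URL
    )
    return {"docs": docs, "codebase": codebase, "qdrant_url": qdrant_url}
```

With the config above and no flags, this returns the docs path ./docs, the two codebase entries, and the configured Qdrant URL.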

Monitoring progress

The indexing process shows real-time progress:

Scanning phase - Shows file counts and scan duration
Syncing phase - Shows chunk counts as embeddings are created
Manifest saves - Progress is saved to the manifest periodically, and automatically on SIGINT/SIGTERM

If you interrupt indexing with Ctrl+C, the manifest is saved automatically. Rerunning the command will resume from where it left off.
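
The save-on-interrupt behavior can be pictured with the following sketch. It is an assumption-laden illustration rather than Docbot's implementation: the helper names, the JSON layout, and the atomic temp-file write are all choices made for the example.

```python
import json
import os
import signal
import sys
import tempfile


def save_manifest(manifest, path):
    """Write the manifest via a temp file so an interrupted write never leaves a partial file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def install_save_on_interrupt(manifest, path):
    """Save progress when the process receives SIGINT (Ctrl+C) or SIGTERM."""

    def handler(signum, frame):
        save_manifest(manifest, path)
        sys.exit(128 + signum)  # conventional exit code for a signal

    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
```

Files already recorded in the saved manifest would then be categorized as unchanged on the next run, which is what makes resuming possible.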

Performance considerations

Large codebases

For projects with thousands of files:

Use specific paths - Instead of indexing the entire repo, target specific directories:
```
docbot index --docs ./docs --codebase "./src,./lib"
```
Exclude build artifacts - Ensure your codebase paths don’t include node_modules, dist, or other generated directories
First run takes time - Initial indexing can take several minutes for large projects. Subsequent runs are much faster.

Embedding costs

Each chunk generates an API call to create embeddings. To minimize costs:

Use incremental indexing (don’t use --force unnecessarily)
Be selective with codebase paths
Focus on documentation and source directories only
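
As a worked example, the chunk counts from the sample output earlier on this page give a sense of the difference between a first run and an incremental run, assuming one embedding request per synced chunk as stated above. The helper name embedding_calls is hypothetical.

```python
def embedding_calls(chunk_counts):
    """Total embedding requests, assuming one request per synced chunk."""
    return sum(chunk_counts)


first_run = embedding_calls([342, 867])  # docs + code chunks, first-time sample
incremental = embedding_calls([18])      # chunks synced in the incremental sample

print(first_run)                         # 1209
print(incremental)                       # 18
print(f"{incremental / first_run:.1%}")  # 1.5%
```

A --force run ignores the manifest, so under the same assumption it would cost roughly as much as the first run.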

Troubleshooting

Error: AI_GATEWAY_API_KEY environment variable is required

error: AI_GATEWAY_API_KEY environment variable is required

Set your API key:

export AI_GATEWAY_API_KEY="your-api-key"
docbot index --docs ./docs

Error: docs path is required

error: docs path is required (provide --docs or set paths.docs in config)

Either pass --docs or configure paths.docs in your config file.

Connection errors

If Qdrant isn’t accessible:

connecting to qdrant...
Error: connect ECONNREFUSED 127.0.0.1:6333

Verify Qdrant is running:

curl http://127.0.0.1:6333/healthz

Or start it with Docker:

docker run -d --name docbot-qdrant -p 6333:6333 -v "$(pwd)/.docbot/qdrant_storage:/qdrant/storage" qdrant/qdrant
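
If you prefer to check connectivity from a script before running docbot index, a plain TCP probe is enough to distinguish "nothing is listening" (the ECONNREFUSED case above) from other failures. The function name qdrant_reachable is hypothetical, and the probe only confirms that the port accepts connections, not that Qdrant itself is healthy.

```python
import socket
from urllib.parse import urlparse


def qdrant_reachable(url="http://127.0.0.1:6333", timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "127.0.0.1"
    # Fall back to the scheme's default port when the URL has none
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures
        return False
```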

Slow indexing

If indexing is slower than expected:

Check network latency - Embedding API calls depend on network speed
Verify file counts - Ensure you’re not accidentally indexing node_modules or other large directories
Monitor Qdrant - Check Qdrant logs for performance issues

Next steps

After indexing:

Test search - Use docbot search "query" to verify embeddings
Run tasks - Start the agent with docbot run "task"
Keep updated - Re-run indexing when documentation changes

See the Search command for querying your indexed documentation.
