Configuration

The Docs MCP Server uses a unified configuration system that aggregates settings from multiple sources, validating them against a strict schema.

Configuration File

By default, configuration is stored in your system’s preferences directory. On macOS, for example:

~/Library/Preferences/docs-mcp-server/config.yaml

Example Config File

config.yaml

app:
  storePath: ~/.docs-mcp-server
  telemetryEnabled: true
  embeddingModel: text-embedding-3-small

scraper:
  maxPages: 1000
  maxDepth: 3
  document:
    maxSize: 10485760  # 10MB

splitter:
  preferredChunkSize: 1500
  maxChunkSize: 5000

On startup, the server automatically updates this file so that it includes any new default settings.

Using a Custom Config File

You can specify a custom config file with --config or DOCS_MCP_CONFIG:

docs-mcp-server --config /path/to/config.yaml

Explicit config files are treated as read-only. The server will not modify them.
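
The lookup described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the function name `resolve_config_path` is hypothetical, and checking `--config` before `DOCS_MCP_CONFIG` is an assumption based on CLI arguments having the highest priority.

```python
import os
from pathlib import Path

# Default location quoted in this section (a macOS-style preferences path).
DEFAULT_CONFIG = Path("~/Library/Preferences/docs-mcp-server/config.yaml")


def resolve_config_path(cli_config=None, env=None):
    """Return (path, read_only) for the config file that would be loaded.

    Assumption: the --config flag is checked before DOCS_MCP_CONFIG.
    """
    env = os.environ if env is None else env
    explicit = cli_config or env.get("DOCS_MCP_CONFIG")
    if explicit:
        # Explicit files are treated as read-only: never written back to.
        return Path(explicit).expanduser(), True
    return DEFAULT_CONFIG.expanduser(), False
```

The second element of the returned tuple captures the read-only rule: only the default file is eligible for automatic updates.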

Configuration Priority

Configuration values are merged from multiple sources, with later sources taking precedence:

1. Defaults (lowest priority)
2. Config File
3. Environment Variables
4. CLI Arguments (highest priority)
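
One way to picture this layering is a deep merge applied in priority order. The sketch below is illustrative only: `deep_merge` and `resolve` are hypothetical helpers, and the sample dictionaries stand in for values that would already have been parsed from each source.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where values in `override` replace those in `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them whole.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(*sources):
    """Merge sources in order; later sources take precedence."""
    result = {}
    for source in sources:
        result = deep_merge(result, source)
    return result


defaults = {"scraper": {"maxPages": 1000, "maxDepth": 3}}
config_file = {"scraper": {"maxPages": 500}}
env_vars = {"scraper": {"maxPages": 2000}}
cli_args = {"scraper": {"maxDepth": 5}}

settings = resolve(defaults, config_file, env_vars, cli_args)
print(settings)  # {'scraper': {'maxPages': 2000, 'maxDepth': 5}}
```

Here the environment value for `maxPages` overrides the config file, while the CLI value for `maxDepth` overrides the default.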

Environment Variables

Any configuration setting can be overridden via environment variables using the naming convention:

DOCS_MCP_<SECTION>_<SETTING>

Rules:

Convert camelCase to UPPER_SNAKE_CASE
Join nested paths with underscores
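
The two rules can be expressed as a small conversion function. This is a sketch of the naming convention only; `env_var_name` is a hypothetical helper, not part of the server.

```python
import re


def env_var_name(setting_path, prefix="DOCS_MCP"):
    """Map a dotted setting path such as 'scraper.maxPages' to its env var name."""
    parts = []
    for segment in setting_path.split("."):
        # Insert an underscore at each lower-to-upper boundary: maxPages -> max_Pages.
        snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", segment)
        parts.append(snake.upper())
    # Nested path segments are joined with underscores after the prefix.
    return "_".join([prefix, *parts])


print(env_var_name("scraper.maxPages"))          # DOCS_MCP_SCRAPER_MAX_PAGES
print(env_var_name("scraper.document.maxSize"))  # DOCS_MCP_SCRAPER_DOCUMENT_MAX_SIZE
```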

Common Environment Variables

# Override scraper settings
export DOCS_MCP_SCRAPER_MAX_PAGES=2000
export DOCS_MCP_SCRAPER_DOCUMENT_MAX_SIZE=52428800

Legacy Aliases

For convenience, some settings can also be set through shorter legacy environment variable names:

Setting	Alias
`server.ports.default`	`PORT`
`server.host`	`HOST`

CLI Arguments

Common settings have dedicated CLI flags:

docs-mcp-server --port 8080 --host 0.0.0.0
docs-mcp-server --store-path /data/docs --read-only

CLI Configuration Commands

Manage configuration directly from the command line:

# View current configuration (JSON format)
docs-mcp-server config

# View current configuration (YAML format)
docs-mcp-server config --yaml

# Get a specific value
docs-mcp-server config get scraper.maxPages
# Output: 1000

# Get a nested object
docs-mcp-server config get scraper.fetcher
# Output: { "maxRetries": 6, ... }

# Set a value (persists to config file)
docs-mcp-server config set scraper.maxPages 500
# Output: Updated scraper.maxPages = 500

config set only modifies the system default configuration file. If you specify --config, that file is treated as read-only and config set will not write to it.
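
The dotted keys used by config get and config set address values in the nested configuration. A minimal sketch of that addressing scheme follows, assuming the configuration is held as nested dictionaries; `get_by_path` and `set_by_path` are hypothetical names, and the real command additionally validates values against the schema.

```python
def get_by_path(config, dotted_path):
    """Walk a nested dict following a dotted path such as 'scraper.fetcher'."""
    node = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"Unknown setting: {dotted_path}")
        node = node[key]
    return node


def set_by_path(config, dotted_path, value):
    """Set a nested value, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


config = {"scraper": {"maxPages": 1000, "fetcher": {"maxRetries": 6, "baseDelayMs": 1000}}}
print(get_by_path(config, "scraper.maxPages"))  # 1000
set_by_path(config, "scraper.maxPages", 500)
print(get_by_path(config, "scraper.maxPages"))  # 500
```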

Configuration Reference

App Settings

General application settings.

app.storePath

string

default:"~/.docs-mcp-server"

Directory for storing databases and logs.

app.telemetryEnabled

boolean

default:"true"

Enable anonymous usage telemetry.

app.readOnly

boolean

default:"false"

Prevent modification of data (scraping/indexing).

app.embeddingModel

string

default:"text-embedding-3-small"

Model to use for vector embeddings. Format: provider:model_name or just model_name for OpenAI. Examples:

openai:text-embedding-3-small (default)
vertex:text-embedding-004 (Google Cloud Vertex AI)
gemini:gemini-embedding-exp-03-07 (Google Generative AI)
aws:amazon.titan-embed-text-v1
microsoft:text-embedding-ada-002
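
The provider:model_name format can be parsed by splitting on the first colon. The sketch below is illustrative; `parse_embedding_model` is a hypothetical helper, and the fallback to `openai` reflects the "just model_name for OpenAI" rule stated above.

```python
def parse_embedding_model(spec, default_provider="openai"):
    """Split 'provider:model_name' into (provider, model_name)."""
    provider, sep, model = spec.partition(":")
    if not sep:
        # No provider prefix: the whole string is the model name.
        return default_provider, spec
    if not provider or not model:
        raise ValueError(f"Invalid embedding model spec: {spec!r}")
    return provider, model


print(parse_embedding_model("text-embedding-3-small"))
# ('openai', 'text-embedding-3-small')
print(parse_embedding_model("aws:amazon.titan-embed-text-v1"))
# ('aws', 'amazon.titan-embed-text-v1')
```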

Server Settings

Settings for the API and MCP servers.

server.protocol

string

default:"auto"

Server protocol: stdio, http, or auto.

server.host

string

default:"127.0.0.1"

Host interface to bind to.

server.heartbeatMs

number

default:"30000"

MCP protocol heartbeat interval in milliseconds.

server.ports.default

number

default:"6280"

Default port for the main server.

server.ports.worker

number

default:"8080"

Port for the background worker service.

server.ports.mcp

number

default:"6280"

Port for the specific MCP interface.

server.ports.web

number

default:"6281"

Port for the web dashboard.

Authentication Settings

Security settings for the HTTP server.

auth.enabled

boolean

default:"false"

Enable JWT authentication.

auth.issuerUrl

string

OIDC Issuer URL (e.g., Clerk, Auth0).

auth.audience

string

Expected JWT audience claim.

Scraper Settings

Settings controlling the web scraping behavior.

scraper.maxPages

number

default:"1000"

Maximum number of pages to crawl per job.

scraper.maxDepth

number

default:"3"

Maximum link depth to traverse.

scraper.maxConcurrency

number

default:"3"

Number of concurrent page fetches.

scraper.pageTimeoutMs

number

default:"5000"

Timeout for a single page load in milliseconds.

scraper.browserTimeoutMs

number

default:"30000"

Timeout for the browser instance in milliseconds.

scraper.fetcher.maxRetries

number

default:"6"

Number of retries for failed requests.

scraper.fetcher.baseDelayMs

number

default:"1000"

Initial delay for exponential backoff in milliseconds.
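
To get a feel for how scraper.fetcher.maxRetries and scraper.fetcher.baseDelayMs interact, the sketch below lists the wait before each retry. The doubling factor is an assumption (this section only says the backoff is exponential), jitter is ignored, and `backoff_delays_ms` is a hypothetical helper.

```python
def backoff_delays_ms(max_retries=6, base_delay_ms=1000, factor=2):
    """Delay before each retry: base, base*factor, base*factor**2, ...

    Assumption: delays grow by a constant factor of 2 with no jitter.
    """
    return [base_delay_ms * factor**attempt for attempt in range(max_retries)]


delays = backoff_delays_ms()
print(delays)       # [1000, 2000, 4000, 8000, 16000, 32000]
print(sum(delays))  # 63000
```

Under these assumptions, a request that fails every time would spend about 63 seconds waiting with the default settings, which is worth keeping in mind when raising maxRetries.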

scraper.document.maxSize

number

default:"10485760"

Maximum size in bytes for PDF/Office documents (10MB default).

Scraper settings are often overridden per-job via CLI arguments like --max-pages.

GitHub Authentication

Environment variables for authenticating with GitHub when scraping private repositories.

GITHUB_TOKEN

string

GitHub personal access token (classic or fine-grained). Used for private repo access and higher rate limits.

GH_TOKEN

string

Alternative to GITHUB_TOKEN. Used if GITHUB_TOKEN is not set.

Authentication Resolution Order:

Explicit Authorization header passed in scraper options
GITHUB_TOKEN environment variable
GH_TOKEN environment variable
Local gh CLI authentication (via gh auth token)

If no authentication is available, public repositories are still accessible but with lower rate limits (60 requests/hour vs 5,000 authenticated).
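
The resolution order above can be sketched as a simple fallback chain. This is an illustration rather than the server's implementation: `resolve_github_auth` and `gh_cli_token` are hypothetical names, and sending the token with the `Bearer` scheme is an assumption about how the header is built.

```python
import os
import subprocess


def gh_cli_token():
    """Ask the local gh CLI for a token; return None if gh is missing or logged out."""
    try:
        result = subprocess.run(
            ["gh", "auth", "token"], capture_output=True, text=True, timeout=5
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    token = result.stdout.strip()
    return token if result.returncode == 0 and token else None


def resolve_github_auth(headers=None, env=None, cli_lookup=gh_cli_token):
    """Return an Authorization header value, or None for unauthenticated access."""
    headers = headers or {}
    env = os.environ if env is None else env
    # 1. An explicit header passed in the scraper options wins.
    if headers.get("Authorization"):
        return headers["Authorization"]
    # 2-4. GITHUB_TOKEN, then GH_TOKEN, then the gh CLI.
    token = env.get("GITHUB_TOKEN") or env.get("GH_TOKEN") or cli_lookup()
    return f"Bearer {token}" if token else None
```

A return value of None corresponds to the unauthenticated case, where public repositories remain reachable at the lower rate limit.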

Splitter Settings

Settings for chunking text for vector search.

splitter.minChunkSize

number

default:"500"

Minimum characters per chunk body. Chunks below this threshold are merged with adjacent chunks by the greedy optimizer.

splitter.preferredChunkSize

number

default:"1500"

Soft target for chunk body size in characters. The greedy optimizer starts a new chunk when combining two chunks would exceed this value.

splitter.maxChunkSize

number

default:"5000"

Hard upper limit for chunk body size in characters. No chunk body will exceed this value.

These size limits apply to the text body of each chunk. Before embedding, a small metadata header (page title, URL, section path) is prepended to each chunk. If your embedding model has a small context window, consider lowering maxChunkSize.
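
The interaction of the three limits can be illustrated with a simplified greedy merge. This is a sketch under stated assumptions, not the server's optimizer: `merge_chunks` is a hypothetical function, input pieces are assumed to already fit within maxChunkSize, pieces are joined with a newline, and letting an undersized chunk grow past the preferred size is one possible reading of the rules above.

```python
def merge_chunks(pieces, min_size=500, preferred_size=1500, max_size=5000):
    """Greedily merge adjacent text pieces into chunk bodies.

    Assumes every input piece is already at most max_size characters.
    """
    chunks = []
    current = ""
    for piece in pieces:
        if not current:
            current = piece
            continue
        combined = current + "\n" + piece
        undersized = len(current) < min_size
        if len(combined) <= preferred_size:
            # Still within the soft target: keep merging.
            current = combined
        elif undersized and len(combined) <= max_size:
            # Too small to stand alone: merge even past the soft target,
            # but never past the hard limit.
            current = combined
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


pieces = ["a" * 300, "b" * 1400, "c" * 900, "d" * 700]
print([len(chunk) for chunk in merge_chunks(pieces)])  # [1701, 900, 700]
```

In this run the 300-character piece is below the minimum, so it is merged with its neighbour even though the result exceeds the preferred size; the remaining pieces stay separate because combining them would exceed 1500 characters.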

Embedding Settings

Settings for vector embedding generation.

embeddings.batchSize

number

default:"100"

Number of chunks to embed in one request.

embeddings.vectorDimension

number

default:"1536"

Dimension of the vector space (must match your embedding model).
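
The two embedding settings can be pictured as a batching step followed by a dimension check. Both helpers below are hypothetical and only illustrate the idea; how the server actually calls the embedding provider is not shown here.

```python
def batched(items, batch_size=100):
    """Yield consecutive slices of `items`, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def check_dimension(vector, expected=1536):
    """Raise if an embedding does not match the configured vector dimension."""
    if len(vector) != expected:
        raise ValueError(f"Expected {expected} dimensions, got {len(vector)}")
    return vector


chunks = [f"chunk {i}" for i in range(250)]
print([len(batch) for batch in batched(chunks)])  # [100, 100, 50]
```

With the default batchSize of 100, 250 chunks would be sent in three requests.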

Database Settings

Internal database settings.

db.migrationMaxRetries

number

default:"5"

Retries for database migrations on startup.

Assembly Settings

Settings for reassembling search results.

assembly.maxChunkDistance

number

default:"3"

Maximum difference in sort_order between two chunks for them to be merged.

assembly.maxParentChainDepth

number

default:"10"

Maximum depth for parent context traversal.

assembly.childLimit

number

default:"3"

Maximum number of child chunks to include.

assembly.precedingSiblingsLimit

number

default:"1"

Number of preceding sibling chunks to include.

assembly.subsequentSiblingsLimit

number

default:"2"

Number of subsequent sibling chunks to include.
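
The sketch below shows one plausible reading of assembly.maxChunkDistance and the two sibling limits. Both functions are hypothetical, and the exact semantics (for example, whether distance is measured between neighbouring matches) are assumptions made for illustration.

```python
def group_by_distance(sort_orders, max_distance=3):
    """Group sort_order values whose neighbours are at most max_distance apart."""
    groups = []
    for order in sorted(sort_orders):
        if groups and order - groups[-1][-1] <= max_distance:
            groups[-1].append(order)
        else:
            groups.append([order])
    return groups


def sibling_window(index, sibling_count, preceding=1, subsequent=2):
    """Indices of the siblings kept around a matched chunk at `index`."""
    start = max(0, index - preceding)
    end = min(sibling_count, index + subsequent + 1)
    return list(range(start, end))


print(group_by_distance([2, 4, 9, 11, 20]))  # [[2, 4], [9, 11], [20]]
print(sibling_window(5, 10))                 # [4, 5, 6, 7]
```

With the defaults, matches at positions 2 and 4 would be assembled into one result, and a match at index 5 would bring along one preceding and two subsequent siblings.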

Provider-Specific Configuration

For detailed configuration of embedding providers (OpenAI, Ollama, Gemini, Azure, AWS), see the Embedding Models guide.
