SDL-MCP’s semantic engine powers embedding-based symbol search and LLM-generated natural-language summaries. This guide walks through selecting a quality tier, configuring a summary provider, and verifying the setup.

Quality Tiers

The summary system has three quality tiers. Each builds on the previous one:
| Tier | Embedding Model | Summaries | Search Quality | Cost |
|------|-----------------|-----------|----------------|------|
| Low (default) | all-MiniLM-L6-v2 (384-dim) | None | Baseline | Free |
| Medium | nomic-embed-text-v1.5 (768-dim) | None | Better | Free |
| High | Either model | LLM-generated | Best | API cost |
  • Low — embeds raw symbol text (name + kind + signature) with a general-purpose model. No API calls. The all-MiniLM-L6-v2 model is bundled with the npm package (~22 MB).
  • Medium — swaps in a higher-quality text embedding model with a longer context window (8,192 tokens vs 256). Fully offline. Downloads ~138 MB on first use.
  • High — adds LLM-generated natural-language summaries to either embedding model. Both models are text-based and benefit equally from summaries. This produces the best search results because the LLM distills code meaning into plain English that embedding models handle well.
Semantic search is enabled by default (semantic.enabled: true) with provider: "local", which runs embeddings through the ONNX runtime. Keep generateSummaries disabled until you have validated summary quality for your repository.
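For reference, here is the Low-tier baseline spelled out as an explicit config (field names as used elsewhere in this guide; generateSummaries shown disabled, per the recommendation above):

```json
{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "all-MiniLM-L6-v2",
    "generateSummaries": false
  }
}
```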

Upgrading from all-MiniLM to nomic-embed-text-v1.5

To move from Low to Medium tier, change the model and run a full re-index:
{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "nomic-embed-text-v1.5"
  }
}
Then run a full re-index (required because embedding dimensions change from 384 to 768):
npx sdl-mcp index --repo my-repo --mode full
On first use, nomic-embed-text-v1.5 is downloaded automatically from HuggingFace (~138 MB). To pre-download it:
node scripts/download-models.mjs nomic-embed-text-v1.5
The model is cached at:
| Platform | Path |
|----------|------|
| Windows | %LOCALAPPDATA%\sdl-mcp\models\nomic-embed-text-v1.5\ |
| macOS | ~/.cache/sdl-mcp/models/nomic-embed-text-v1.5/ |
| Linux | ~/.cache/sdl-mcp/models/nomic-embed-text-v1.5/ |
| Custom | Set semantic.modelCacheDir in config |
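For example, to redirect the cache to a custom location (the path shown is illustrative):

```json
{
  "semantic": {
    "modelCacheDir": "/opt/sdl-mcp/models"
  }
}
```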

Summary Providers

Summary generation is independent from the embedding provider. You can use local embeddings with an API-based summary provider, or vice versa.
Anthropic API (summaryProvider: "api")

Uses Claude models via the Anthropic Messages API. Highest quality, no local GPU needed. Recommended models:
| Model | Speed | Quality | Pricing (input / output) |
|-------|-------|---------|--------------------------|
| claude-haiku-4-5-20251001 | Fast | Good (default) | $0.25 / $1.25 per 1M tokens |
| claude-sonnet-4-20250514 | Medium | Higher | $3 / $15 per 1M tokens |
For most repositories, Haiku is the best balance of cost and quality. Each symbol uses roughly 50–150 input tokens.
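As a rough back-of-the-envelope sketch of what that means in dollars: at Haiku's listed rates, even a few thousand symbols cost cents. The per-symbol input figure comes from the guide (50–150 tokens); the ~60-token output estimate is an assumption for illustration only.

```javascript
// Rough Haiku summary-cost estimate.
// inPerSymbol is within the 50–150 range stated in the guide;
// outPerSymbol (~60 tokens) is an assumed figure, not from the docs.
function estimateCostUSD(symbols, { inPerSymbol = 100, outPerSymbol = 60 } = {}) {
  const inputCost = (symbols * inPerSymbol / 1e6) * 0.25;   // $0.25 per 1M input tokens
  const outputCost = (symbols * outPerSymbol / 1e6) * 1.25; // $1.25 per 1M output tokens
  return inputCost + outputCost;
}

console.log(estimateCostUSD(1000).toFixed(4)); // prints 0.1000 — about 10 cents per 1,000 symbols
```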
Step 1: Get an API key

Sign up at console.anthropic.com, go to API Keys, and create a new key (starts with sk-ant-).
Step 2: Set the API key

Set via environment variable (recommended for shared configs):
# Linux / macOS
export ANTHROPIC_API_KEY="sk-ant-your-key-here"

# Windows (persist across terminals)
setx ANTHROPIC_API_KEY "sk-ant-your-key-here"
Or set inline in config (not recommended for shared configs):
{
  "semantic": {
    "summaryApiKey": "sk-ant-your-key-here"
  }
}
Step 3: Configure

{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "all-MiniLM-L6-v2",
    "generateSummaries": true,
    "summaryProvider": "api",
    "summaryModel": "claude-haiku-4-5-20251001"
  }
}
Key resolution order: summaryApiKey in config → ANTHROPIC_API_KEY env var. If neither is set, summary generation is skipped and existing cached summaries are preserved.
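That resolution order can be pictured as a one-liner (a sketch only; resolveSummaryKey is a hypothetical name, not part of the SDL-MCP API):

```javascript
// Sketch of the documented key resolution order:
// config summaryApiKey first, then the ANTHROPIC_API_KEY env var.
// Returns null when neither is set, signalling "skip summary generation".
function resolveSummaryKey(semanticConfig, env = process.env) {
  return semanticConfig.summaryApiKey ?? env.ANTHROPIC_API_KEY ?? null;
}

resolveSummaryKey({ summaryApiKey: "sk-ant-abc" }, {});     // → "sk-ant-abc"
resolveSummaryKey({}, { ANTHROPIC_API_KEY: "sk-ant-env" }); // → "sk-ant-env"
resolveSummaryKey({}, {});                                  // → null (summaries skipped)
```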
Step 4: Index your repository

npx sdl-mcp index --repo my-repo
The indexer reports summary stats at the end:
[summaries] Generated 312 summaries, 535 cached, 0 failed ($0.62)
Highest quality configuration (nomic embeddings + Anthropic summaries):
{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "nomic-embed-text-v1.5",
    "generateSummaries": true,
    "summaryProvider": "api",
    "summaryModel": "claude-haiku-4-5-20251001"
  }
}
Requires ANTHROPIC_API_KEY. Downloads ~138 MB embedding model on first run.

Tuning Batch Processing

Summary generation processes symbols in parallel batches. Adjust these settings based on your provider’s rate limits and hardware:
| Setting | Default | Range | Description |
|---------|---------|-------|-------------|
| summaryBatchSize | 20 | 1–50 | Symbols processed per batch |
| summaryMaxConcurrency | 5 | 1–20 | Batches running in parallel |
  • Anthropic API — defaults are fine. Lower summaryMaxConcurrency to 3 if you hit rate limits on a free-tier key.
  • Ollama on CPU — set summaryMaxConcurrency: 1 and summaryBatchSize: 10 to avoid overwhelming your machine.
  • Ollama on GPU — defaults are fine. Increase summaryMaxConcurrency to 8–10 if your GPU has headroom.
For example, an Ollama-on-CPU config:
{
  "semantic": {
    "generateSummaries": true,
    "summaryProvider": "local",
    "summaryModel": "qwen2.5-coder",
    "summaryMaxConcurrency": 1,
    "summaryBatchSize": 10
  }
}
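And a corresponding Ollama-on-GPU variant with more parallelism (the concurrency value follows the 8–10 guidance above; tune it to your hardware):

```json
{
  "semantic": {
    "generateSummaries": true,
    "summaryProvider": "local",
    "summaryModel": "qwen2.5-coder",
    "summaryMaxConcurrency": 8,
    "summaryBatchSize": 20
  }
}
```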

Verifying Summaries After Indexing

After indexing with summaries enabled, verify that summaries appear in symbol cards:
sdl.symbol.search({ repoId: "my-repo", query: "handleRequest", limit: 1 })
sdl.symbol.getCard({ repoId: "my-repo", symbolId: "<id-from-search>" })
The card’s summary field should contain a natural-language description instead of a heuristic placeholder. The index output also reports summary stats:
[indexing] Extracted 847 symbols from 92 files
[summaries] Generated 312 summaries, 535 cached, 0 failed ($0.62)
[embeddings] Computed 847 embeddings (all-MiniLM-L6-v2)
Check summarySource on cards to see how each summary was produced:
| summarySource | summaryQuality | Meaning |
|---------------|----------------|---------|
| "jsdoc" | 1.0 | Extracted from JSDoc / doc comment |
| "llm" | 0.8 | LLM-generated (Claude Haiku, Ollama) |
| "nn-direct:<id>" | 0.6 | Transferred from a similar symbol (similarity ≥ 0.85) |
| "nn-adapted:<id>" | 0.5 | Adapted from a similar symbol (similarity 0.70–0.85) |
| "heuristic-typed" | 0.4 | Pattern-matched from name + param types |
| "heuristic-fallback" | 0.3 | Pattern-matched from name + kind only |

Quick Reference: Copy-Paste Configs

High tier (bundled embedding model + Anthropic summaries; requires ANTHROPIC_API_KEY):

{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "all-MiniLM-L6-v2",
    "generateSummaries": true,
    "summaryProvider": "api",
    "summaryModel": "claude-haiku-4-5-20251001"
  }
}
