Quality Tiers
The summary system has three quality tiers. Each builds on the previous one:

| Tier | Embedding Model | Summaries | Search Quality | Cost |
|---|---|---|---|---|
| Low (default) | all-MiniLM-L6-v2 (384-dim) | None | Baseline | Free |
| Medium | nomic-embed-text-v1.5 (768-dim) | None | Better | Free |
| High | Either model | LLM-generated | Best | API cost |
- Low — embeds raw symbol text (name + kind + signature) with a general-purpose model. No API calls. The all-MiniLM-L6-v2 model is bundled with the npm package (~22 MB).
- Medium — swaps in a higher-quality text embedding model with a longer context window (8,192 tokens vs 256). Fully offline. Downloads ~138 MB on first use.
- High — adds LLM-generated natural-language summaries to either embedding model. Both models are text-based and benefit equally from summaries. This produces the best search results because the LLM distills code meaning into plain English that embedding models handle well.
Semantic search (`semantic.enabled`) is enabled by default with `provider: "local"` using the ONNX runtime. Keep `generateSummaries` disabled until you validate summary quality for your repository.

Upgrading from all-MiniLM to nomic-embed-text-v1.5
To move from Low to Medium tier, change the model and run a full re-index.

nomic-embed-text-v1.5 is downloaded automatically from HuggingFace (~138 MB). To pre-download it, fetch the model into the cache path for your platform:
| Platform | Path |
|---|---|
| Windows | %LOCALAPPDATA%\sdl-mcp\models\nomic-embed-text-v1.5\ |
| macOS | ~/.cache/sdl-mcp/models/nomic-embed-text-v1.5/ |
| Linux | ~/.cache/sdl-mcp/models/nomic-embed-text-v1.5/ |
| Custom | Set semantic.modelCacheDir in config |
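As a sketch, the Low-to-Medium switch might look like the following in the config file. Only `semantic.enabled`, `provider`, and `semantic.modelCacheDir` are confirmed by this document; the `model` field name and the directory value are assumptions for illustration.

```json
{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "model": "nomic-embed-text-v1.5",
    "modelCacheDir": "/data/models"
  }
}
```

After changing the model, run a full re-index: vectors produced by the 384-dim and 768-dim models live in different embedding spaces and are not comparable, so the old index cannot be reused.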
Summary Providers
Summary generation is independent of the embedding provider. You can use local embeddings with an API-based summary provider, or vice versa. Four summary providers are available:

- Anthropic API
- Ollama (Local)
- OpenAI-Compatible Servers
- Mock (Testing / CI)
Uses Claude models via the Anthropic Messages API. Highest quality, no local GPU needed.

Recommended models:
For most repositories, Haiku is the best balance of cost and quality. Each symbol uses roughly 50–150 input tokens.
| Model | Speed | Quality | Pricing |
|---|---|---|---|
| `claude-haiku-4-5-20251001` | Fast | Good (default) | $1.25 per 1M tokens |
| `claude-sonnet-4-20250514` | Medium | Higher | $15 per 1M tokens |
Get an API key
Sign up at console.anthropic.com, go to API Keys, and create a new key (starts with `sk-ant-`).

Set the API key
Set via environment variable (recommended for shared configs), or inline in config via `summaryApiKey` (not recommended for shared configs).
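For example, the environment-variable route might look like this. The key value is a placeholder; only the `ANTHROPIC_API_KEY` variable name and the `sk-ant-` prefix come from this document.

```shell
# Recommended: export the key in your shell profile rather than
# committing it to a shared config file.
export ANTHROPIC_API_KEY="sk-ant-your-key-here"   # placeholder value

# Sanity check: confirm the variable is visible to child processes.
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API_KEY is set"
```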
Configure
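A hedged sketch of what the Configure step might contain, assuming a JSON config file. `generateSummaries`, `summaryApiKey` resolution, and the model ID are from this document; the `summaryProvider` and `summaryModel` field names and the placement of `generateSummaries` under `semantic` are guesses.

```json
{
  "semantic": {
    "enabled": true,
    "provider": "local",
    "generateSummaries": true
  },
  "summaryProvider": "anthropic",
  "summaryModel": "claude-haiku-4-5-20251001"
}
```

With no `summaryApiKey` in the config, the key is picked up from the `ANTHROPIC_API_KEY` environment variable.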
Key resolution order: `summaryApiKey` in config → `ANTHROPIC_API_KEY` env var. If neither is set, summary generation is skipped and existing cached summaries are preserved.

The highest-quality configuration (nomic embeddings + Anthropic summaries) requires `ANTHROPIC_API_KEY` and downloads the ~138 MB embedding model on first run.

Tuning Batch Processing
Summary generation processes symbols in parallel batches. Adjust these settings based on your provider’s rate limits and hardware:

| Setting | Default | Range | Description |
|---|---|---|---|
| `summaryBatchSize` | 20 | 1–50 | Symbols processed per batch |
| `summaryMaxConcurrency` | 5 | 1–20 | Batches running in parallel |
- Lower `summaryMaxConcurrency` to 3 if you hit rate limits on a free-tier key.
- For Ollama on CPU — set `summaryMaxConcurrency: 1` and `summaryBatchSize: 10` to avoid overwhelming your machine.
- For Ollama on GPU — defaults are fine. Increase `summaryMaxConcurrency` to 8–10 if your GPU has headroom.
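Putting the tuning advice into config form, here is a sketch of the conservative Ollama-on-CPU settings from the tips above. `summaryBatchSize` and `summaryMaxConcurrency` are documented; their placement at the top level of the config is an assumption.

```json
{
  "summaryBatchSize": 10,
  "summaryMaxConcurrency": 1
}
```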
Verifying Summaries After Indexing
After indexing with summaries enabled, verify that summaries appear in symbol cards. The `summary` field should contain a natural-language description instead of a heuristic placeholder. The index output also reports summary stats.
Check `summarySource` on cards to see how each summary was produced:
| `summarySource` | `summaryQuality` | Meaning |
|---|---|---|
| `"jsdoc"` | 1.0 | Extracted from JSDoc / doc comment |
| `"llm"` | 0.8 | LLM-generated (Claude Haiku, Ollama) |
| `"nn-direct:<id>"` | 0.6 | Transferred from a similar symbol (similarity ≥ 0.85) |
| `"nn-adapted:<id>"` | 0.5 | Adapted from a similar symbol (similarity 0.70–0.85) |
| `"heuristic-typed"` | 0.4 | Pattern-matched from name + param types |
| `"heuristic-fallback"` | 0.3 | Pattern-matched from name + kind only |