Search architecture
Claude-Mem uses an MCP-based search architecture that provides intelligent memory retrieval through four streamlined tools following a 3-layer progressive disclosure workflow.

Overview
- MCP tools: search, timeline, get_observations, and __IMPORTANT
- MCP server: a thin wrapper (~312 lines) that translates the MCP protocol into HTTP API calls
- HTTP API: fast search operations on the Worker Service at port 37777
- Hybrid search: SQLite FTS5 for keyword search, ChromaDB for semantic/vector search
How it works
When a user query requires memory, Claude has four MCP tools available and follows the 3-layer workflow:
The 4 MCP tools
__IMPORTANT — Workflow documentation
Always visible to Claude. Explains the 3-layer workflow pattern and enforces it at the tool level.
search — Search memory index
Step 1 of the workflow. Returns a compact index for filtering.
GET /api/search
| Parameter | Description |
|---|---|
| query | Full-text search query |
| limit | Maximum results (default: 20) |
| type | Filter by observation type |
| project | Filter by project name |
| dateStart, dateEnd | Date range filters |
| offset | Pagination offset |
| orderBy | Sort order |
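With these parameters, a search call reduces to a plain GET with a query string. A minimal client-side sketch of building that request (parameter names come from the table above; the helper itself is hypothetical):

```typescript
// Build a GET /api/search URL from the documented parameters.
// Only `query` is required; everything else is optional.
interface SearchParams {
  query: string;
  limit?: number;     // default: 20
  type?: string;      // filter by observation type
  project?: string;   // filter by project name
  dateStart?: string; // date range filters
  dateEnd?: string;
  offset?: number;    // pagination offset
  orderBy?: string;   // sort order
}

function buildSearchUrl(base: string, params: SearchParams): string {
  const url = new URL("/api/search", base);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(buildSearchUrl("http://localhost:37777", { query: "auth refactor", limit: 5 }));
// http://localhost:37777/api/search?query=auth+refactor&limit=5
```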
timeline — Get chronological context
Step 2 of the workflow. Reveals the narrative arc around a specific observation.
GET /api/timeline
| Parameter | Description |
|---|---|
| anchor | Observation ID to center the timeline on |
| query | Search query to find an anchor automatically |
| depth_before | Observations before the anchor (default: 3) |
| depth_after | Observations after the anchor (default: 3) |
| project | Filter by project name |
Either anchor or query must be provided. Returns a chronological view of what happened before, during, and after the anchor point.
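Because the anchor can come from either parameter, a caller must supply at least one of them. A small sketch of that validation plus the documented window defaults (a hypothetical helper, not the server's actual code):

```typescript
interface TimelineParams {
  anchor?: number;       // observation ID to center on
  query?: string;        // used to find an anchor automatically
  depth_before?: number; // default: 3
  depth_after?: number;  // default: 3
  project?: string;      // filter by project name
}

// Enforce "anchor or query required" and fill in the window defaults.
function normalizeTimelineParams(p: TimelineParams): TimelineParams {
  if (p.anchor === undefined && p.query === undefined) {
    throw new Error("timeline: either `anchor` or `query` must be provided");
  }
  return { depth_before: 3, depth_after: 3, ...p };
}

console.log(normalizeTimelineParams({ query: "migration bug" }));
// { depth_before: 3, depth_after: 3, query: 'migration bug' }
```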
get_observations — Fetch full details
Step 3 of the workflow. Fetches complete data only for IDs pre-filtered in steps 1–2.
POST /api/observations/batch
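Since this endpoint is a POST, the pre-filtered IDs presumably travel in the request body. A sketch of such a batch call (the `ids` field name and the response shape are assumptions, not confirmed by this page):

```typescript
// Body for POST /api/observations/batch; the `ids` field name is an assumption.
function buildBatchBody(ids: number[]): string {
  return JSON.stringify({ ids });
}

// Fetch full observation records for a set of pre-filtered IDs in one round trip.
async function getObservations(ids: number[]): Promise<unknown> {
  const res = await fetch("http://localhost:37777/api/observations/batch", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildBatchBody(ids),
  });
  if (!res.ok) throw new Error(`batch fetch failed: ${res.status}`);
  return res.json();
}
```

Batching all IDs into one request is what keeps step 3 cheap: one HTTP round trip regardless of how many observations survived the filtering in steps 1–2.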
MCP server implementation
Location: plugin/scripts/mcp-server.cjs
The MCP server is a thin wrapper — it contains no business logic. Its sole job is protocol translation from MCP JSON-RPC to HTTP API calls.
Key characteristics:
- ~312 lines of code (reduced from ~2,718 lines in the previous implementation)
- Single source of truth: the Worker HTTP API
- Simple schemas with additionalProperties: true
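A wrapper with no business logic essentially reduces to a routing table from tool name to Worker endpoint. A sketch of that translation step, using the endpoints listed on this page (the table itself is hypothetical, not the wrapper's actual code):

```typescript
// Map each MCP tool call to its Worker HTTP endpoint. This mirrors the
// "protocol translation only" role of the thin wrapper.
type Route = { method: "GET" | "POST"; path: string };

const TOOL_ROUTES: Record<string, Route> = {
  search:           { method: "GET",  path: "/api/search" },
  timeline:         { method: "GET",  path: "/api/timeline" },
  get_observations: { method: "POST", path: "/api/observations/batch" },
};

function routeFor(tool: string): Route {
  const route = TOOL_ROUTES[tool];
  if (!route) throw new Error(`unknown tool: ${tool}`);
  return route;
}

console.log(routeFor("timeline")); // { method: 'GET', path: '/api/timeline' }
```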
Hybrid search approach
FTS5 keyword search (SQLite)
SQLite FTS5 virtual tables provide fast full-text keyword matching.

Semantic search (ChromaDB)
ChromaDB provides vector embeddings for semantic similarity search — finding conceptually related observations even when exact keywords don’t match. The ChromaSync service (src/services/sync/ChromaSync.ts) manages synchronization between SQLite and ChromaDB.
ChromaDB is optional. When unavailable, search falls back to FTS5 keyword search and SQL LIKE queries.

Search routing
The SessionSearch service (src/services/sqlite/SessionSearch.ts) coordinates search routing:
- Vector search via ChromaDB is the primary search mechanism
- FTS5 is maintained for backward compatibility; tables are kept synchronized via triggers
- Structured filters (type, project, date) are applied as SQL predicates regardless of search mode
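The routing described above can be sketched as a fallback chain (hypothetical interfaces; the real logic lives in SessionSearch.ts):

```typescript
// Fallback chain: ChromaDB vector search first, then FTS5 keywords,
// then SQL LIKE. Structured filters apply as SQL predicates in every mode.
interface Filters { type?: string; project?: string; dateStart?: string; dateEnd?: string }

interface Backend {
  // Absent or returning null when ChromaDB is unavailable.
  vector?: (q: string, f: Filters) => number[] | null;
  fts5: (q: string, f: Filters) => number[];
  like: (q: string, f: Filters) => number[];
}

function routeSearch(query: string, filters: Filters, b: Backend): number[] {
  if (b.vector) {
    const hits = b.vector(query, filters);
    if (hits !== null) return hits;       // ChromaDB answered
  }
  const ftsHits = b.fts5(query, filters); // keyword fallback
  return ftsHits.length > 0 ? ftsHits : b.like(query, filters);
}
```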
The 3-layer progressive disclosure pattern
Design philosophy
Progressive disclosure is a core architectural principle: reveal information at the level of detail actually needed, on demand.

- Layer 1: Index (search)
- Layer 2: Context (timeline)
- Layer 3: Details (get_observations)
At Layer 1 (the index):
- What: a compact table with IDs, titles, dates, types
- Cost: ~50–100 tokens per result
- Purpose: survey what exists before committing tokens
- Decision point: “Which observations are relevant?”
Token efficiency
Traditional RAG
- Fetch 20 observations upfront: 10,000–20,000 tokens
- Relevance: ~10% (only 2 observations actually useful)
- Waste: ~18,000 tokens on irrelevant context
3-layer workflow
- Step 1: search (20 results) = ~1,000–2,000 tokens
- Step 2: filter to 3 relevant IDs
- Step 3: get_observations (3 IDs) = ~1,500–3,000 tokens
- Total: 2,500–5,000 tokens (50–75% savings)
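The savings range follows directly from the two budgets. A quick check of the arithmetic, using the figures quoted above:

```typescript
// Percentage of tokens saved by the 3-layer workflow vs. fetching upfront.
function savingsPercent(traditionalTokens: number, workflowTokens: number): number {
  return Math.round((1 - workflowTokens / traditionalTokens) * 100);
}

console.log(savingsPercent(10_000, 5_000)); // 50  (traditional floor vs. workflow ceiling)
console.log(savingsPercent(20_000, 5_000)); // 75  (traditional ceiling vs. workflow ceiling)
```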
Structural enforcement
The 3-layer pattern is enforced by tool design, not just instructions:
- You cannot fetch full details without first getting IDs from search
- You cannot search without seeing the workflow reminder in __IMPORTANT
- timeline provides a middle ground between index and full details
Before: Progressive disclosure was something Claude had to remember. After: Progressive disclosure is structurally impossible to bypass.
Architecture evolution
Before: Complex MCP implementation (9 tools)
- Approach: 9 MCP tools with detailed parameter schemas
- Token cost: ~2,500 tokens in tool definitions per session
- Tools:
  - search_observations — Full-text search
  - find_by_type — Filter by type
  - find_by_file — Filter by file
  - find_by_concept — Filter by concept
  - get_recent_context — Recent sessions
  - get_observation — Fetch a single observation
  - get_session — Fetch a session
  - get_prompt — Fetch a prompt
  - help — API documentation
- Code size: ~2,718 lines in mcp-server.ts
After: Streamlined MCP implementation (4 tools)
- Approach: 4 MCP tools following the 3-layer workflow
- Tools:
  - __IMPORTANT — Workflow guidance (always visible)
  - search — Step 1 (index)
  - timeline — Step 2 (context)
  - get_observations — Step 3 (details)
- Schemas: simple additionalProperties: true schemas, clear workflow pattern
- Code size: ~312 lines in mcp-server.ts (88% reduction)
Previous: Skill-based approach
Earlier versions (v5.4.0–v5.5.0) used a skill-based search approach:
- Required separate SKILL.md and operations/ files
- HTTP API called directly via curl from skill instructions
- Progressive disclosure through skill loading (loaded on-demand)
- Token savings: ~2,250 tokens per session vs the old MCP approach
Configuration
In both Claude Code and Claude Desktop, the MCP server is configured automatically when the plugin is installed. No manual setup is required.
Security
FTS5 injection prevention
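FTS5 treats characters like `-`, `*`, `^` and keywords like NEAR or OR as query operators, so raw user input must be neutralized before it reaches a MATCH clause. A minimal sketch of one common approach, quoting each term and doubling embedded quotes (a hypothetical helper, not the project's actual code):

```typescript
// Hypothetical FTS5 escaper: wraps each whitespace-separated term in
// double quotes so operators and keywords are treated as literals,
// and doubles any embedded quotes per SQLite's string rules.
function escapeFts5Query(query: string): string {
  return query
    .trim()
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" ");
}

console.log(escapeFts5Query('worker "service" NEAR setup'));
// "worker" """service""" "NEAR" "setup"
```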
All search queries are escaped before FTS5 processing.

MCP protocol security
- Stdio transport: No network exposure for the MCP protocol
- Local-only HTTP: Worker API is bound to localhost:37777
- No authentication: local development only, no external network access
Performance
| Aspect | Detail |
|---|---|
| FTS5 query latency | Sub-10ms for typical queries |
| MCP overhead | Minimal — simple protocol translation only |
| Pagination | Efficient with offset / limit |
| Batching | get_observations accepts multiple IDs in a single call |
Troubleshooting
MCP server not connected (tools not appearing in Claude)
- Verify the MCP server path in your configuration
- Check that the worker service is running on port 37777
- Restart Claude Desktop or Claude Code
Worker service not running (MCP tools fail with connection errors)
Empty search results
- Test the API directly at http://localhost:37777/api/search
- Verify that the database exists
- Confirm that observations exist in the database
Related pages
- Worker Service — HTTP API endpoint reference
- Database Architecture — FTS5 tables, indexes, and schema
- Architecture Overview — System components and data flow