The sync process
Sync discovers agent sessions, extracts decision and learning candidates, deduplicates against existing memories, and writes new memory files.

How sync works
Discover sessions
Platform adapters scan agent storage directories for new or modified sessions:
- Claude: `~/.claude/projects/*.jsonl`
- Codex: `~/.codex/sessions/*.jsonl`
- Cursor: `~/Library/Application Support/Cursor/User/globalStorage/*/state.vscdb`
- OpenCode: `~/.local/share/opencode/opencode.db`
Discovered sessions are recorded in `~/.lerim/index/sessions.sqlite3` with content hashes for change detection.

Queue sessions
New sessions are added to the extraction queue. Hash-based change detection skips unchanged sessions.
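Hash-based change detection can be sketched roughly like this. This is a minimal illustration only; the `sessions` table schema and the `needs_sync` function are assumptions, not Lerim's actual code:

```python
import hashlib
import sqlite3
from pathlib import Path

def needs_sync(db: sqlite3.Connection, session_path: Path) -> bool:
    """Return True if the session file is new or its content has changed."""
    digest = hashlib.sha256(session_path.read_bytes()).hexdigest()
    row = db.execute(
        "SELECT content_hash FROM sessions WHERE path = ?",
        (str(session_path),),
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # unchanged since the last sync: skip it
    # Record the new hash so the next sync can skip this session.
    db.execute(
        "INSERT INTO sessions (path, content_hash) VALUES (?, ?) "
        "ON CONFLICT(path) DO UPDATE SET content_hash = excluded.content_hash",
        (str(session_path), digest),
    )
    return True
```

The same file hashes to the same digest, so re-running sync over a stable session directory does no repeated extraction work.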
Read transcript
The lead agent loads one session transcript from the queue. For SQLite-based platforms (Cursor, OpenCode), sessions are exported to JSONL cache files first.
Extract candidates
The DSPy extraction pipeline analyzes the transcript:
- The transcript is split into overlapping windows (default 300K tokens per window)
- Each window is processed with `dspy.ChainOfThought` using the `MemoryExtractSignature`
- Per-window candidates are merged and deduplicated in a final ChainOfThought pass
- Output: a structured list of `MemoryCandidate` objects with primitive type, title, body, confidence, tags, and kind
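The windowing step above can be sketched as follows. For illustration this approximates one token per whitespace-separated word; Lerim counts real tokens, and the function name is an assumption:

```python
def split_windows(text: str, max_tokens: int = 300_000,
                  overlap: int = 1_000) -> list[str]:
    """Split a transcript into overlapping windows of at most max_tokens."""
    assert overlap < max_tokens
    words = text.split()
    if len(words) <= max_tokens:
        return [" ".join(words)]  # short transcript: one window
    step = max_tokens - overlap   # consecutive windows share `overlap` tokens
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), step)
    ]
```

The overlap means a decision that straddles a window boundary still appears whole in at least one window; the final merge pass deduplicates candidates extracted from the shared region.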
Deduplicate
The explorer subagent searches existing memories to find similar entries:
- Uses `glob` and `grep` to search decision and learning files
- Compares candidate titles and bodies against existing memories
- Returns list of similar memory paths for the lead agent to review
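As a rough illustration of the similarity search: Lerim's explorer subagent uses glob/grep plus agent judgment, but the core idea can be approximated with plain string similarity. The `find_similar` helper, the slug-based comparison, and the 0.6 threshold are all assumptions:

```python
from difflib import SequenceMatcher
from pathlib import Path

def find_similar(candidate_title: str, memory_dir: Path,
                 threshold: float = 0.6) -> list[Path]:
    """Return paths of existing memories whose title resembles the candidate."""
    similar = []
    for path in memory_dir.glob("**/*.md"):
        # Memory filenames are {YYYYMMDD}-{slug}.md; compare against the slug.
        slug = path.stem.split("-", 1)[-1].replace("-", " ")
        ratio = SequenceMatcher(None, candidate_title.lower(), slug).ratio()
        if ratio >= threshold:
            similar.append(path)
    return similar
```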
Decide action
The lead agent runs a deterministic decision policy for each candidate:
- add — Create a new memory file (no similar memory exists, or quality is higher)
- update — Merge with existing memory (similar exists, new info is complementary)
- no-op — Skip (duplicate or low value)
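The decision policy might look something like this sketch. The field names (`body`, `confidence`) and the 0.3 low-value cutoff are illustrative assumptions, not Lerim's actual values:

```python
def decide_action(candidate: dict, similar: list[dict],
                  min_confidence: float = 0.3) -> str:
    """Deterministic add/update/no-op policy for one extracted candidate."""
    if candidate["confidence"] < min_confidence:
        return "no-op"  # low value: skip
    if not similar:
        return "add"    # nothing comparable exists
    best = max(similar, key=lambda m: m["confidence"])
    if candidate["body"] == best["body"]:
        return "no-op"  # exact duplicate
    if candidate["confidence"] > best["confidence"]:
        return "add"    # higher-quality replacement
    return "update"     # complementary info: merge into the existing memory
```

Because the policy is deterministic, re-running sync over the same candidates and memory store always produces the same actions.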
Write memories
The lead agent writes approved candidates to disk:
- New memories: `memory/{primitive}/{YYYYMMDD}-{slug}.md`
- Updates: the existing file is edited with merged content and an updated timestamp
- Session summary: `memory/summaries/{YYYYMMDD}/{HHMMSS}/{slug}.md`
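Building the new-memory path can be sketched as below; the exact slug rule is an assumption:

```python
import re
from datetime import datetime, timezone
from pathlib import Path

def memory_path(primitive: str, title: str,
                root: Path = Path("memory")) -> Path:
    """Build memory/{primitive}/{YYYYMMDD}-{slug}.md for a new memory."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d")
    return root / primitive / f"{stamp}-{slug}.md"
```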
All writes go through the `write_memory` tool, which enforces frontmatter schemas and security boundaries.

Running sync
- Automatic (daemon)
- Manual (one-shot)
- Direct (no server)
When you run `lerim up`, sync runs continuously in the background.

Check sync status

The first sync can take a while if you have many sessions. Use `--max-sessions` to limit the initial sync, then let the daemon catch up over time.

Sync output
Each sync run creates a workspace folder with detailed artifacts.

Extraction quality
The extraction pipeline is configured with role-specific models.

The maintain process
Maintain runs offline refinement over stored memories: it merges duplicates, archives low-value entries, consolidates related memories, and applies time-based decay.

How maintain works
Scan memories
Load all existing decision and learning files from `memory/decisions/` and `memory/learnings/`.

Identify duplicates
The lead agent compares memories to find near-duplicates:
- Same primitive type
- Similar titles or content
- Overlapping tags
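A heuristic version of the criteria above could be sketched like this. The field names and the 0.5 tag-overlap threshold are assumptions; in Lerim the lead agent makes this judgment:

```python
def tag_overlap(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two tag sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def likely_duplicates(m1: dict, m2: dict,
                      tag_threshold: float = 0.5) -> bool:
    """Heuristic near-duplicate check over two memory records."""
    return (
        m1["primitive"] == m2["primitive"]          # same primitive type
        and (m1["title"].lower() == m2["title"].lower()
             or tag_overlap(set(m1["tags"]), set(m2["tags"])) >= tag_threshold)
    )
```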
Merge duplicates
For each duplicate group:
- Keep the memory with highest confidence
- Merge complementary information from others
- Move merged-from memories to `memory/archived/{primitive}/`
Archive low-value memories
Memories with effective confidence below 0.2 are moved to `memory/archived/{primitive}/`, unless:
- Accessed in the last 30 days (grace period)
- Created in the last 30 days
Consolidate related memories
Identify memories that reference each other or share many tags, and optionally create consolidated entries.
Running maintain
- Automatic (daemon)
- Manual (one-shot)
- Direct (no server)
When you run `lerim up`, maintain runs periodically (default: every 24 hours).

Maintain is safe to run frequently: it’s non-destructive (archived memories are soft-deleted, not removed).
Maintain output
Each maintain run creates a workspace folder.

Decay configuration
Configure decay behavior in your config file.

When to use each command
Use sync when:
- You’ve completed coding sessions and want to extract learnings immediately
- You’re setting up Lerim for the first time (`lerim sync --max-sessions 5`)
- You’ve manually added sessions to an agent’s storage directory
- You want to force-reprocess a specific session
Use maintain when:
- You notice duplicate memories in the dashboard or search results
- You want to clean up old, unused memories
- You’ve accumulated hundreds of memories and want to consolidate them
- You’re preparing to share a project’s `.lerim/` directory with a team
Use the daemon (`lerim up`) when:
- You want continuous, automatic memory extraction and refinement
- You’re actively using coding agents and want minimal manual intervention
- You want the dashboard running for browsing memories and sessions
Workflow examples
First-time setup
Manual extraction cycle
Debugging extraction quality
Clean slate reset
Performance tuning
Sync performance
Use hash-based change detection
Lerim automatically skips sessions that haven’t changed since the last sync. Content hashes are stored in `sessions.sqlite3`. If you’re re-syncing the same sessions repeatedly, make sure your session files are stable (not being rewritten by the agent).

Reduce window size for faster extraction
Large context windows are more accurate but slower. Reduce `max_window_tokens` for faster sync.

Use faster models
Trade accuracy for speed by using faster models.

Limit max sessions per sync
If you have thousands of sessions, process them in batches.
Maintain performance
Run maintain less frequently
If you have a large memory store, reduce maintain frequency.

Archive aggressively
Reduce the memory set size by lowering the archive threshold.
Observability
Both sync and maintain emit detailed traces when tracing is enabled:
- Model calls and token usage
- Tool calls and results
- Agent reasoning steps
- Timing and latency
- LLM costs per operation
What’s next?
Supported agents
Learn which coding agents are supported and how to connect them
Configuration
Configure models, roles, decay, and daemon behavior
CLI reference
Explore all sync and maintain CLI options
Troubleshooting
Debug common sync and maintain issues