Lerim operates through two core processes: sync extracts new memories from agent sessions, and maintain refines the memory store over time. This page explains both processes in detail.

The sync process

Sync discovers agent sessions, extracts decision and learning candidates, deduplicates against existing memories, and writes new memory files.

How sync works

1. Discover sessions

Platform adapters scan agent storage directories for new or modified sessions:
  • Claude: ~/.claude/projects/*.jsonl
  • Codex: ~/.codex/sessions/*.jsonl
  • Cursor: ~/Library/Application Support/Cursor/User/globalStorage/*/state.vscdb
  • OpenCode: ~/.local/share/opencode/opencode.db
Sessions are indexed in ~/.lerim/index/sessions.sqlite3 with content hashes for change detection.
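The hash-based change detection might be sketched in Python as follows; the `sessions` table layout and helper names here are illustrative, not Lerim's actual schema:

```python
import hashlib
import sqlite3

def content_hash(text: str) -> str:
    # Hash the session transcript so unchanged sessions can be skipped.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def is_new_or_changed(db: sqlite3.Connection, session_id: str, text: str) -> bool:
    # Illustrative table: sessions(session_id TEXT PRIMARY KEY, hash TEXT)
    row = db.execute(
        "SELECT hash FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    new_hash = content_hash(text)
    if row is not None and row[0] == new_hash:
        return False  # unchanged -> skip extraction
    db.execute(
        "INSERT INTO sessions(session_id, hash) VALUES(?, ?) "
        "ON CONFLICT(session_id) DO UPDATE SET hash = excluded.hash",
        (session_id, new_hash),
    )
    return True
```

Only sessions for which this returns true proceed to the extraction queue.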
2. Queue sessions

New sessions are added to the extraction queue. Hash-based change detection skips unchanged sessions.
3. Read transcript

The lead agent loads one session transcript from the queue. For SQLite-based platforms (Cursor, OpenCode), sessions are exported to JSONL cache files first.
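The SQLite-to-JSONL export step can be sketched like this; the `messages` table is a hypothetical stand-in, not the real Cursor or OpenCode schema:

```python
import json
import sqlite3

def session_to_jsonl_lines(conn: sqlite3.Connection) -> list[str]:
    # Illustrative schema: a messages(role, content) table.
    # Real platform databases differ; this only shows the shape of the step.
    rows = conn.execute("SELECT role, content FROM messages ORDER BY rowid")
    return [json.dumps({"role": r, "content": c}) for r, c in rows]
```

The resulting lines are written to a JSONL cache file, which the lead agent then reads like any file-based session.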
4. Extract candidates

The DSPy extraction pipeline analyzes the transcript:
  • Transcript is split into overlapping windows (default 300K tokens per window)
  • Each window is processed with dspy.ChainOfThought using the MemoryExtractSignature
  • Per-window candidates are merged and deduplicated in a final ChainOfThought pass
  • Output: structured list of MemoryCandidate objects with primitive type, title, body, confidence, tags, and kind
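The windowing step can be sketched in plain Python; token lists stand in for real tokenizer output, and the parameter names are illustrative:

```python
def split_into_windows(tokens: list[str], max_window: int, overlap: int) -> list[list[str]]:
    # Split a tokenized transcript into overlapping windows before the
    # per-window ChainOfThought extraction pass.
    if max_window <= overlap:
        raise ValueError("max_window must exceed overlap")
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_window])
        if start + max_window >= len(tokens):
            break
        start += max_window - overlap  # slide, keeping `overlap` tokens shared
    return windows
```

With the defaults (300K-token windows, 1K-token overlap), most sessions fit in a single window; overlap keeps candidates that straddle a boundary from being lost.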
5. Deduplicate

The explorer subagent searches existing memories to find similar entries:
  • Uses glob and grep to search decision and learning files
  • Compares candidate titles and bodies against existing memories
  • Returns list of similar memory paths for the lead agent to review
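As a rough stand-in for that comparison, token-level Jaccard overlap captures the idea (the explorer actually works with glob and grep, so this is only an approximation):

```python
def title_similarity(a: str, b: str) -> float:
    # Jaccard similarity over lowercased title tokens: 1.0 = identical
    # vocabulary, 0.0 = no shared words.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```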
6. Decide action

The lead agent runs a deterministic decision policy for each candidate:
  • add — Create a new memory file (no similar memory exists, or quality is higher)
  • update — Merge with existing memory (similar exists, new info is complementary)
  • no-op — Skip (duplicate or low value)
7. Write memories

The lead agent writes approved candidates to disk:
  • New memories: memory/{primitive}/{YYYYMMDD}-{slug}.md
  • Updates: Existing file is edited with merged content and updated timestamp
  • Session summary: memory/summaries/{YYYYMMDD}/{HHMMSS}/{slug}.md
All writes go through the write_memory tool, which enforces frontmatter schemas and security boundaries.
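The filename convention above can be sketched as follows; the exact slug rules are an assumption based on the pattern shown:

```python
import re
from datetime import date

def memory_path(primitive: str, title: str, day: date) -> str:
    # Build memory/{primitive}/{YYYYMMDD}-{slug}.md from a candidate title.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"memory/{primitive}/{day.strftime('%Y%m%d')}-{slug}.md"
```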
8. Log results

Sync results are logged to:
  • ~/.lerim/activity.log — One line per cycle with project, stats, cost, duration
  • Workspace artifacts in <repo>/.lerim/workspace/sync-{timestamp}-{id}/

Running sync

When you run lerim up, sync runs continuously in the background:
lerim up
# Daemon automatically syncs new sessions every few minutes
Check sync status:
lerim status
lerim logs --follow
The first sync can take a while if you have many sessions. Use --max-sessions to limit the initial sync, then let the daemon catch up over time.

Sync output

Each sync run creates a workspace folder with detailed artifacts:
<repo>/.lerim/workspace/sync-20260220-120000-abc123/
  extract.json          # All extracted candidates
  summary.json          # Session summary with metadata
  memory_actions.json   # What was written (add/update/no-op)
  agent.log            # Lead agent trace
  subagents.log        # Explorer subagent trace
  session.log          # Session processing log
Use these files for debugging extraction quality or understanding why a memory was or wasn’t created.

Extraction quality

The extraction pipeline is configured with role-specific models:
# ~/.lerim/config.toml
[roles.extract]
provider = "openrouter"
model = "openai/gpt-5-nano"
max_window_tokens = 300000
window_overlap_tokens = 1000
Use a model with a large context window for extraction (300K+). The default openai/gpt-5-nano via OpenRouter is optimized for cost and speed.

The maintain process

Maintain runs offline refinement over stored memories: merges duplicates, archives low-value entries, consolidates related memories, and applies time-based decay.

How maintain works

1. Scan memories

Load all existing decision and learning files from memory/decisions/ and memory/learnings/.
2. Identify duplicates

The lead agent compares memories to find near-duplicates:
  • Same primitive type
  • Similar titles or content
  • Overlapping tags
3. Merge duplicates

For each duplicate group:
  • Keep the memory with highest confidence
  • Merge complementary information from others
  • Move merged-from memories to memory/archived/{primitive}/
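The merge rules above can be sketched as follows; representing memories as dicts with title, confidence, and tags is an illustrative simplification, and merging tags is one example of "complementary information":

```python
def merge_duplicate_group(group: list[dict]) -> tuple[dict, list[dict]]:
    # Keep the highest-confidence memory; collect the rest for archiving.
    keeper = max(group, key=lambda m: m["confidence"])
    archived = [m for m in group if m is not keeper]
    # Fold complementary tags from the archived entries into the keeper.
    merged_tags = set(keeper["tags"])
    for m in archived:
        merged_tags |= set(m["tags"])
    keeper = {**keeper, "tags": sorted(merged_tags)}
    return keeper, archived
```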
4. Calculate effective confidence

For each memory, compute effective confidence with time-based decay:
from datetime import datetime

def effective_confidence(base_confidence: float, last_accessed: datetime, now: datetime) -> float:
    days_since_access = (now - last_accessed).days
    decay_factor = min(days_since_access / 180, 1.0)  # full decay after 180 days
    return max(base_confidence * (1 - decay_factor), 0.1)  # 0.1 floor
5. Archive low-value memories

Memories with effective confidence below 0.2 are moved to memory/archived/{primitive}/, unless:
  • Accessed in the last 30 days (grace period)
  • Created in the last 30 days
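These rules combine into a simple predicate. A sketch assuming the documented defaults (0.2 threshold, 30-day grace periods):

```python
def should_archive(effective_conf: float, days_since_access: int,
                   days_since_created: int, threshold: float = 0.2,
                   grace_days: int = 30) -> bool:
    # Archive only when below the threshold AND outside both grace periods.
    if effective_conf >= threshold:
        return False
    if days_since_access <= grace_days or days_since_created <= grace_days:
        return False  # recently used or recently created -> keep
    return True
```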
6. Consolidate related memories

Identify memories that reference each other or share many tags, and optionally create consolidated entries.
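One simple way to flag consolidation candidates is pairwise tag overlap; the `min_shared_tags` cutoff here is a hypothetical parameter, not a Lerim setting:

```python
def related_pairs(memories: list[dict], min_shared_tags: int = 2) -> list[tuple[str, str]]:
    # Flag memory pairs that share many tags as consolidation candidates.
    # Memory dicts are illustrative: {"title", "tags"}.
    pairs = []
    for i, a in enumerate(memories):
        for b in memories[i + 1:]:
            if len(set(a["tags"]) & set(b["tags"])) >= min_shared_tags:
                pairs.append((a["title"], b["title"]))
    return pairs
```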
7. Log results

Maintain results are logged to:
  • ~/.lerim/activity.log
  • Workspace artifacts in <repo>/.lerim/workspace/maintain-{timestamp}-{id}/

Running maintain

When you run lerim up, maintain runs periodically (default: every 24 hours):
lerim up
# Daemon automatically runs maintain once per day
Maintain is safe to run frequently — it’s non-destructive (archived memories are soft-deleted, not removed).

Maintain output

Each maintain run creates a workspace folder:
<repo>/.lerim/workspace/maintain-20260223-140000-xyz789/
  maintain_actions.json  # What was merged/archived
  agent.log             # Lead agent trace
  subagents.log         # Explorer subagent trace

Decay configuration

Configure decay behavior in your config file:
# ~/.lerim/config.toml
[memory.decay]
decay_period_days = 180      # Full decay after 6 months
confidence_floor = 0.1        # Never drop below this
archive_threshold = 0.2       # Archive when below this
grace_period_days = 30        # Recently accessed memories skip decay
Increase grace_period_days if you want to protect recently used memories from archiving. Decrease archive_threshold to be more aggressive about cleaning up low-value memories.

When to use each command

Use sync when:

  • You’ve completed coding sessions and want to extract learnings immediately
  • You’re setting up Lerim for the first time (lerim sync --max-sessions 5)
  • You’ve manually added sessions to an agent’s storage directory
  • You want to force-reprocess a specific session

Use maintain when:

  • You notice duplicate memories in the dashboard or search results
  • You want to clean up old, unused memories
  • You’ve accumulated hundreds of memories and want to consolidate them
  • You’re preparing to share a project’s .lerim/ directory with a team

Use the daemon (lerim up) when:

  • You want continuous, automatic memory extraction and refinement
  • You’re actively using coding agents and want minimal manual intervention
  • You want the dashboard running for browsing memories and sessions

Workflow examples

First-time setup

# Install and configure
pip install lerim
lerim init
lerim project add .

# Sync recent sessions only
lerim up
lerim sync --max-sessions 5

# Let daemon handle future sessions
lerim logs --follow

Manual extraction cycle

# Connect platforms
lerim connect auto

# One-shot sync
lerim sync

# Clean up duplicates
lerim maintain

# Query memories
lerim ask "What auth approach did we choose?"

Debugging extraction quality

# Run sync with verbose logging
LERIM_TRACING=1 lerim sync --max-sessions 1

# Check workspace artifacts
ls -la .lerim/workspace/sync-*/
cat .lerim/workspace/sync-*/extract.json | jq

# View agent trace
cat .lerim/workspace/sync-*/agent.log

Clean slate reset

# Reset everything
lerim memory reset --scope both --yes

# Re-sync from scratch
lerim sync --max-sessions 10

# Run maintain to deduplicate
lerim maintain

Performance tuning

Sync performance

Lerim automatically skips sessions that haven't changed since the last sync; content hashes are stored in sessions.sqlite3. If you're re-syncing the same sessions repeatedly, make sure your session files are stable (not being rewritten by the agent).
Large context windows are more accurate but slower. Reduce max_window_tokens for faster sync:
[roles.extract]
max_window_tokens = 150000  # Half the default
Trade accuracy for speed by using faster models:
[roles.extract]
provider = "openrouter"
model = "openai/gpt-4o-mini"  # Much faster than gpt-5-nano
If you have thousands of sessions, process them in batches:
lerim sync --max-sessions 50

Maintain performance

If you have a large memory store, reduce maintain frequency:
[daemon]
maintain_interval_hours = 168  # Once per week instead of daily
Reduce the memory set size by lowering the archive threshold:
[memory.decay]
archive_threshold = 0.15  # Archive more aggressively

Observability

Both sync and maintain emit detailed traces when tracing is enabled:
# Enable tracing
export LERIM_TRACING=1

# Or in config
# ~/.lerim/config.toml
[tracing]
enabled = true
include_content = true
View traces at logfire.pydantic.dev to see:
  • Model calls and token usage
  • Tool calls and results
  • Agent reasoning steps
  • Timing and latency
  • LLM costs per operation
Tracing is invaluable for debugging extraction quality. If a memory wasn’t created or was incorrectly merged, the trace shows exactly why.

What’s next?

Supported agents

Learn which coding agents are supported and how to connect them

Configuration

Configure models, roles, decay, and daemon behavior

CLI reference

Explore all sync and maintain CLI options

Troubleshooting

Debug common sync and maintain issues
