Storage backends
The memory system uses two backends in tandem:

| Backend | Purpose |
|---|---|
| QMD | Primary session transcript storage (structured) |
| SQLite | Index and metadata (via bun:sqlite) |
The memory-lancedb extension adds LanceDB for high-dimensional vector similarity.
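To make the split between the two backends concrete, here is a minimal sketch: the transcript body lives in QMD, while SQLite holds only index and metadata rows. The table name, column names, and `SessionIndexRow` shape are illustrative assumptions, not the actual GenosOS schema.

```typescript
// Hypothetical index row kept in SQLite; the full transcript stays in QMD.
interface SessionIndexRow {
  sessionId: string;   // key into the QMD transcript store
  channelPeer: string; // per-channel-peer isolation (see summary table)
  startedAt: string;   // ISO timestamp
  tokenCount: number;  // used to decide when compaction should run
}

// DDL a bun:sqlite-backed index might use (illustrative, not the real schema).
const createIndexSql = `
  CREATE TABLE IF NOT EXISTS session_index (
    session_id   TEXT PRIMARY KEY,
    channel_peer TEXT NOT NULL,
    started_at   TEXT NOT NULL,
    token_count  INTEGER NOT NULL DEFAULT 0
  )`;

function toIndexRow(sessionId: string, channelPeer: string): SessionIndexRow {
  return {
    sessionId,
    channelPeer,
    startedAt: new Date().toISOString(),
    tokenCount: 0,
  };
}
```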
TOON compaction
After a session grows long, GenosOS compacts it into a TOON-format summary. TOON (a deterministic 11-section encoding format) reduces session size by approximately 40% compared to raw markdown while preserving meaning.

What TOON does
- Deterministic structure. 11 sections per summary covering both technical state (tools used, config changes, decisions made) and relational context (tone, preferences, recurring topics).
- Cold-restart prevention. Reentry after a long break feels like picking up a mid-sentence conversation — not starting fresh. The compacted summary injects full context before the first token of the new session.
- Iterative validation. Compaction has been validated across 4 successive rounds with zero information degradation.
- Benchmark: 13µs per TOON encode call — negligible overhead.
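The ~40% figure above can be made concrete with a small helper. This is purely illustrative arithmetic on the stated ratio; the function name and the assumption that the reduction applies uniformly to token counts are mine, not part of the TOON spec.

```typescript
// ~40% smaller than raw markdown, per the stated compaction ratio (approximate).
const TOON_REDUCTION = 0.4;

// Rough estimate of the token count after compaction (illustrative only).
function estimateCompactedTokens(rawTokens: number): number {
  return Math.round(rawTokens * (1 - TOON_REDUCTION));
}
```

For example, `estimateCompactedTokens(10000)` returns 6000, i.e. a 10K-token session compacts to roughly 6K tokens.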
Triggering compaction
Compaction runs automatically when a session reaches its configured size threshold. You can also trigger it manually at any time.

Semantic prefetch
Before every agent response, GenosOS automatically queries the memory system for relevant context and injects it into the prompt.

- The query embedding from the prefetch is reused for semantic tool filtering — zero extra API calls.
- Relevant memory documents (from past sessions, structured notes, and workspace files) are ranked by vector similarity and injected selectively.
- Users do not configure this. It is always on.
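The ranking step described above can be sketched as cosine similarity between the query embedding and each candidate document, keeping the top k. Function and field names here are illustrative, not GenosOS APIs.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface MemoryDoc { id: string; embedding: number[] }

// Rank candidate memory documents against the query embedding, keep top k.
function topK(query: number[], docs: MemoryDoc[], k: number): MemoryDoc[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```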
Semantic tool filtering
Tools are filtered by embedding similarity to the user’s intent before being sent to the model.

| Tool category | Visibility rule |
|---|---|
| Core tools | Always visible (read, write, exec, bash) |
| Domain tools | Appear only when semantically relevant to the request |

| Tool | Appears when user asks about… |
|---|---|
| browser | Web pages, links, scraping, research |
| canvas | Visual output, diagrams, images |
| cron | Scheduling, reminders, recurring tasks |
| image | Image generation, visual tasks |
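The visibility rules in the tables above reduce to a simple filter: core tools always pass, and domain tools pass only when their similarity to the (reused) prefetch query embedding clears a threshold. The threshold value and the idea of precomputed per-tool similarity scores are assumptions for illustration.

```typescript
// Core tools are always visible, per the visibility-rule table.
const CORE_TOOLS = new Set(["read", "write", "exec", "bash"]);

// `similarity` is assumed to be precomputed against the user-intent embedding.
interface Tool { name: string; similarity: number }

// Return the names of tools that should be sent to the model.
// The 0.5 threshold is an illustrative default, not a documented value.
function visibleTools(tools: Tool[], threshold = 0.5): string[] {
  return tools
    .filter(t => CORE_TOOLS.has(t.name) || t.similarity >= threshold)
    .map(t => t.name);
}
```

Sending only this subset is where the 2K–3K token savings in the summary table come from.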
Structured memory documents
Long-term notes are stored as structured markdown files in the agent workspace.

Vector search
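The summary table below notes a `memory/YYYY-MM-DD.md` layout with an 8-section schema. The actual section names are not documented here, so the following is a purely hypothetical sketch of what such a dated note might look like, borrowing two categories (decisions, preferences) mentioned in the TOON section:

```markdown
<!-- memory/2025-01-15.md — hypothetical example; the real schema has 8 sections -->
## Decisions
- Switched the session indexer to WAL mode.

## Preferences
- User prefers concise answers with code examples first.
```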
The memory-lancedb extension provides vector search for long-term recall:
- Embeddings are generated from session content and stored in LanceDB
- Similarity search retrieves relevant past sessions, notes, and context
- Runs entirely on your machine — no embeddings are sent to external services beyond the LLM API call that generates them
Memory system summary
| Feature | Implementation | Notes |
|---|---|---|
| Session transcripts | QMD + SQLite (encrypted) | Per-channel-peer isolation |
| Vector search | sqlite-vec (embedded) + LanceDB extension | No external service required |
| Compaction | TOON format (11 sections, ~40% token reduction) | 13µs/call, validated 4× iterative |
| Semantic prefetch | Embedding similarity → context injection | Zero extra API calls |
| Tool filtering | Embedding similarity → tool subset | Reuses prefetch embedding, 2K–3K token savings |
| Structured notes | memory/YYYY-MM-DD.md (8-section schema) | Encrypted at rest |
| Long-term recall | LanceDB (memory-lancedb extension) | High-dimensional vector similarity |