Storage backends
The memory system uses two backends in tandem:

| Backend | Purpose |
|---|---|
| QMD | Primary session transcript storage (structured) |
| SQLite | Index and metadata (via bun:sqlite) |
The memory-lancedb extension adds LanceDB for high-dimensional vector similarity.
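To make the split between the two backends concrete, here is a minimal sketch: the transcript body lives in QMD, while SQLite holds only index and metadata rows. The table name, column names, and `SessionIndexRow` shape are illustrative assumptions, not the actual GenosOS schema.

```typescript
// Hypothetical index row kept in SQLite; the full transcript stays in QMD.
interface SessionIndexRow {
  sessionId: string;   // key into the QMD transcript store
  channelPeer: string; // per-channel-peer isolation (see summary table)
  startedAt: string;   // ISO timestamp
  tokenCount: number;  // used to decide when compaction should run
}

// DDL a bun:sqlite-backed index might use (illustrative, not the real schema).
const createIndexSql = `
  CREATE TABLE IF NOT EXISTS session_index (
    session_id   TEXT PRIMARY KEY,
    channel_peer TEXT NOT NULL,
    started_at   TEXT NOT NULL,
    token_count  INTEGER NOT NULL DEFAULT 0
  )`;

function toIndexRow(sessionId: string, channelPeer: string): SessionIndexRow {
  return {
    sessionId,
    channelPeer,
    startedAt: new Date().toISOString(),
    tokenCount: 0,
  };
}
```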
TOON compaction
After a session grows long, GenosOS compacts it into a TOON-format summary. TOON (a deterministic 11-section encoding format) reduces session size by approximately 40% compared to raw markdown while preserving meaning.

What TOON does
- Deterministic structure. 11 sections per summary covering both technical state (tools used, config changes, decisions made) and relational context (tone, preferences, recurring topics).
- Cold-restart prevention. Reentry after a long break feels like picking up a mid-sentence conversation — not starting fresh. The compacted summary injects full context before the first token of the new session.
- Iterative validation. Compaction has been validated across 4 successive rounds with zero information degradation.
- Benchmark: 13µs per TOON encode call — negligible overhead.
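The ~40% figure above can be made concrete with a small helper. This is purely illustrative arithmetic on the stated ratio; the function name and the assumption that the reduction applies uniformly to token counts are mine, not part of the TOON spec.

```typescript
// ~40% smaller than raw markdown, per the stated compaction ratio (approximate).
const TOON_REDUCTION = 0.4;

// Rough estimate of the token count after compaction (illustrative only).
function estimateCompactedTokens(rawTokens: number): number {
  return Math.round(rawTokens * (1 - TOON_REDUCTION));
}
```

For example, `estimateCompactedTokens(10000)` returns 6000, i.e. a 10K-token session compacts to roughly 6K tokens.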
Triggering compaction
Compaction runs automatically when a session reaches its configured size threshold. You can also trigger it manually at any time.

Semantic prefetch
Before every agent response, GenosOS automatically queries the memory system for relevant context and injects it into the prompt.

- The query embedding from the prefetch is reused for semantic tool filtering — zero extra API calls.
- Relevant memory documents (from past sessions, structured notes, and workspace files) are ranked by vector similarity and injected selectively.
- Users do not configure this. It is always on.
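The ranking step described above can be sketched as cosine similarity between the query embedding and each candidate document, keeping the top k. Function and field names here are illustrative, not GenosOS APIs.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface MemoryDoc { id: string; embedding: number[] }

// Rank candidate memory documents against the query embedding, keep top k.
function topK(query: number[], docs: MemoryDoc[], k: number): MemoryDoc[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```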
Semantic tool filtering
Tools are filtered by embedding similarity to the user’s intent before being sent to the model.

| Tool category | Visibility rule |
|---|---|
| Core tools | Always visible (read, write, exec, bash) |
| Domain tools | Appear only when semantically relevant to the request |

| Tool | Appears when user asks about… |
|---|---|
| browser | Web pages, links, scraping, research |
| canvas | Visual output, diagrams, images |
| cron | Scheduling, reminders, recurring tasks |
| image | Image generation, visual tasks |
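The visibility rules in the tables above reduce to a simple filter: core tools always pass, and domain tools pass only when their similarity to the (reused) prefetch query embedding clears a threshold. The threshold value and the idea of precomputed per-tool similarity scores are assumptions for illustration.

```typescript
// Core tools are always visible, per the visibility-rule table.
const CORE_TOOLS = new Set(["read", "write", "exec", "bash"]);

// `similarity` is assumed to be precomputed against the user-intent embedding.
interface Tool { name: string; similarity: number }

// Return the names of tools that should be sent to the model.
// The 0.5 threshold is an illustrative default, not a documented value.
function visibleTools(tools: Tool[], threshold = 0.5): string[] {
  return tools
    .filter(t => CORE_TOOLS.has(t.name) || t.similarity >= threshold)
    .map(t => t.name);
}
```

Sending only this subset is where the 2K–3K token savings in the summary table come from.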
Structured memory documents
Long-term notes are stored as structured markdown files in the agent workspace.

Vector search
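The summary table below notes a `memory/YYYY-MM-DD.md` layout with an 8-section schema. The actual section names are not documented here, so the following is a purely hypothetical sketch of what such a dated note might look like, borrowing two categories (decisions, preferences) mentioned in the TOON section:

```markdown
<!-- memory/2025-01-15.md — hypothetical example; the real schema has 8 sections -->
## Decisions
- Switched the session indexer to WAL mode.

## Preferences
- User prefers concise answers with code examples first.
```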
The memory-lancedb extension provides vector search for long-term recall:
- Embeddings are generated from session content and stored in LanceDB
- Similarity search retrieves relevant past sessions, notes, and context
- Runs entirely on your machine — no embeddings are sent to external services beyond the LLM API call that generates them
Memory system summary
| Feature | Implementation | Notes |
|---|---|---|
| Session transcripts | QMD + SQLite (encrypted) | Per-channel-peer isolation |
| Vector search | sqlite-vec (embedded) + LanceDB extension | No external service required |
| Compaction | TOON format (11 sections, ~40% token reduction) | 13µs/call, validated 4× iterative |
| Semantic prefetch | Embedding similarity → context injection | Zero extra API calls |
| Tool filtering | Embedding similarity → tool subset | Reuses prefetch embedding, 2K–3K token savings |
| Structured notes | memory/YYYY-MM-DD.md (8-section schema) | Encrypted at rest |
| Long-term recall | LanceDB (memory-lancedb extension) | High-dimensional vector similarity |