Most RAG systems are dumb vector lookups — you embed a query, pull the nearest chunks, and hope. GenieHelper’s memory subsystem is architecturally different. It treats memory as a set of five distinct layers, each with a specific role, and routes every retrieval request through a multi-stage pipeline that combines keyword search, semantic similarity, graph propagation, and entropy-based pruning before a single token reaches the agent prompt.

The five memory layers

Working memory

The agent’s active context window for the current session. Holds the live conversation, recent tool results, and injected skill context. Cleared between sessions.

Core memory

The DuckDB skill graph (memory/core/agent_memory.duckdb). 191 procedural skills across 11 categories, stored as 252 nodes and 12,880+ weighted edges. Queried at session start via JIT hydration.

Procedural memory

Executable skill modules in agent-skills/. Each skill is a Markdown document describing how to perform a specific task — scheduling posts, managing fan DMs, running analytics. Ingested into DuckDB by ingest_skills.py.

Archival memory

Long-term curated summaries of each creator’s history, preferences, and performance patterns. Injected into the agent system prompt at workspace initialization. (Planned — not yet implemented.)

Recall memory

Verbatim session logs stored for exact retrieval — not semantic. Used when the agent needs to recall a specific fact said in a prior conversation rather than a semantically similar one. (Planned — not yet implemented.)
Working memory, Core memory, and Procedural memory are fully active. Archival and Recall are designed and structurally ready but not yet implemented.

Storage backends

| Store | Technology | Contents |
| --- | --- | --- |
| Skill graph | DuckDB (agent_memory.duckdb) | 191 skills · 252 nodes · 12,880+ edges |
| Taxonomy graph | JSON (Nodes/Universe/taxonomy_graph.json) | 3,205 nodes · CO_OCCURS edge weights |
| Vector store | PostgreSQL pgvector | Semantic embeddings for document chunks |
| Creator nodes | Directus user_nodes collection | Per-creator Hebbian-weighted concept activations |
| Node tiers | Nodes/Universe/ · Nodes/User/ · Nodes/Transitional/ | Graph promotion pipeline |

The retrieval pipeline

Every time the agent needs context — whether for answering a creator question, generating content, or making a scheduling decision — the following pipeline runs:
Creator query
    ↓
[1] HyDE — generate a hypothetical ideal answer, embed that instead of the raw query
    ↓
[2] Dense retrieval — pgvector cosine similarity against embedded document chunks
[2] BM25 sparse retrieval — keyword term matching against Nodes/ JSON files
    ↓
[3] RRF fusion — Reciprocal Rank Fusion merges both rank lists (k=60)
    ↓
[4] Synaptic propagation — seed nodes activate connected taxonomy nodes via LIF neurons
    ↓
[5] Shannon entropy gate — low-information nodes evicted; budget-constrained context assembled
    ↓
[6] CRAG validation — retrieved context graded; low-confidence results trigger web search or HITL escalation
    ↓
Validated context injected into agent prompt
1. HyDE: hypothetical document embeddings

Before the query ever touches the vector store, the agent generates a hypothetical ideal answer to the query and embeds that instead of the raw query text. This bridges the semantic distance between how a creator phrases a question and how the relevant documents are written. See HyDE retrieval.
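The mechanism is small enough to sketch. In the snippet below, `generate` and `embed` are illustrative stand-ins for a chat-model call and an embedding model, not GenieHelper's actual helpers:

```python
# Sketch of HyDE: embed a hypothetical ideal answer, not the raw query.
# `generate` and `embed` are illustrative stubs, not real implementations.

def generate(prompt: str) -> str:
    # Stand-in for an LLM call that drafts a hypothetical ideal answer.
    return f"A hypothetical passage answering: {prompt}"

def embed(text: str) -> list[float]:
    # Stand-in for a sentence-embedding model (toy 8-dimensional vector).
    return [float(ord(c) % 7) for c in text[:8]]

def hyde_embed(query: str) -> list[float]:
    """Generate a hypothetical answer to the query and embed *that*."""
    hypothetical = generate(f"Write a passage that would answer: {query}")
    return embed(hypothetical)
```

The vector store then receives `hyde_embed(query)` rather than `embed(query)`, so the nearest-neighbor search runs in answer space instead of question space.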
2. Dual retrieval: dense + sparse

Two independent retrieval methods run in parallel. Dense retrieval uses pgvector cosine similarity to find semantically related chunks. BM25 sparse retrieval runs keyword term matching against node JSON files — catching exact platform names, creator handles, and domain-specific terminology that vector search misses.
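The sparse leg is standard BM25 term weighting. A minimal, self-contained scorer, for illustration only (GenieHelper's actual implementation lives in the retrieval pipeline code):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Minimal BM25: score each tokenized doc against the query terms."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n          # average document length
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Because scoring is exact term matching, a query term like a platform name or creator handle scores zero against every document that lacks it, which is precisely the behavior that complements dense retrieval.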
3. RRF fusion

The two ranked lists are merged using Reciprocal Rank Fusion without requiring a trained reranker. Each document’s score is Σ 1/(60 + rank_i) across both lists, so strong agreement between methods pushes results to the top. See RRF fusion.
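The fusion step is small enough to show whole. A sketch using the k=60 constant from the pipeline description:

```python
def rrf_fuse(rank_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum of 1/(k + rank_i) over lists."""
    scores: dict[str, float] = {}
    for ranking in rank_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `rrf_fuse([["a", "b", "c"], ["b", "c", "a"]])` ranks `"b"` first: it places near the top of both lists, while `"a"` is first in one but last in the other.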
4. Synaptic propagation

The top-ranked nodes from RRF become seed nodes. A Leaky Integrate-and-Fire neuron model propagates activation through the taxonomy graph — nodes semantically adjacent to the seeds accumulate charge and fire into context. A query about “beach yoga” automatically surfaces fitness, outdoor, and lifestyle nodes without those terms appearing in the query. See Synaptic propagation.
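A toy version of the propagation step, with made-up node names, weights, and constants (the real thresholds, leak rates, and step counts are defined in the synaptic module):

```python
def lif_propagate(graph, seeds, threshold=1.0, leak=0.5, steps=3):
    """Toy leaky integrate-and-fire pass over a weighted adjacency dict.

    graph: {node: [(neighbor, weight), ...]}; each node fires at most once.
    """
    charge = {n: 0.0 for n in graph}
    fired = set(seeds)
    frontier = set(seeds)
    for _ in range(steps):
        for n in charge:                      # charge leaks away each step
            charge[n] *= leak
        for node in frontier:                 # nodes that fired inject charge
            for neighbor, weight in graph.get(node, ()):
                charge[neighbor] += weight
        frontier = {n for n, c in charge.items()
                    if c >= threshold and n not in fired}
        fired |= frontier                     # nodes over threshold fire
    return fired

# Hypothetical taxonomy fragment for the "beach yoga" example.
taxonomy = {
    "beach yoga": [("fitness", 1.2), ("outdoor", 0.6)],
    "fitness": [("lifestyle", 1.0)],
    "outdoor": [],
    "lifestyle": [],
}
```

Seeding with `"beach yoga"` fires `"fitness"` directly and `"lifestyle"` transitively, while `"outdoor"` never accumulates enough charge to cross the threshold before its charge leaks away.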
5. Shannon entropy gating

The expanded candidate set is pruned to fit within the context budget. Nodes are scored by Shannon entropy — high-redundancy, low-information chunks are evicted first. An eviction_report records what was dropped and why. See Entropy gating.
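A minimal sketch of the gate, using word-level Shannon entropy and a word-count budget (the production scorer and the eviction_report format will differ):

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Word-level Shannon entropy in bits; low values mean redundant text."""
    counts = Counter(text.split())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_gate(chunks: list[str], budget: int) -> tuple[list[str], list[str]]:
    """Keep the highest-entropy chunks within a word budget; report evictions."""
    kept, evicted, used = [], [], 0
    for chunk in sorted(chunks, key=shannon_entropy, reverse=True):
        size = len(chunk.split())
        if used + size <= budget:
            kept.append(chunk)
            used += size
        else:
            evicted.append(chunk)   # would feed the eviction_report
    return kept, evicted
```

A highly repetitive chunk scores near zero bits and is first out the door when the budget is tight, while an information-dense chunk of the same length survives.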
6. CRAG validation

Retrieved context is graded for relevance before injection. Low-confidence retrievals do not silently pass through — they trigger a web search fallback or escalation to the human-in-the-loop queue. The agent knows what it doesn’t know. See Entropy gating.
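The routing logic reduces to a three-way gate on the grader's score. A sketch with hypothetical confidence thresholds:

```python
def crag_route(grade: float,
               correct_threshold: float = 0.7,
               ambiguous_threshold: float = 0.4) -> str:
    """Route retrieved context by grader confidence, CRAG-style.

    Thresholds here are illustrative, not GenieHelper's tuned values.
    """
    if grade >= correct_threshold:
        return "inject"            # confident: pass context to the prompt
    if grade >= ambiguous_threshold:
        return "web_search"        # ambiguous: augment with external evidence
    return "hitl_escalation"       # low confidence: human-in-the-loop queue
```

The key design property is that there is no silent pass-through branch: every retrieval lands in exactly one of the three outcomes.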

The nightly consolidation cycle

Retrieval only works if the knowledge graph stays current. Every night, a consolidation cycle runs three processes, with a fourth planned:
| Time (UTC) | Process | What it does |
| --- | --- | --- |
| 02:00 | node-decay.mjs | Applies Hebbian decay to all user_nodes in Directus — recently activated nodes strengthen, dormant nodes decay toward status: 'decayed' |
| 02:30 | taxonomy-hebbian.mjs | Boosts CO_OCCURS edge weights between tags that co-appeared in creator content; applies global decay to the taxonomy graph |
| 02:45 | taxonomy-reconcile.mjs | Classifies pending_review tags as active |
| 03:00 | fp_growth.mjs | (Planned) Cross-creator FP-Growth pattern mining promotes collective patterns to Nodes/Universe/ |
The consolidation cycle is the feedback loop that makes GenieHelper self-improving. Without it, the taxonomy graph would accumulate dead weight from concepts creators tried once and abandoned, and never capture genuine trends emerging across the creator community.
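A single weight's nightly Hebbian update can be sketched as one function. The boost, decay, and floor constants below are illustrative, not the values node-decay.mjs or taxonomy-hebbian.mjs actually use:

```python
def hebbian_step(weight: float, co_activated: bool,
                 boost: float = 0.1, decay: float = 0.02,
                 floor: float = 0.05) -> float:
    """One nightly update for a single node or CO_OCCURS edge weight.

    Co-activation strengthens the weight (capped at 1.0); every weight
    decays a little; weights below the floor are pruned (returned as 0.0).
    """
    if co_activated:
        weight = min(1.0, weight + boost)
    weight *= 1.0 - decay
    return 0.0 if weight < floor else weight
```

Run nightly, this is what keeps the graph honest: a concept a creator uses weekly stays strong, while one they tried once drifts toward the floor and is pruned.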

Node tiers

Knowledge nodes flow through three tiers:
Nodes/User/{creator-uuid}/     ← per-creator activation records
        │  FP-Growth cross-user pattern mining (nightly)
        ▼
Nodes/Transitional/            ← patterns awaiting promotion review
        │  Promotion to universal graph
        ▼
Nodes/Universe/                ← canonical 3,205-node taxonomy graph

JIT skill hydration

Skills are not preloaded. At session start, surgical_context.py activate "<task>" runs stimulus propagation across the DuckDB skill graph and surfaces the top-N most relevant procedural skills for the current task. Only the relevant skills consume context window tokens.
```shell
python3 memory/core/surgical_context.py activate "schedule Instagram posts for this week"
```
This queries the 191-skill graph (252 nodes, 12,880+ edges) and returns the ranked subset of skills most relevant to scheduling — without loading the full skill library.

Implementation map

Skill graph

memory/core/ingest_skills.py, surgical_context.py, agent_memory.duckdb

Hybrid retrieval

memory/retrieval/rrf/, synaptic/, entropy/

Nightly consolidation

memory/consolidation/hebbian/node-decay.mjs, taxonomy-hebbian.mjs

Cross-user patterns

memory/consolidation/cross_user/ — FP-Growth mining (planned)
