
Overview

OpenFang’s memory substrate provides a unified API over three specialized storage backends, all backed by a single SQLite database. Agents interact through one Memory trait that abstracts over structured KV storage, semantic search, and knowledge graphs.
The memory system is synchronous at the storage layer (SQLite) but exposed to the runtime through async APIs, so agent code never blocks on I/O while SQLite’s ACID guarantees are preserved.

The Three Layers

Structured Store

Key-value pairs, agent state, session persistence. Fast lookups by exact key.

Semantic Store

Text-based memory fragments with vector embeddings. Recalls relevant memories by similarity.

Knowledge Graph

Entities and relations extracted from conversations. Enables graph queries and reasoning.

Memory Substrate Architecture

pub struct MemorySubstrate {
    conn: Arc<Mutex<Connection>>,         // Shared SQLite connection
    structured: StructuredStore,
    semantic: SemanticStore,
    knowledge: KnowledgeStore,
    sessions: SessionStore,
    consolidation: ConsolidationEngine,   // Decay + compaction
    usage: UsageStore,                    // Token usage & cost tracking
}
All stores share a single SQLite connection with:
  • WAL mode (Write-Ahead Logging) for concurrent reads
  • 5-second busy timeout to handle contention
  • Foreign key constraints for referential integrity
  • Automatic migrations on first open
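In SQLite terms, the first three settings correspond to pragmas applied when the connection is opened (a sketch of the equivalent statements; the exact calls OpenFang issues are not shown here):

```sql
PRAGMA journal_mode = WAL;    -- readers proceed concurrently with one writer
PRAGMA busy_timeout = 5000;   -- wait up to 5 seconds on a locked database
PRAGMA foreign_keys = ON;     -- enforce referential integrity on every write
```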

1. Opening the Substrate

let substrate = MemorySubstrate::open(
    Path::new("~/.openfang/memory.db"),
    0.95  // Decay rate for consolidation (95% retention per day)
)?;
This creates the database if it doesn’t exist and runs all pending migrations.

2. Schema Initialization

Tables are created for:
  • agents: Agent manifests
  • sessions: Message history per agent
  • kv_store: Key-value pairs
  • memory_fragments: Semantic memory with optional embeddings
  • entities: Knowledge graph nodes
  • relations: Knowledge graph edges
  • usage_log: Token usage and cost tracking

3. Ready for Use

The substrate is wrapped in Arc<MemorySubstrate> and passed to the kernel and runtime.

Layer 1: Structured Store

The structured store is a simple key-value store scoped by agent ID:
// Write
memory.set(agent_id, "last_report_date", json!("2025-03-06")).await?;

// Read
let value = memory.get(agent_id, "last_report_date").await?;
assert_eq!(value, Some(json!("2025-03-06")));

// List all keys for an agent
let all_kv = memory.list_kv(agent_id).await?;

// Delete
memory.delete(agent_id, "last_report_date").await?;
Typical uses:
  • Agent state: Counters, flags, timestamps
  • Hand metrics: Dashboard metrics (e.g., reports_count)
  • Configuration overrides: Per-agent settings
  • Cached results: Expensive computations that don’t need semantic recall
CREATE TABLE kv_store (
    agent_id TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,  -- JSON serialized
    updated_at TEXT NOT NULL,
    PRIMARY KEY (agent_id, key)
);

Layer 2: Semantic Store

The semantic store holds memory fragments — snippets of text with optional vector embeddings for similarity search.

Storing Memories

memory.store(
    agent_id,
    "The user prefers concise explanations without jargon.",
    MemorySource::UserPreference
).await?;

memory.store(
    agent_id,
    "Company XYZ launched a new product on 2025-03-01.",
    MemorySource::WebSearch
).await?;
Memory sources tag where the information came from:
pub enum MemorySource {
    UserMessage,       // Extracted from user input
    AgentResponse,     // Generated by the agent
    ToolResult,        // Returned by a tool call
    WebSearch,         // Fetched from the web
    UserPreference,    // Explicit user settings
    SystemNote,        // Internal annotations
}

Recalling Memories

Uses SQLite full-text search (FTS5):
let memories = memory.recall(
    "product launch",
    5,  // Top 5 matches
    Some(MemoryFilter {
        agent_id: Some(agent_id),
        source: Some(MemorySource::WebSearch),
        ..Default::default()
    })
).await?;
This returns fragments ranked by:
  1. Keyword match score (BM25)
  2. Recency (newer memories rank higher)
  3. Access frequency (decay-based score)
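The exact weighting OpenFang applies is not specified here, but combining these three signals into one score could look like the following sketch (the weights are illustrative assumptions, not OpenFang's actual values):

```rust
/// Combine keyword relevance, recency, and access frequency into a single
/// ranking score. Weights and decay rate are illustrative only.
fn rank_score(bm25: f32, days_since_access: f32, access_count: u32, decay_rate: f32) -> f32 {
    let recency = decay_rate.powf(days_since_access);   // newer => closer to 1.0
    let frequency = (1.0 + access_count as f32).ln();   // diminishing returns
    bm25 * 0.6 + recency * 0.3 + frequency * 0.1
}
```

With this shape, a fragment accessed yesterday outranks an identical keyword match last touched a month ago.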
If an embedding driver is available, OpenFang uses cosine similarity for more accurate recall:
// Embed the query
let query_embedding = embedding_driver.embed_one("product launch").await?;

// Recall with vector
let memories = memory.recall_with_embedding_async(
    "product launch",
    5,
    Some(MemoryFilter { agent_id: Some(agent_id), ..Default::default() }),
    Some(&query_embedding)
).await?;
This computes cosine similarity between the query vector and all stored fragment vectors, returning the top K matches.
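Cosine similarity itself is a small computation; a minimal standalone sketch:

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns 0.0 for zero-magnitude vectors to avoid division by zero.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}
```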
CREATE TABLE memory_fragments (
    id TEXT PRIMARY KEY,
    agent_id TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding BLOB,              -- Optional: serialized f32 vector
    source TEXT NOT NULL,        -- MemorySource enum
    importance REAL DEFAULT 1.0,
    access_count INTEGER DEFAULT 0,
    created_at TEXT NOT NULL,
    last_accessed_at TEXT NOT NULL,
    FOREIGN KEY (agent_id) REFERENCES agents(id) ON DELETE CASCADE
);

CREATE VIRTUAL TABLE memory_fts USING fts5(content, agent_id);

Vector Embeddings

OpenFang supports pluggable embedding drivers:
pub trait EmbeddingDriver: Send + Sync {
    async fn embed_one(&self, text: &str) -> Result<Vec<f32>>;
    async fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>;
    fn dimensions(&self) -> usize;  // e.g., 384 for MiniLM, 1536 for OpenAI
}
Built-in drivers:
  • OpenAI (text-embedding-3-small, 1536 dims)
  • Voyage AI (voyage-2, 1024 dims)
  • Cohere (embed-english-v3.0, 1024 dims)
  • Local Transformers (via rust-bert crate, offline)
Configure in openfang.toml:
[embedding]
provider = "openai"
model = "text-embedding-3-small"
api_key_env = "OPENAI_API_KEY"
Embedding generation is not required — OpenFang falls back to full-text search (SQLite FTS5) if no driver is configured. Embeddings improve recall quality but add latency and cost.

Layer 3: Knowledge Graph

The knowledge store represents entities (nodes) and relations (edges) extracted from agent conversations.

Entities

pub struct Entity {
    pub id: String,              // UUID
    pub agent_id: AgentId,
    pub entity_type: String,     // "Person", "Company", "Product", etc.
    pub name: String,            // Canonical name
    pub attributes: HashMap<String, Value>,  // Arbitrary properties
    pub created_at: DateTime<Utc>,
}
Example:
memory.add_entity(Entity {
    entity_type: "Company".to_string(),
    name: "OpenAI".to_string(),
    attributes: hashmap!{
        "founded" => json!("2015"),
        "ceo" => json!("Sam Altman"),
        "industry" => json!("AI Research")
    },
    ..Default::default()
}).await?;

Relations

pub struct Relation {
    pub id: String,
    pub agent_id: AgentId,
    pub from_entity_id: String,
    pub relation_type: String,   // "works_for", "owns", "launched", etc.
    pub to_entity_id: String,
    pub properties: HashMap<String, Value>,
    pub created_at: DateTime<Utc>,
}
Example:
memory.add_relation(Relation {
    from_entity_id: person_id,
    relation_type: "works_for".to_string(),
    to_entity_id: company_id,
    properties: hashmap!{
        "role" => json!("CEO"),
        "since" => json!("2019")
    },
    ..Default::default()
}).await?;

Graph Queries

let related = memory.find_related(
    entity_id,
    Some("works_for"),  // Relation type filter
    1  // Max hops
).await?;
This performs a breadth-first search in the knowledge graph.
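A hop-limited BFS over an adjacency map can be sketched as follows (types simplified to string ids; the real implementation presumably walks the relations table rather than an in-memory map):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first search returning all entity ids reachable from `start`
/// within `max_hops` edges. `edges` maps an entity id to its neighbors.
fn find_related(edges: &HashMap<&str, Vec<&str>>, start: &str, max_hops: usize) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([(start.to_string(), 0usize)]);
    while let Some((node, depth)) = queue.pop_front() {
        if depth >= max_hops {
            continue; // stop expanding past the hop limit
        }
        for &next in edges.get(node.as_str()).into_iter().flatten() {
            if seen.insert(next.to_string()) {
                queue.push_back((next.to_string(), depth + 1));
            }
        }
    }
    seen
}
```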
let matches = memory.match_pattern(GraphPattern {
    from: Some(GraphNodePattern {
        entity_type: Some("Person".to_string()),
        name: None,
    }),
    relation: Some("works_for".to_string()),
    to: Some(GraphNodePattern {
        entity_type: Some("Company".to_string()),
        name: Some("OpenAI".to_string()),
    }),
}).await?;
Returns all (Person) -[works_for]-> (OpenAI) matches.
CREATE TABLE entities (
    id TEXT PRIMARY KEY,
    agent_id TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    name TEXT NOT NULL,
    attributes TEXT,  -- JSON
    created_at TEXT NOT NULL,
    FOREIGN KEY (agent_id) REFERENCES agents(id) ON DELETE CASCADE
);

CREATE TABLE relations (
    id TEXT PRIMARY KEY,
    agent_id TEXT NOT NULL,
    from_entity_id TEXT NOT NULL,
    relation_type TEXT NOT NULL,
    to_entity_id TEXT NOT NULL,
    properties TEXT,  -- JSON
    created_at TEXT NOT NULL,
    FOREIGN KEY (from_entity_id) REFERENCES entities(id) ON DELETE CASCADE,
    FOREIGN KEY (to_entity_id) REFERENCES entities(id) ON DELETE CASCADE
);

Memory Consolidation

Over time, memory fragments accumulate. The consolidation engine runs periodically to:
  1. Decay importance scores using exponential decay
  2. Merge similar fragments to reduce redundancy
  3. Delete low-importance memories below a threshold
  4. Compact embeddings by re-clustering and averaging
let report = memory.consolidate(agent_id, Some(0.3)).await?;

println!("Consolidated: {} fragments", report.fragments_processed);
println!("Deleted: {} low-importance", report.fragments_deleted);
println!("Merged: {} duplicates", report.fragments_merged);
Importance decays exponentially based on time since last access:
let days_since_access = (now - last_accessed).num_days() as f32;
let new_importance = old_importance * decay_rate.powf(days_since_access);
Default decay rate: 0.95 (95% retention per day, 50% after ~14 days)
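As a sanity check on those numbers, the half-life at a given per-day decay rate is ln(0.5) / ln(rate):

```rust
/// Number of days until importance halves at a given per-day decay rate.
fn half_life_days(decay_rate: f32) -> f32 {
    (0.5f32).ln() / decay_rate.ln()
}
```

At the default rate of 0.95 this comes out to about 13.5 days, matching the "~14 days" figure above.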
Two fragments are merged if:
  • Cosine similarity > 0.9 (for vector embeddings)
  • Edit distance < 10% (for text-only)
  • Same agent and source
The merged fragment retains the higher importance score and combines access counts.

Session Management

Every agent has sessions — isolated conversation threads:
pub struct Session {
    pub id: SessionId,
    pub agent_id: AgentId,
    pub messages: Vec<Message>,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}
Sessions are automatically saved after every agent loop iteration. On the next message, the session is loaded and the conversation continues.

1. Create or Load Session

let session = memory.get_or_create_session(agent_id).await?;
If no session exists, a new one is created with an empty message list.

2. Append Messages

During the agent loop, messages are appended:
session.messages.push(Message {
    role: Role::User,
    content: MessageContent::Text(user_message.to_string()),
    ..Default::default()
});

3. Save Session

memory.save_session(&session).await?;
This updates the database with the new message history.
Sessions are channel-scoped — an agent can have different sessions for Telegram vs Discord. This prevents context leakage across channels.

Usage Tracking

The UsageStore records every LLM call:
pub struct UsageRecord {
    pub agent_id: AgentId,
    pub model: String,
    pub prompt_tokens: u32,
    pub completion_tokens: u32,
    pub total_tokens: u32,
    pub cost_usd: f64,
    pub timestamp: DateTime<Utc>,
}
Query total spend:
let total_cost = memory.usage().total_cost(agent_id).await?;
let total_tokens = memory.usage().total_tokens(agent_id).await?;
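The aggregation behind these calls is a simple sum; a sketch over an in-memory list, using a trimmed-down stand-in for UsageRecord (the real store would sum over the usage_log table in SQL):

```rust
/// Minimal stand-in for UsageRecord with just the fields needed here.
struct UsageRecord {
    total_tokens: u32,
    cost_usd: f64,
}

/// Sum cost and tokens across an agent's usage records.
fn totals(records: &[UsageRecord]) -> (f64, u64) {
    records.iter().fold((0.0, 0u64), |(cost, tokens), r| {
        (cost + r.cost_usd, tokens + r.total_tokens as u64)
    })
}
```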
Rank agents by cost:
curl http://localhost:4200/api/budget/agents
Output:
[
  {"agent_id": "a1", "name": "researcher", "cost_usd": 12.34, "tokens": 450000},
  {"agent_id": "a2", "name": "coder", "cost_usd": 8.21, "tokens": 320000},
  ...
]

Memory Limits & Auto-Trimming

To prevent unbounded growth:
  • Session trimming: When a session exceeds 20 messages, the oldest are removed (keeping the most recent context)
  • Fragment pruning: During consolidation, fragments with importance < 0.1 are deleted
  • Embedding cache: Only the most recent 10,000 embeddings are cached in memory
SQLite databases can grow large over time. Run periodic consolidation or export/archive old sessions.
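The session-trimming rule can be sketched in a few lines (the message type is simplified to a string; 20 matches the documented limit):

```rust
/// Keep only the most recent `max_messages` entries, dropping the oldest.
fn trim_session(messages: &mut Vec<String>, max_messages: usize) {
    if messages.len() > max_messages {
        let excess = messages.len() - max_messages;
        messages.drain(..excess); // oldest messages sit at the front
    }
}
```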

Import/Export

Export an agent’s memory to JSON:
let json = memory.export(agent_id, ExportFormat::Json).await?;
std::fs::write("agent_memory.json", json)?;
Import it back:
let json = std::fs::read_to_string("agent_memory.json")?;
let report = memory.import(agent_id, &json, ExportFormat::Json).await?;

println!("Imported {} fragments", report.fragments_imported);
println!("Imported {} entities", report.entities_imported);

Next Steps

Memory API Reference

Full API documentation for memory operations

Agent Lifecycle

See how agents use memory during execution

Knowledge Graphs

Advanced guide to graph queries and entity extraction

Embeddings

Configure and optimize vector-based memory recall