Memory enables agents to maintain conversation context across multiple turns. Without memory, each agent.run() call is isolated: the agent has no knowledge of previous interactions.

Memory providers implement two core operations:
- **Read**: Load prior conversation history before execution
- **Write**: Persist the current conversation after execution
```typescript
// Source: packages/core/src/types/index.ts:117-127
export interface MemoryProvider {
  readonly type: string

  /** Load prior conversation history to inject before the current run */
  read(options: MemoryReadOptions): Promise<ModelMessage[]>

  /** Persist the messages produced by a completed run */
  write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void>

  /** Remove all stored memory (optional — not all providers support this) */
  clear?(sessionId?: string): Promise<void>

  /** Return raw entries for inspection/debugging */
  entries?(sessionId?: string): Promise<MemoryEntry[]>
}
```
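To make the contract concrete, here is a minimal in-memory sketch of a custom provider. It assumes ModelMessage and the option types are importable from @agentlib/core, and that MemoryWriteOptions carries a sessionId like its read counterpart; check the actual type definitions before copying.

```typescript
import type {
  MemoryProvider,
  MemoryReadOptions,
  MemoryWriteOptions,
  ModelMessage
} from '@agentlib/core'

// Minimal illustration of the MemoryProvider contract: one message list per session
class MapMemory implements MemoryProvider {
  readonly type = 'map'
  private store = new Map<string, ModelMessage[]>()

  async read(options: MemoryReadOptions): Promise<ModelMessage[]> {
    const messages = this.store.get(options.sessionId ?? 'default') ?? []
    // Honor the limit by keeping the most recent messages
    return options.limit ? messages.slice(-options.limit) : messages
  }

  async write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void> {
    const key = options.sessionId ?? 'default'
    this.store.set(key, (this.store.get(key) ?? []).concat(messages))
  }

  async clear(sessionId?: string): Promise<void> {
    if (sessionId) this.store.delete(sessionId)
    else this.store.clear()
  }
}
```

Since clear() and entries() are optional, the sketch implements only the required read/write pair plus a simple clear.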
Memory providers store conversation turns as entries with metadata:
```typescript
// Source: packages/core/src/types/index.ts:68-88
export interface MemoryEntry {
  id: string
  sessionId: string
  messages: ModelMessage[]
  metadata: MemoryMetadata
}

export interface MemoryMetadata {
  createdAt: Date

  /** ISO string of when this memory was last accessed */
  accessedAt?: Date | undefined

  /** Agent that created this memory */
  agentName?: string | undefined

  /** Arbitrary key/value tags for filtering */
  tags?: Record<string, string> | undefined

  /** Token count estimate for this entry */
  tokenCount?: number | undefined
}
```
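For illustration, a stored entry might look like this (all values are invented, and ModelMessage is assumed to follow the usual role/content shape):

```typescript
const entry: MemoryEntry = {
  id: 'entry-01',
  sessionId: 'user-42',
  messages: [
    { role: 'user', content: 'Hello' },
    { role: 'assistant', content: 'Hi! How can I help?' }
  ],
  metadata: {
    createdAt: new Date(),
    agentName: 'support-bot',   // which agent wrote this turn
    tags: { channel: 'web' },   // free-form filtering tags
    tokenCount: 12              // rough size used for budgeting
  }
}
```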
SlidingWindowMemory provides token-aware, in-memory storage: it tracks conversation turns and enforces a token budget on retrieved history.
```typescript
import { SlidingWindowMemory } from '@agentlib/memory'

const memory = new SlidingWindowMemory({
  maxTokens: 4000, // Token budget for retrieved history
  maxTurns: 30     // Max conversation turns to retain
})
```
```typescript
// One conversation per user
const sessionId = `user-${userId}`

await agent.run({ input: 'Hello', sessionId })
await agent.run({ input: 'Follow-up question', sessionId })
```
```typescript
// Multiple conversations per user
const sessionId = `user-${userId}-thread-${threadId}`

await agent.run({ input: 'Start new topic', sessionId })
```
```typescript
// No sessionId = random UUID per run = no memory between calls
await agent.run({ input: 'One-off question' })
```
```typescript
// Shared memory across all users (use with caution!)
const sessionId = 'global-knowledge-base'

await agent.run({ input: 'Add to shared context', sessionId })
```
```typescript
// Source: packages/core/src/types/index.ts:93-102
export interface MemoryReadOptions {
  /** Max number of messages to return (provider may trim older ones) */
  limit?: number | undefined

  /** Only return messages from this session */
  sessionId?: string | undefined

  /** Semantic query for vector-based providers */
  query?: string | undefined

  /** Filter by metadata tags */
  tags?: Record<string, string> | undefined
}
```
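For example, pulling a bounded slice of one session's history straight from a provider (session ID and tag values here are hypothetical):

```typescript
// Up to 20 most recent messages from one session
const history = await memory.read({ sessionId: 'user-42', limit: 20 })

// Vector-backed providers can additionally use the semantic query and tags
const related = await memory.read({ query: 'billing issues', tags: { channel: 'web' } })
```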
AgentLIB provides utilities for token estimation and trimming:
```typescript
import {
  estimateTokens,
  estimateMessagesTokens,
  trimToTokenBudget
} from '@agentlib/core'

// Estimate tokens in a string
const tokens = estimateTokens('Hello, world!') // ~4 tokens

// Estimate tokens across all messages
const total = estimateMessagesTokens(messages) // Sum of all message tokens

// Trim messages to fit budget (removes oldest first)
const trimmed = trimToTokenBudget(messages, 4000)
```
Source: packages/core/src/memory/tokens.ts
Token estimation uses a simple heuristic (chars / 4). For production use with strict token limits, integrate a proper tokenizer like tiktoken.
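The heuristic amounts to roughly the following, and a tokenizer-backed estimate can be swapped in where accuracy matters. The tiktoken usage below is a sketch; verify the package's API against its own docs.

```typescript
import { encoding_for_model } from 'tiktoken'

// Roughly what the built-in heuristic does: ~4 characters per token
const roughTokens = (text: string): number => Math.ceil(text.length / 4)

// Tokenizer-backed alternative for strict budgets
const exactTokens = (text: string): number => {
  const enc = encoding_for_model('gpt-4')
  try {
    return enc.encode(text).length
  } finally {
    enc.free() // tiktoken's WASM encoders must be freed explicitly
  }
}
```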
System messages are not stored. Memory providers filter out role: 'system' messages because they're re-injected by the agent on every run; only user/assistant/tool messages are persisted.

Source: packages/memory/src/buffer.ts:83
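A custom provider should do the same in its write path, along these lines:

```typescript
// Keep only user/assistant/tool turns; the agent re-injects the system prompt per run
const persistable = messages.filter(m => m.role !== 'system')
```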
**Use appropriate memory for your scale**

- Development → BufferMemory
- Production single-server → SlidingWindowMemory
- Production distributed → Custom (Redis, PostgreSQL, etc.; see the sketch below)
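A minimal sketch of such a provider, using ioredis with one Redis list per session. The option types and their sessionId fields are assumed to match the interfaces above; error handling, TTLs, and the optional entries() hook are omitted.

```typescript
import Redis from 'ioredis'
import type {
  MemoryProvider,
  MemoryReadOptions,
  MemoryWriteOptions,
  ModelMessage
} from '@agentlib/core'

class RedisMemory implements MemoryProvider {
  readonly type = 'redis'
  private redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379')

  private key(sessionId = 'default'): string {
    return `agent-memory:${sessionId}`
  }

  async read(options: MemoryReadOptions): Promise<ModelMessage[]> {
    // Each list element is one persisted batch of messages, oldest first
    const raw = await this.redis.lrange(this.key(options.sessionId), 0, -1)
    const messages = raw.flatMap(r => JSON.parse(r) as ModelMessage[])
    return options.limit ? messages.slice(-options.limit) : messages
  }

  async write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void> {
    // Drop system messages, mirroring the built-in providers
    const persistable = messages.filter(m => m.role !== 'system')
    if (persistable.length === 0) return
    await this.redis.rpush(this.key(options.sessionId), JSON.stringify(persistable))
  }

  async clear(sessionId?: string): Promise<void> {
    await this.redis.del(this.key(sessionId))
  }
}
```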
**Set token budgets**

- Models have context limits (e.g., GPT-4: 128k tokens)
- Reserve tokens for the system prompt, user input, and output
- Set maxTokens to 50-70% of the model limit (see the sketch below)
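For example, with a 128k-context model (the 60% split is a rule of thumb, not a library requirement):

```typescript
const MODEL_CONTEXT = 128_000

const memory = new SlidingWindowMemory({
  // ~60% of the window goes to retrieved history...
  maxTokens: Math.floor(MODEL_CONTEXT * 0.6),
  maxTurns: 50
})
// ...the rest stays free for the system prompt, current input, and output
```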
**Use consistent session IDs**
```typescript
// Good: deterministic session IDs
const sessionId = `user-${userId}`

// Bad: random IDs = no memory continuity
const sessionId = crypto.randomUUID()
```
**Clear stale sessions**
```typescript
// Periodic cleanup of sessions idle for more than 7 days
setInterval(async () => {
  const entries = await memory.entries()
  const stale = entries.filter(e => {
    // accessedAt is optional, so fall back to createdAt
    const lastUsed = e.metadata.accessedAt ?? e.metadata.createdAt
    return Date.now() - lastUsed.getTime() > 7 * 24 * 60 * 60 * 1000 // 7 days
  })
  for (const entry of stale) {
    await memory.clear(entry.sessionId)
  }
}, 60 * 60 * 1000) // Every hour
```