What is Memory?

Memory enables agents to maintain conversation context across multiple turns. Without memory, each agent.run() call is isolated—the agent has no knowledge of previous interactions. Memory providers implement two core operations:
  • Read - Load prior conversation history before execution
  • Write - Persist the current conversation after execution

Memory Provider Interface

All memory implementations satisfy this contract:
// Source: packages/core/src/types/index.ts:117-127
export interface MemoryProvider {
  readonly type: string
  /** Load prior conversation history to inject before the current run */
  read(options: MemoryReadOptions): Promise<ModelMessage[]>
  /** Persist the messages produced by a completed run */
  write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void>
  /** Remove all stored memory (optional — not all providers support this) */
  clear?(sessionId?: string): Promise<void>
  /** Return raw entries for inspection/debugging */
  entries?(sessionId?: string): Promise<MemoryEntry[]>
}
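
As an illustration, a minimal in-process provider satisfying this contract might look like the sketch below. The types are simplified local stand-ins (the real ModelMessage carries more fields), and the write semantics are a guess at the simplest reasonable behavior, not the library source:

```typescript
// Simplified local stand-ins for the types above (the real ones live in @agentlib/core)
type ModelMessage = { role: 'user' | 'assistant' | 'tool' | 'system'; content: string }

// A minimal provider: one message list per session, held in process memory
class MapMemory {
  readonly type = 'map'
  private store = new Map<string, ModelMessage[]>()

  async read(options: { sessionId?: string }): Promise<ModelMessage[]> {
    return this.store.get(options.sessionId ?? 'default') ?? []
  }

  async write(messages: ModelMessage[], options: { sessionId?: string }): Promise<void> {
    // Mirror the library's behavior of not persisting system messages
    this.store.set(options.sessionId ?? 'default', messages.filter(m => m.role !== 'system'))
  }

  async clear(sessionId?: string): Promise<void> {
    if (sessionId) this.store.delete(sessionId)
    else this.store.clear()
  }
}
```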

Memory Entry Structure

Memory providers store conversation turns as entries with metadata:
// Source: packages/core/src/types/index.ts:68-88
export interface MemoryEntry {
  id: string
  sessionId: string
  messages: ModelMessage[]
  metadata: MemoryMetadata
}

export interface MemoryMetadata {
  createdAt: Date
  /** When this memory was last accessed */
  accessedAt?: Date | undefined
  /** Agent that created this memory */
  agentName?: string | undefined
  /** Arbitrary key/value tags for filtering */
  tags?: Record<string, string> | undefined
  /** Token count estimate for this entry */
  tokenCount?: number | undefined
}
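
For example, a stored entry might look like the following (illustrative values only, with ModelMessage reduced to role/content for brevity):

```typescript
type ModelMessage = { role: string; content: string }

// Local mirror of the MemoryEntry shape shown above
interface MemoryEntry {
  id: string
  sessionId: string
  messages: ModelMessage[]
  metadata: {
    createdAt: Date
    accessedAt?: Date
    agentName?: string
    tags?: Record<string, string>
    tokenCount?: number
  }
}

const entry: MemoryEntry = {
  id: 'entry-001',
  sessionId: 'user-123',
  messages: [
    { role: 'user', content: 'What is your refund policy?' },
    { role: 'assistant', content: 'Refunds are available within 30 days.' },
  ],
  metadata: {
    createdAt: new Date(),
    agentName: 'support-agent',
    tags: { category: 'billing' },
    tokenCount: 24, // estimate, not exact
  },
}
```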

Built-in Memory Providers

AgentLIB includes several memory implementations:

BufferMemory

Simple in-memory storage. Best for development and single-process applications.
import { BufferMemory } from '@agentlib/memory'

const memory = new BufferMemory({
  maxMessages: 40, // Keep the last 40 messages
  maxTokens: 8000  // Optional: trim to token budget
})

const agent = createAgent({ name: 'assistant' })
  .memory(memory)

How it Works

// Source: packages/memory/src/buffer.ts:63-77
async read(options: MemoryReadOptions): Promise<ModelMessage[]> {
  const sessionId = options.sessionId ?? 'default'
  let messages = this.store.get(sessionId) ?? []

  // Update access time
  const entry = this.meta.get(sessionId)
  if (entry) entry.metadata.accessedAt = new Date()

  // Apply token budget if configured
  if (this.maxTokens) {
    messages = trimToTokenBudget(messages, this.maxTokens)
  }

  return messages
}
  • Stores messages in a Map<sessionId, ModelMessage[]>
  • Automatically trims to maxMessages on write
  • System messages are filtered out (re-injected by agent)
  • Persists for the lifetime of the process
Source: packages/memory/src/buffer.ts:49-126
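
The maxMessages trimming described above can be sketched as follows (a simplified stand-in, not the library source):

```typescript
type ModelMessage = { role: string; content: string }

// Keep only the most recent `maxMessages` entries, dropping the oldest first
function trimToMaxMessages(messages: ModelMessage[], maxMessages: number): ModelMessage[] {
  return messages.length > maxMessages ? messages.slice(-maxMessages) : messages
}
```

With `maxMessages: 40` as in the configuration example, a 45-message session keeps only the newest 40 after a write.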

SlidingWindowMemory

Token-aware in-memory storage. Tracks conversation turns and enforces token budgets.
import { SlidingWindowMemory } from '@agentlib/memory'

const memory = new SlidingWindowMemory({
  maxTokens: 4000, // Token budget for retrieved history
  maxTurns: 30     // Max conversation turns to retain
})

Conversation Turns

Instead of counting raw messages, SlidingWindowMemory groups them into logical turns:
// Source: packages/memory/src/sliding-window.ts:131-149
function groupIntoTurns(messages: ModelMessage[]): ConversationTurn[] {
  const turns: ConversationTurn[] = []
  let current: ModelMessage[] = []

  for (const msg of messages) {
    current.push(msg)

    // A turn ends when the assistant finishes (no pending tool calls)
    if (msg.role === 'assistant' && !msg.toolCalls?.length) {
      turns.push({ messages: current })
      current = []
    }
  }

  // Flush any trailing messages (e.g. incomplete tool call sequences)
  if (current.length) turns.push({ messages: current })

  return turns
}
A turn is:
  • User message
  • Assistant response (with optional tool calls)
  • All tool result messages
  • Final assistant message (if tools were called)
This ensures tool call sequences aren’t split across memory boundaries.
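
Applied to a sample conversation, the grouping logic above behaves like this (self-contained version with simplified message types):

```typescript
type ModelMessage = {
  role: 'user' | 'assistant' | 'tool'
  content: string
  toolCalls?: { id: string }[]
}
type ConversationTurn = { messages: ModelMessage[] }

function groupIntoTurns(messages: ModelMessage[]): ConversationTurn[] {
  const turns: ConversationTurn[] = []
  let current: ModelMessage[] = []
  for (const msg of messages) {
    current.push(msg)
    // A turn ends when the assistant finishes (no pending tool calls)
    if (msg.role === 'assistant' && !msg.toolCalls?.length) {
      turns.push({ messages: current })
      current = []
    }
  }
  if (current.length) turns.push({ messages: current })
  return turns
}

const conversation: ModelMessage[] = [
  { role: 'user', content: 'Weather in Paris?' },
  { role: 'assistant', content: '', toolCalls: [{ id: 'call_1' }] }, // turn continues
  { role: 'tool', content: '18°C, sunny' },
  { role: 'assistant', content: 'It is 18°C and sunny.' },           // turn 1 ends here
  { role: 'user', content: 'Thanks!' },
  { role: 'assistant', content: 'You are welcome.' },                // turn 2 ends here
]
const turns = groupIntoTurns(conversation)
// turns[0] holds the full 4-message tool sequence; turns[1] holds the follow-up
```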

Token Trimming

When maxTokens is set, older messages are dropped to stay within budget:
import { trimToTokenBudget } from '@agentlib/core'

// Removes oldest messages until total tokens ≤ maxTokens
const trimmed = trimToTokenBudget(messages, 4000)
Source: packages/memory/src/sliding-window.ts:52-67

CompositeMemory

Combine multiple memory strategies. Useful for:
  • Layered retrieval (e.g., recent + semantic)
  • Fallback strategies
  • Multi-source aggregation
import { CompositeMemory, BufferMemory } from '@agentlib/memory'
import { VectorMemory } from './custom-vector-memory'

const memory = new CompositeMemory({
  providers: [
    new BufferMemory({ maxMessages: 20 }),      // Recent conversation
    new VectorMemory({ topK: 5, threshold: 0.7 }) // Relevant past context
  ],
  mergeStrategy: 'interleave' // or 'concat' or 'priority'
})
Source: packages/memory/src/composite.ts
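
As a rough picture of what the 'concat' strategy does, reading can be thought of as querying each provider in registration order and joining the results (stub providers and simplified types; the real CompositeMemory is more involved):

```typescript
type ModelMessage = { role: string; content: string }
type Reader = { read(opts: { sessionId?: string }): Promise<ModelMessage[]> }

// Query every provider in order and concatenate their histories
async function readConcat(providers: Reader[], sessionId: string): Promise<ModelMessage[]> {
  const results = await Promise.all(providers.map(p => p.read({ sessionId })))
  return results.flat()
}

// Stubs standing in for BufferMemory / VectorMemory
const recent: Reader = { read: async () => [{ role: 'user', content: 'recent turn' }] }
const semantic: Reader = { read: async () => [{ role: 'user', content: 'relevant past turn' }] }
```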

SummarizingMemory

Automatically summarize old conversation history. Keeps recent messages verbatim and compresses older ones.
import { SummarizingMemory } from '@agentlib/memory'
import { openai } from '@agentlib/providers/openai'

const memory = new SummarizingMemory({
  recentMessages: 10,     // Keep last 10 messages verbatim
  summaryModel: openai({ model: 'gpt-3.5-turbo' }),
  summarizeEvery: 20,     // Summarize every 20 messages
  maxSummaryTokens: 500   // Budget for summary
})
Source: packages/memory/src/summarizing.ts

Memory Lifecycle

Memory integrates into the agent execution flow at two points:

1. Before Execution (Read)

// Source: packages/core/src/agent/agent.ts:123-127
if (ctx.memory) {
  await this._emitter.emit('memory:read', { sessionId: ctx.sessionId })
  const history = await ctx.memory.read({ sessionId: ctx.sessionId })
  ctx.state.messages = [...history, ...ctx.state.messages]
}
  1. Check if memory provider is configured
  2. Emit memory:read event
  3. Call memory.read() with the session ID
  4. Prepend retrieved history to ctx.state.messages
  5. Continue execution with full context

2. After Execution (Write)

// Source: packages/core/src/agent/agent.ts:138-143
if (ctx.memory) {
  await this._emitter.emit('memory:write', { sessionId: ctx.sessionId })
  await ctx.memory.write(ctx.state.messages, {
    sessionId: ctx.sessionId,
    agentName: this.config.name,
  })
}
  1. Check if memory provider is configured
  2. Emit memory:write event
  3. Call memory.write() with:
    • All messages from the run (including tool calls/results)
    • Session ID
    • Agent name
  4. Provider persists the conversation

Sessions

Sessions are the primary memory isolation boundary. Each session maintains an independent conversation history.

Session ID Strategies

// One conversation per user
const sessionId = `user-${userId}`

await agent.run({ input: 'Hello', sessionId })
await agent.run({ input: 'Follow-up question', sessionId })
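
Other strategies derive the ID from the conversation scope rather than the user. These helpers are illustrative, not part of the library:

```typescript
// One shared history per user across all their conversations
const forUser = (userId: string) => `user-${userId}`

// Independent history per chat thread or support ticket
const forThread = (userId: string, threadId: string) => `user-${userId}:thread-${threadId}`
```

The key property is determinism: the same user (or thread) must always map to the same session ID, or memory continuity is lost.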

Read Options

Memory providers support flexible retrieval:
// Source: packages/core/src/types/index.ts:93-102
export interface MemoryReadOptions {
  /** Max number of messages to return (provider may trim older ones) */
  limit?: number | undefined
  /** Only return messages from this session */
  sessionId?: string | undefined
  /** Semantic query for vector-based providers */
  query?: string | undefined
  /** Filter by metadata tags */
  tags?: Record<string, string> | undefined
}

Examples

// Limit to 10 most recent messages
const recent = await memory.read({ sessionId: 'user-123', limit: 10 })

// Semantic search (for vector memory providers)
const relevant = await memory.read({
  sessionId: 'user-123',
  query: 'previous discussions about pricing'
})

// Filter by tags
const tagged = await memory.read({
  sessionId: 'user-123',
  tags: { topic: 'billing', priority: 'high' }
})

Write Options

When persisting memory, you can attach metadata:
// Source: packages/core/src/types/index.ts:107-111
export interface MemoryWriteOptions {
  sessionId?: string | undefined
  tags?: Record<string, string> | undefined
  agentName?: string | undefined
}

Example: Tagged Memories

const agent = createAgent({ name: 'support-agent' })
  .memory(memory)

agent.on('run:end', async ({ state }) => {
  // Agent automatically writes, but you can add custom tags via middleware
})

// Or write manually
await memory.write(messages, {
  sessionId: 'user-123',
  agentName: 'support-agent',
  tags: {
    resolved: 'true',
    category: 'billing',
    sentiment: 'positive'
  }
})

Token Management

AgentLIB provides utilities for token estimation and trimming:
import {
  estimateTokens,
  estimateMessagesTokens,
  trimToTokenBudget
} from '@agentlib/core'

// Estimate tokens in a string
const tokens = estimateTokens('Hello, world!') // ~4 tokens

// Estimate tokens across all messages
const total = estimateMessagesTokens(messages) // Sum of all message tokens

// Trim messages to fit budget (removes oldest first)
const trimmed = trimToTokenBudget(messages, 4000)
Source: packages/core/src/memory/tokens.ts
Token estimation uses a simple heuristic (chars / 4). For production use with strict token limits, integrate a proper tokenizer like tiktoken.
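
The chars / 4 heuristic is simple enough to sketch in full. The following is a stand-in for the utilities above, not the library source, and the exact rounding behavior is an assumption:

```typescript
type ModelMessage = { role: string; content: string }

// Rough estimate: ~4 characters per token for English text
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

function estimateMessagesTokens(messages: ModelMessage[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
}

// Drop the oldest messages until the remainder fits the budget
function trimToTokenBudget(messages: ModelMessage[], maxTokens: number): ModelMessage[] {
  const trimmed = [...messages]
  while (trimmed.length && estimateMessagesTokens(trimmed) > maxTokens) {
    trimmed.shift()
  }
  return trimmed
}
```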

Custom Memory Providers

You can implement custom memory backends for databases, vector stores, Redis, etc.

Example: Redis Memory

import { MemoryProvider, MemoryReadOptions, MemoryWriteOptions, ModelMessage } from '@agentlib/core'
import Redis from 'ioredis'

class RedisMemory implements MemoryProvider {
  readonly type = 'redis'
  private redis: Redis

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl)
  }

  async read(options: MemoryReadOptions): Promise<ModelMessage[]> {
    const sessionId = options.sessionId ?? 'default'
    const key = `memory:${sessionId}`
    
    const data = await this.redis.get(key)
    if (!data) return []
    
    const messages = JSON.parse(data) as ModelMessage[]
    
    // Apply limit if specified
    if (options.limit) {
      return messages.slice(-options.limit)
    }
    
    return messages
  }

  async write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void> {
    const sessionId = options.sessionId ?? 'default'
    const key = `memory:${sessionId}`
    
    // Filter out system messages
    const toStore = messages.filter(m => m.role !== 'system')
    
    await this.redis.set(key, JSON.stringify(toStore))
    
    // Optional: set TTL
    await this.redis.expire(key, 60 * 60 * 24 * 7) // 7 days
  }

  async clear(sessionId?: string): Promise<void> {
    if (sessionId) {
      await this.redis.del(`memory:${sessionId}`)
    } else {
      // Clear all memory keys (KEYS blocks Redis; prefer SCAN for large keyspaces)
      const keys = await this.redis.keys('memory:*')
      if (keys.length) await this.redis.del(...keys)
    }
  }
}

// Usage
const memory = new RedisMemory('redis://localhost:6379')
agent.memory(memory)

Memory Best Practices

System messages are not stored. Memory providers filter out role: 'system' messages because they’re re-injected by the agent on every run. Only store user/assistant/tool messages.
Source: packages/memory/src/buffer.ts:83
  1. Use appropriate memory for your scale
    • Development → BufferMemory
    • Production single-server → SlidingWindowMemory
    • Production distributed → Custom (Redis, PostgreSQL, etc.)
  2. Set token budgets
    • Models have context limits (e.g., GPT-4: 128k tokens)
    • Reserve tokens for system prompt, user input, and output
    • Set maxTokens to 50-70% of model limit
  3. Use consistent session IDs
    // Good: deterministic session IDs
    const sessionId = `user-${userId}`
    
    // Bad: random IDs = no memory continuity
    const sessionId = crypto.randomUUID()
    
  4. Clear stale sessions
    // Periodic cleanup (requires a provider that implements entries() and clear())
    setInterval(async () => {
      const entries = (await memory.entries?.()) ?? []
      const stale = entries.filter(e => {
        // accessedAt is optional — fall back to createdAt
        const last = e.metadata.accessedAt ?? e.metadata.createdAt
        return Date.now() - last.getTime() > 7 * 24 * 60 * 60 * 1000 // 7 days
      })
      for (const entry of stale) {
        await memory.clear?.(entry.sessionId)
      }
    }, 60 * 60 * 1000) // Every hour
    
  5. Tag important conversations
    await memory.write(messages, {
      sessionId,
      tags: {
        resolved: 'true',
        feedback: 'positive',
        category: 'technical-support'
      }
    })
    

Next Steps

  • Agents - Learn about session management in agents
  • Reasoning - Understand how reasoning engines use memory
  • Events - Listen to memory read/write events
