Memory providers enable agents to maintain context across multiple interactions. AgentLIB offers several memory strategies, each optimized for different use cases.

Overview

Memory providers implement two core operations:
  • read(): Load conversation history before a run
  • write(): Persist messages after a run completes
All messages are scoped to a sessionId, allowing multiple independent conversations.
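Concretely, a provider is just these two operations keyed by session. The following is a minimal sketch of the shape such a provider might take — the interface and type names here are illustrative, not AgentLIB's actual exports:

```typescript
// Illustrative shapes only; the real AgentLIB types may differ.
interface Message {
  role: 'user' | 'assistant' | 'system'
  content: string
}

interface MemoryProvider {
  read(sessionId: string): Promise<Message[]>
  write(sessionId: string, messages: Message[]): Promise<void>
}

// A toy provider backed by a Map keyed by sessionId.
class InMemoryProvider implements MemoryProvider {
  private store = new Map<string, Message[]>()

  async read(sessionId: string): Promise<Message[]> {
    return this.store.get(sessionId) ?? []
  }

  async write(sessionId: string, messages: Message[]): Promise<void> {
    const existing = this.store.get(sessionId) ?? []
    this.store.set(sessionId, [...existing, ...messages])
  }
}
```

Because every operation takes a sessionId, two sessions never see each other's messages — the built-in providers below all follow this contract.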

Buffer Memory

The simplest memory provider keeps recent messages in memory.
import { createAgent } from '@agentlib/core'
import { BufferMemory } from '@agentlib/memory'
import { openai } from '@agentlib/openai'

const memory = new BufferMemory({
  maxMessages: 10  // Keep last 10 messages
})

const agent = createAgent({ name: 'chat-agent' })
  .provider(openai({ apiKey: process.env.OPENAI_API_KEY }))
  .memory(memory)

const sessionId = 'user-123'

// First conversation
await agent.run({
  input: 'Hi! My name is Alice.',
  sessionId,
})

// Second conversation - remembers previous context
await agent.run({
  input: 'What is my name?',
  sessionId,
})
// Agent responds: "Your name is Alice."
BufferMemory is ideal for development and short conversations. For production, use persistent memory providers.

Sliding Window Memory

Automatically trims messages to fit within a token budget while keeping recent context.
1. Configure Sliding Window

import { SlidingWindowMemory } from '@agentlib/memory'

const memory = new SlidingWindowMemory({
  maxTokens: 300,  // Token budget
  maxTurns: 5,     // Maximum conversation turns
})
2. Attach to Agent

const agent = createAgent({
  name: 'sliding-agent',
  systemPrompt: 'You are a helpful assistant.',
})
  .provider(openai({ apiKey: process.env.OPENAI_API_KEY }))
  .memory(memory)
3. Use Across Multiple Turns

const sessionId = 'session-456'

const turns = [
  "My first favorite fruit is Apple.",
  "My second favorite fruit is Banana.",
  "My third favorite fruit is Cherry.",
  "My fourth favorite fruit is Dragonfruit.",
  "My fifth favorite fruit is Elderberry.",
  "Can you list all the fruits I mentioned?"
]

for (const input of turns) {
  const res = await agent.run({ input, sessionId })
  console.log(res.output)
}
4. Inspect Memory Stats

const stats = memory.stats(sessionId)
console.log(`Current turns: ${stats?.turns}`)
console.log(`Estimated tokens: ${stats?.estimatedTokens} / 300`)
Sliding window is perfect for long-running conversations where you want to keep recent context without exceeding model limits.
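The trimming rule itself can be sketched as follows. This assumes a rough four-characters-per-token estimate; the library's actual heuristic and internals may differ:

```typescript
// One user/assistant exchange.
interface Turn { input: string; output: string }

// Crude token estimate: ~4 characters per token (an assumption).
const estimateTokens = (t: Turn) =>
  Math.ceil((t.input.length + t.output.length) / 4)

// Enforce the turn cap first, then drop the oldest turns
// until the remaining window fits the token budget.
function trimWindow(turns: Turn[], maxTokens: number, maxTurns: number): Turn[] {
  let kept = turns.slice(-maxTurns)
  while (
    kept.length > 1 &&
    kept.reduce((sum, t) => sum + estimateTokens(t), 0) > maxTokens
  ) {
    kept = kept.slice(1) // discard the oldest turn
  }
  return kept
}
```

Note that trimming always discards from the oldest end, so the most recent turns survive — which is why late questions like "list all the fruits" may only see the last few.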

Summarizing Memory

Compresses old conversation history into a concise summary using a dedicated LLM.
import { SummarizingMemory } from '@agentlib/memory'
import { openai } from '@agentlib/openai'

// Use a cheaper/faster model for summarization
const summarizerModel = openai({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'gpt-4o-mini',
})

const memory = new SummarizingMemory({
  model: summarizerModel,
  activeWindowTokens: 250,  // Trigger summarization at 250 tokens
  summaryPrompt: 'Summarize the user profile and preferences accurately.',
})

const agent = createAgent({
  name: 'summarizer-agent',
  systemPrompt: 'You are a personalized travel assistant.',
})
  .provider(openai({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
  }))
  .memory(memory)

const sessionId = 'travel-planner'

const interaction = [
  "I am planning a trip to Japan next April. I love sushi and nature.",
  "I want to stay for 2 weeks. My budget is around $5000.",
  "I prefer boutique hotels over large chains.",
  "I also want to visit some hidden gems, not just tourist spots.",
  "Tell me what you know about my travel preferences so far."
]

for (const input of interaction) {
  const res = await agent.run({ input, sessionId })
  console.log(res.output)

  // Check if a summary has been generated
  const currentSummary = memory.getSummary(sessionId)
  if (currentSummary) {
    console.log('Summary:', currentSummary)
  }
}

How It Works

  1. Recent messages are kept in an “active window”
  2. When the active window exceeds activeWindowTokens, older messages are summarized
  3. The summary is prepended to the active window in future runs
  4. This keeps context while reducing token usage
Summarization uses additional LLM calls. Use a cheaper model for the summarizer to control costs.
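The steps above can be sketched with a stub standing in for the summarizer LLM call. The function names, the halving split, and the four-characters-per-token estimate are illustrative assumptions, not AgentLIB internals:

```typescript
interface Msg { role: string; content: string }

// Crude token estimate: ~4 characters per token (an assumption).
const estimate = (msgs: Msg[]) =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / 4)

// If history exceeds the active window, summarize the older half
// and prepend the summary to the recent messages.
async function compact(
  history: Msg[],
  activeWindowTokens: number,
  summarize: (older: Msg[]) => Promise<string>,
): Promise<Msg[]> {
  if (estimate(history) <= activeWindowTokens) return history
  const cut = Math.floor(history.length / 2)
  const summary = await summarize(history.slice(0, cut))
  return [
    { role: 'system', content: `Summary of earlier conversation: ${summary}` },
    ...history.slice(cut),
  ]
}
```

In the real provider, `summarize` is the extra LLM call the note above warns about — which is exactly why pointing it at a cheaper model pays off.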

Composite Memory

Combines multiple memory providers for layered storage (e.g., fast cache + persistent storage).
import { createAgent } from '@agentlib/core'
import { openai } from '@agentlib/openai'
import { CompositeMemory, BufferMemory } from '@agentlib/memory'
import { RedisMemory } from '@agentlib/redis' // hypothetical

const model = openai({ apiKey: process.env.OPENAI_API_KEY })

const memory = new CompositeMemory({
  providers: [
    new BufferMemory({ maxMessages: 20 }),    // Fast L1 cache
    new RedisMemory({ url: process.env.REDIS_URL }), // Persistent L2
  ],
  readStrategy: 'first-hit', // Return from first provider with data
})

const agent = createAgent({ name: 'multi-tier-agent' })
  .provider(model)
  .memory(memory)

Read Strategies

  • first-hit (default): Return the first non-empty result
  • merge: Combine results from all providers and deduplicate

Behavior

  • Reads: Query providers in order until one returns data
  • Writes: Fan out to ALL providers in parallel
Use CompositeMemory to combine fast local memory with persistent remote storage for optimal performance.
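Both read strategies and the fan-out write can be sketched as follows. The provider shape here is an assumption for illustration, not the library's actual types:

```typescript
// Simplified provider shape (messages as strings for brevity).
interface Provider {
  read(sessionId: string): Promise<string[]>
  write(sessionId: string, messages: string[]): Promise<void>
}

async function compositeRead(
  providers: Provider[],
  sessionId: string,
  strategy: 'first-hit' | 'merge',
): Promise<string[]> {
  if (strategy === 'first-hit') {
    // Query providers in order; first non-empty result wins.
    for (const p of providers) {
      const msgs = await p.read(sessionId)
      if (msgs.length > 0) return msgs
    }
    return []
  }
  // merge: combine all providers' results and deduplicate.
  const all = await Promise.all(providers.map((p) => p.read(sessionId)))
  return [...new Set(all.flat())]
}

// Writes fan out to every provider in parallel.
const compositeWrite = (providers: Provider[], sessionId: string, messages: string[]) =>
  Promise.all(providers.map((p) => p.write(sessionId, messages)))
```

With `first-hit`, a warm L1 cache answers without ever touching Redis; `merge` trades an extra round trip for completeness after the cache has been evicted.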

Working with Sessions

All memory providers use session IDs to scope conversations:
import { createAgent } from '@agentlib/core'
import { openai } from '@agentlib/openai'
import { BufferMemory } from '@agentlib/memory'

const model = openai({ apiKey: process.env.OPENAI_API_KEY })
const memory = new BufferMemory({ maxMessages: 10 })
const agent = createAgent({ name: 'agent' })
  .provider(model)
  .memory(memory)

// Conversation for user A
await agent.run({
  input: 'My name is Alice.',
  sessionId: 'user-alice',
})

// Conversation for user B (completely separate)
await agent.run({
  input: 'My name is Bob.',
  sessionId: 'user-bob',
})

// Retrieve Alice's conversation
await agent.run({
  input: 'What is my name?',
  sessionId: 'user-alice',
})
// Agent responds: "Your name is Alice."

Inspecting Memory

All memory providers support inspection:
// Get raw entries for a session
const entries = await memory.entries('user-123')
console.log(`Messages in session: ${entries[0]?.messages.length}`)

// Clear a specific session
await memory.clear('user-123')

// Clear all sessions
await memory.clear()

Memory Events

Monitor memory operations with events:
agent.on('memory:read', ({ sessionId }) => {
  console.log(`Loading memory for session: ${sessionId}`)
})

agent.on('memory:write', ({ sessionId }) => {
  console.log(`Saving memory for session: ${sessionId}`)
})

const result = await agent.run({
  input: 'Hello',
  sessionId: 'user-789',
})

Complete Example

import 'dotenv/config'
import { createAgent } from '@agentlib/core'
import { openai } from '@agentlib/openai'
import { BufferMemory } from '@agentlib/memory'
import { createLogger } from '@agentlib/logger'

const memory = new BufferMemory({
  maxMessages: 10
})

const agent = createAgent({
  name: 'memory-demo-agent',
  systemPrompt: "You are a friendly assistant. Remember the user's name and preferences.",
})
  .provider(openai({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  }))
  .memory(memory)
  .use(createLogger({ level: 'info' }))

const sessionId = 'user-session-123'

console.log('--- Conversation Start ---\n')

// First turn
console.log('> User: Hi! My name is Sammy and I love coding in TypeScript.')
const res1 = await agent.run({
  input: 'Hi! My name is Sammy and I love coding in TypeScript.',
  sessionId
})
console.log(`\nAgent: ${res1.output}\n`)

// Second turn - agent remembers context
console.log('> User: What is my favorite language?')
const res2 = await agent.run({
  input: 'What is my favorite language?',
  sessionId
})
console.log(`\nAgent: ${res2.output}\n`)

// Third turn
console.log('> User: Do you remember my name?')
const res3 = await agent.run({
  input: 'Do you remember my name?',
  sessionId
})
console.log(`\nAgent: ${res3.output}\n`)

// Inspect memory
const entries = await memory.entries(sessionId)
console.log(`Messages in session: ${entries[0]?.messages.length}`)

Choosing the Right Strategy

| Strategy | Best For | Token Management | Persistence |
| --- | --- | --- | --- |
| BufferMemory | Development, short chats | Manual (maxMessages) | In-memory only |
| SlidingWindowMemory | Long conversations | Automatic trimming | In-memory only |
| SummarizingMemory | Very long sessions | Intelligent compression | In-memory only |
| CompositeMemory | Production apps | Delegated to providers | Multi-tier |
