Overview

SummarizingMemory compresses old conversation context using a language model. When stored history exceeds activeWindowTokens, older messages are summarized into a single system-style message. Best for: long-running conversations that need to maintain context without exceeding token limits.

Constructor

import { SummarizingMemory } from '@agentlib/memory'
import { openai } from '@agentlib/openai'

const memory = new SummarizingMemory(config)

Configuration

  • model (ModelProvider, required) - The model provider used to generate summaries. Should be a fast/cheap model (e.g. gpt-4o-mini).
  • activeWindowTokens (number, default: 3000) - Token budget for the active (non-summarized) window. When exceeded, the oldest messages are compressed into a summary.
  • summaryMaxTokens (number, default: 600) - Maximum tokens to allow for the compressed summary itself.
  • summaryPrompt (string, default: "You are a memory compression assistant...") - Custom prompt used to generate summaries.

Methods

read()

Retrieve conversation history with summary injected.
async read(options: MemoryReadOptions): Promise<ModelMessage[]>
Parameters:
  • options.sessionId - Session identifier (defaults to 'default')
Returns: Array of messages, with the summary (if one exists) as a system message followed by the active messages.
Behavior:
  • Injects summary as a system message: [Conversation summary so far]\n{summary}
  • Appends recent active window messages
  • Updates access timestamp
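The assembly described above can be sketched in plain TypeScript. This is an illustrative reconstruction, not the library's source: the `assembleRead` helper and the inline `ModelMessage` shape are assumptions made for the sketch.

```typescript
// Minimal message shape assumed for this sketch.
type ModelMessage = { role: 'system' | 'user' | 'assistant'; content: string }

// Sketch of how read() assembles its result: summary first (as a system
// message), then the recent active-window messages.
function assembleRead(summary: string | null, active: ModelMessage[]): ModelMessage[] {
  const result: ModelMessage[] = []
  if (summary) {
    // Summary is injected as a system message with the documented prefix.
    result.push({ role: 'system', content: `[Conversation summary so far]\n${summary}` })
  }
  // Recent active-window messages follow the summary.
  return result.concat(active)
}
```

When no summary exists yet, the result is just the active messages.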

write()

Persist messages and compress if needed.
async write(messages: ModelMessage[], options: MemoryWriteOptions): Promise<void>
Parameters:
  • messages - Array of messages to store
  • options.sessionId - Session identifier (defaults to 'default')
  • options.agentName - Name of the agent storing the messages
  • options.tags - Metadata tags
Behavior:
  • Stores non-system messages in active window
  • Checks if active window exceeds activeWindowTokens
  • Triggers compression if budget exceeded
  • Updates access timestamp
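The budget check can be sketched as follows. The ~4-characters-per-token estimate is an assumption made for illustration; the library's actual token counter is not shown in these docs, and `shouldCompress` is a hypothetical helper, not part of the API.

```typescript
type ModelMessage = { role: 'system' | 'user' | 'assistant'; content: string }

// Sketch of write()'s compression trigger: keep only non-system messages
// for the active window, then compare the estimated token count to the budget.
function shouldCompress(incoming: ModelMessage[], activeWindowTokens: number): boolean {
  const active = incoming.filter((m) => m.role !== 'system')
  // Crude token estimate: ~4 characters per token (an assumption,
  // not the library's actual counter).
  const tokens = active.reduce((n, m) => n + Math.ceil(m.content.length / 4), 0)
  return tokens > activeWindowTokens
}
```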

clear()

Remove stored memory.
async clear(sessionId?: string): Promise<void>
Parameters:
  • sessionId - If provided, clears only that session. Otherwise clears all sessions.

entries()

Retrieve raw memory entries for inspection/debugging.
async entries(sessionId?: string): Promise<MemoryEntry[]>
Parameters:
  • sessionId - If provided, returns only that session’s entry
Returns: Array of MemoryEntry objects with metadata.

getSummary()

Get the current summary for a session (for debugging/inspection).
getSummary(sessionId: string): string | null
Parameters:
  • sessionId - Session to inspect
Returns: The summary text, or null if no summary exists.

Usage Examples

Basic Usage

import { Agent } from '@agentlib/core'
import { SummarizingMemory } from '@agentlib/memory'
import { openai } from '@agentlib/openai'

const memory = new SummarizingMemory({
  model: openai({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' }),
  activeWindowTokens: 3000,
})

const agent = new Agent({
  name: 'assistant',
  memory,
})

Custom Summary Prompt

const memory = new SummarizingMemory({
  model: openai({ model: 'gpt-4o-mini' }),
  summaryPrompt: `You are a technical assistant.
Summarize the conversation focusing on:
- Code changes made
- Technical decisions
- Outstanding issues
Be extremely concise.`,
})

Inspecting Summaries

const memory = new SummarizingMemory({
  model: openai({ model: 'gpt-4o-mini' }),
})

// After a long conversation
const summary = memory.getSummary('user-123')
if (summary) {
  console.log('Current summary:', summary)
}

Production Configuration

const memory = new SummarizingMemory({
  model: openai({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini', // Use cheap model for summaries
  }),
  activeWindowTokens: 4000,  // Keep recent context
  summaryMaxTokens: 800,     // Allow detailed summaries
})

How Compression Works

When the active window exceeds activeWindowTokens:
  1. Split: Messages are split in half
  2. Compress: First half is sent to the model for summarization
  3. Keep: Second half remains as active messages
  4. Merge: If a summary already exists, it’s included in the compression prompt
  5. Store: New summary replaces the old one, active window is trimmed
Example flow:
// Before compression (8000 tokens)
Active: [msg1, msg2, msg3, msg4, msg5, msg6, msg7, msg8]

// After compression (4000 tokens)
Summary: "User asked about X, agent explained Y..."
Active: [msg5, msg6, msg7, msg8]

// Next compression merges summaries
Summary: "Previous: User asked about X... New: Discussed Z..."
Active: [msg7, msg8, msg9, msg10]
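The five steps above can be sketched end-to-end as a simplified model. The `summarize` stub stands in for the model call (the real implementation in /packages/memory/src/summarizing.ts sends the summary prompt to the configured ModelProvider, and also enforces summaryMaxTokens, which this sketch omits):

```typescript
type CompressState = { summary: string | null; active: string[] }

// Stub standing in for the model call; an existing summary is merged into
// the compression input, as described in step 4.
function summarize(prior: string | null, messages: string[]): string {
  return prior ? `${prior} ${messages.join(' ')}` : messages.join(' ')
}

function compress(state: CompressState): CompressState {
  // 1. Split: messages are split in half.
  const mid = Math.floor(state.active.length / 2)
  const olderHalf = state.active.slice(0, mid)
  const recentHalf = state.active.slice(mid)
  // 2 & 4. Compress the older half, merging in any existing summary.
  const summary = summarize(state.summary, olderHalf)
  // 3 & 5. The recent half stays active; the new summary replaces the old.
  return { summary, active: recentHalf }
}
```

Running `compress` twice reproduces the flow above: the second pass folds the first summary into the new one while the active window keeps only the most recent messages.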

Default Summary Prompt

You are a memory compression assistant.
Summarize the following conversation concisely, preserving key facts, decisions, and context.
Output only the summary — no preamble or commentary.

Type Reference

Source: /packages/memory/src/summarizing.ts:12-36
interface SummarizingMemoryConfig {
  model: ModelProvider
  activeWindowTokens?: number  // default: 3000
  summaryMaxTokens?: number    // default: 600
  summaryPrompt?: string
}
