
SummarizingMemory Example

SummarizingMemory uses a dedicated LLM to compress old conversation history into concise summaries, allowing you to maintain context from long conversations while staying within token limits.

What SummarizingMemory Does

SummarizingMemory:
  • Uses a dedicated (typically cheaper/faster) LLM for summarization
  • Compresses old conversation history into summaries
  • Maintains a “sliding window” of recent messages
  • Automatically triggers summarization when token threshold is reached
  • Preserves essential context while reducing token usage
  • Perfect for very long conversations where all context matters

Complete Working Example

import 'dotenv/config'

import { createAgent } from '@agentlib/core'
import { openai } from '@agentlib/openai'
import { SummarizingMemory } from '@agentlib/memory'
import { createLogger } from '@agentlib/logger'

/**
 * Summarizing Memory Example
 * 
 * Demonstrates how to use a dedicated (usually cheaper/faster) LLM 
 * to compress old conversation history into a concise summary.
 */

async function main() {
    // 1. Setup the dedicated model for summarization
    const summarizerModel = openai({
        apiKey: process.env['OPENAI_API_KEY'] ?? '',
        model: process.env['OPENAI_MODEL_SUMMARIZER'] ?? '',
        baseURL: process.env['OPENAI_BASE_URL'] ?? ''
    })

    // 2. Setup Summarizing Memory
    // We trigger summarization after 250 tokens to see it in action
    const memory = new SummarizingMemory({
        model: summarizerModel,
        activeWindowTokens: 250,
        summaryPrompt: 'Summarize the user profile and preferences accurately.'
    })

    const agent = createAgent({
        name: 'summarizer-agent',
        systemPrompt: 'You are a personalized assistant. Help the user plan a trip.',
    })
        .provider(openai({
            apiKey: process.env['OPENAI_API_KEY'] ?? '',
            model: process.env['OPENAI_MODEL'] ?? '',
            baseURL: process.env['OPENAI_BASE_URL'] ?? ''
        }))
        .memory(memory)
        .use(createLogger({ level: 'info' }))

    const sessionId = 'travel-planner'

    console.log('--- Summarizing Memory Demo (Compression at 250 tokens) ---\n')

    const interaction = [
        "I am planning a trip to Japan next April. I love sushi and nature.",
        "I want to stay for 2 weeks. My budget is around $5000.",
        "I prefer boutique hotels over large chains.",
        "I also want to visit some hidden gems, not just tourist spots.",
        "Tell me what you know about my travel preferences so far."
    ]

    for (const input of interaction) {
        console.log(`> User: ${input}`)
        const res = await agent.run({ input, sessionId })
        console.log(`Agent: ${res.output}\n`)

        // Check if a summary has been generated yet
        const currentSummary = memory.getSummary(sessionId)
        if (currentSummary) {
            console.log('-- CURRENT COMPRESSED SUMMARY --')
            console.log(currentSummary)
            console.log('--------------------------------\n')
        }
    }
}

main().catch(console.error)

Key Configuration

Dedicated Summarizer Model

Use a separate, often cheaper model for summarization:
const summarizerModel = openai({
    apiKey: process.env['OPENAI_API_KEY'] ?? '',
    model: 'gpt-4o-mini', // Cheaper model for summarization
    baseURL: process.env['OPENAI_BASE_URL'] ?? ''
})

Active Window Tokens

Defines when to trigger summarization:
const memory = new SummarizingMemory({
    model: summarizerModel,
    activeWindowTokens: 250,  // Summarize when exceeding this
    summaryPrompt: 'Summarize the user profile and preferences accurately.'
})

Custom Summary Prompt

Control how the summarization is performed:
summaryPrompt: 'Summarize the user profile and preferences accurately.'
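A single sentence works, but a more structured prompt gives the summarizer clearer instructions about what to keep and what to discard. The prompt below is purely illustrative; tune it to your own domain:

```typescript
// Illustrative only — adjust the "Preserve" and "Drop" lists for your use case.
const summaryPrompt = [
    'Condense the conversation into a short user profile.',
    'Preserve: names, dates, budgets, and stated preferences.',
    'Drop: greetings, filler, and details already captured.'
].join('\n')
```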

How It Works

  1. Recent Messages: Keeps recent messages in the “active window” (up to activeWindowTokens)
  2. Automatic Trigger: When the active window exceeds the token limit, triggers summarization
  3. Compression: Uses the dedicated model to summarize older messages
  4. Summary Storage: Stores the summary and removes the original messages
  5. Context Composition: Provides both summary and recent messages to the agent
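The steps above can be sketched in a few lines. This is a simplified illustration of the trigger-and-compress cycle, not the library's actual implementation; `estimateTokens` (a rough ~4-characters-per-token heuristic) and the `summarize` callback are stand-ins:

```typescript
// Simplified sketch of the summarization cycle described above.
type Message = { role: string; content: string }

// Rough token estimate: ~4 characters per token (a common heuristic).
const estimateTokens = (msgs: Message[]): number =>
    Math.ceil(msgs.map(m => m.content).join(' ').length / 4)

function compressIfNeeded(
    messages: Message[],
    summary: string,
    activeWindowTokens: number,
    summarize: (old: Message[], prevSummary: string) => string
): { messages: Message[]; summary: string } {
    // Step 1–2: if the active window is still within budget, do nothing.
    if (estimateTokens(messages) <= activeWindowTokens) {
        return { messages, summary }
    }
    // Step 3–4: summarize the older half, keep the recent half verbatim.
    const keep = Math.ceil(messages.length / 2)
    const old = messages.slice(0, messages.length - keep)
    const recent = messages.slice(messages.length - keep)
    // Step 5: the caller sends { summary, recent } to the agent as context.
    return { messages: recent, summary: summarize(old, summary) }
}
```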

Inspecting Summaries

You can access the current summary at any time:
const currentSummary = memory.getSummary(sessionId)
if (currentSummary) {
    console.log('Current summary:', currentSummary)
}

When to Use SummarizingMemory

  • Very long conversations where all context is important
  • Customer support scenarios with extended interaction history
  • Applications that need to remember details from the entire conversation
  • When token costs are a concern for long sessions
  • Scenarios where losing old context would degrade user experience

Cost Optimization

SummarizingMemory helps reduce costs by:
  • Using a cheaper model (e.g., GPT-4o-mini) for summarization
  • Compressing hundreds of tokens into a few dozen
  • Allowing the main agent to use fewer tokens per request
  • Maintaining context without sending entire conversation history
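To make the savings concrete, here is a back-of-the-envelope calculation. The numbers are illustrative, not real pricing or measured figures:

```typescript
// Tokens avoided per request once old history is replaced by
// a compressed summary plus a recent-message window.
function contextTokensSaved(
    historyTokens: number,  // full conversation history so far
    summaryTokens: number,  // size of the compressed summary
    windowTokens: number    // recent messages kept verbatim
): number {
    return Math.max(0, historyTokens - (summaryTokens + windowTokens))
}

// e.g. a 2,000-token history compressed to a 60-token summary
// alongside a 250-token active window saves ~1,690 tokens per request.
const saved = contextTokensSaved(2000, 60, 250)
```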

Comparison with Other Strategies

| Feature           | BufferMemory       | SlidingWindowMemory | SummarizingMemory  |
|-------------------|--------------------|---------------------|--------------------|
| Old context       | Dropped            | Dropped             | Summarized         |
| Token efficiency  | Low                | Medium              | High               |
| Context retention | Poor (long convos) | Poor (long convos)  | Excellent          |
| Complexity        | Simple             | Medium              | Advanced           |
| Additional cost   | None               | None                | Summarization LLM  |
| Best for          | Short chats        | Medium chats        | Long conversations |
