
Overview

The chain summarization system manages conversation context growth by selectively summarizing older messages while maintaining conversation coherence. This is critical for preventing token limits from being exceeded during long penetration testing sessions.

How It Works

The algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types including tool calls and their responses. All summarization operations maintain critical conversation flow while reducing context size.
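As a rough illustration of such a structure, here is a minimal Go sketch; the type and field names (ChainAST, Section, BodyPair) are assumptions based on the description above, not PentAGI's actual definitions.

```go
package main

import "fmt"

// Hypothetical sketch of a ChainAST-like structure. A chain is a list of
// sections; each section has a header message and body pairs; a body pair
// is an AI message plus the tool responses that answer it.
type MessageType int

const (
	Human MessageType = iota
	AI
	ToolResponse
)

type Message struct {
	Type    MessageType
	Content string
}

type BodyPair struct {
	AIMsg     Message
	Responses []Message // tool responses tied to the AI message
}

type Section struct {
	Header Message    // usually the human message opening the section
	Pairs  []BodyPair // tool-call/response exchanges
}

type ChainAST struct {
	Sections []Section
}

// Size returns the total byte size of a section's content, the quantity
// the summarizer compares against limits like SUMMARIZER_LAST_SEC_BYTES.
func (s Section) Size() int {
	n := len(s.Header.Content)
	for _, p := range s.Pairs {
		n += len(p.AIMsg.Content)
		for _, r := range p.Responses {
			n += len(r.Content)
		}
	}
	return n
}

func main() {
	sec := Section{
		Header: Message{Human, "scan the target"},
		Pairs: []BodyPair{{
			AIMsg:     Message{AI, "running nmap"},
			Responses: []Message{{ToolResponse, "22/tcp open ssh"}},
		}},
	}
	fmt.Println(sec.Size()) // → 42
}
```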

Summarization Strategies

PentAGI uses three types of summarization:

1. Section Summarization

Collapses every section except the last N into a header plus a single body pair. This reduces the number of message pairs while preserving critical context.
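A minimal sketch of this strategy, treating each section as a slice of body-pair strings (a simplification of the real ChainAST; the function name is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// collapseSections folds every section except the last keepLast into a
// single summarized body pair; recent sections are left untouched.
// Illustrative simplification, not PentAGI's actual code.
func collapseSections(sections [][]string, keepLast int, summarize func([]string) string) [][]string {
	out := make([][]string, len(sections))
	for i, sec := range sections {
		if i >= len(sections)-keepLast || len(sec) <= 1 {
			out[i] = sec // recent or already-collapsed sections stay intact
			continue
		}
		out[i] = []string{summarize(sec)} // one body pair remains under the header
	}
	return out
}

func main() {
	join := func(sec []string) string { return strings.Join(sec, "; ") }
	fmt.Println(collapseSections([][]string{{"a", "b"}, {"c", "d"}, {"e", "f"}}, 1, join))
}
```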

2. Last Section Rotation

Manages the size of the last (active) section by summarizing the oldest pairs when size limits are exceeded. The most recent body pair is always preserved to maintain reasoning signatures required by providers such as Gemini and Anthropic.
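A sketch of the rotation logic, again treating each body pair as a plain string; the rotate helper and its summary-folding behavior are assumptions drawn from the description above, not PentAGI's actual implementation.

```go
package main

import "fmt"

// rotate folds the oldest body pairs of the active section into one summary
// whenever the section exceeds the byte limit. The newest pair is never
// folded, preserving provider-required reasoning signatures.
func rotate(pairs []string, limit int, summarize func(string) string) []string {
	total := 0
	for _, p := range pairs {
		total += len(p)
	}
	if total <= limit || len(pairs) < 2 {
		return pairs // under the limit, or nothing safe to fold
	}
	i, folded := 0, ""
	for total > limit && i < len(pairs)-1 {
		total -= len(pairs[i])
		folded += pairs[i]
		i++
	}
	return append([]string{summarize(folded)}, pairs[i:]...)
}

func main() {
	short := func(s string) string { return "summary" }
	fmt.Println(rotate([]string{"aaaa", "bbbb", "cc"}, 3, short)) // → [summary cc]
}
```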

3. QA Pair Summarization

Creates a summary section containing essential question-answer exchanges when enabled. This focuses on preserving the core interaction patterns while reducing token usage.

Configuration Options

Global Summarizer Configuration

  • Preserve Last (SUMMARIZER_PRESERVE_LAST, default: true): whether to keep all messages in the last section intact
  • Use QA Pairs (SUMMARIZER_USE_QA, default: true): whether to use the QA pair summarization strategy
  • Summarize Human in QA (SUMMARIZER_SUM_MSG_HUMAN_IN_QA, default: false): whether to summarize human messages in QA pairs
  • Last Section Size (SUMMARIZER_LAST_SEC_BYTES, default: 51200): maximum byte size of the last section (50KB)
  • Max Body Pair Size (SUMMARIZER_MAX_BP_BYTES, default: 16384): maximum byte size of a single body pair (16KB)
  • Max QA Sections (SUMMARIZER_MAX_QA_SECTIONS, default: 10): maximum number of QA pair sections to preserve
  • Max QA Size (SUMMARIZER_MAX_QA_BYTES, default: 65536): maximum byte size of QA pair sections (64KB)
  • Keep QA Sections (SUMMARIZER_KEEP_QA_SECTIONS, default: 1): number of recent QA sections to keep without summarization

Assistant Summarizer Configuration

Assistant instances use customized settings for enhanced context retention:
  • Preserve Last (ASSISTANT_SUMMARIZER_PRESERVE_LAST, default: true): whether to preserve all messages in the assistant’s last section
  • Last Section Size (ASSISTANT_SUMMARIZER_LAST_SEC_BYTES, default: 76800): maximum byte size of the assistant’s last section (75KB)
  • Max Body Pair Size (ASSISTANT_SUMMARIZER_MAX_BP_BYTES, default: 16384): maximum byte size of a single body pair in assistant context (16KB)
  • Max QA Sections (ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS, default: 7): maximum number of QA sections to preserve in assistant context
  • Max QA Size (ASSISTANT_SUMMARIZER_MAX_QA_BYTES, default: 76800): maximum byte size of the assistant’s QA sections (75KB)
  • Keep QA Sections (ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS, default: 3): number of recent QA sections to preserve without summarization
The assistant summarizer allows larger limits than the global settings, preserving more recent conversation history for better task continuity.

Environment Configuration

Add these settings to your .env file:
# Default values for global summarizer logic
SUMMARIZER_PRESERVE_LAST=true
SUMMARIZER_USE_QA=true
SUMMARIZER_SUM_MSG_HUMAN_IN_QA=false
SUMMARIZER_LAST_SEC_BYTES=51200
SUMMARIZER_MAX_BP_BYTES=16384
SUMMARIZER_MAX_QA_SECTIONS=10
SUMMARIZER_MAX_QA_BYTES=65536
SUMMARIZER_KEEP_QA_SECTIONS=1

# Default values for assistant summarizer logic
ASSISTANT_SUMMARIZER_PRESERVE_LAST=true
ASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800
ASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384
ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7
ASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800
ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3

Key Features

Size-Aware Processing

Tracks byte size of all content to make optimal retention decisions

Reasoning Preservation

Maintains reasoning signatures required by Gemini and Anthropic providers

Concurrent Processing

Uses goroutines for efficient parallel summarization of sections
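A minimal sketch of this parallel pattern with a sync.WaitGroup; summarizeSection here is a stand-in for a real LLM call, and the function name is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// summarizeAll runs one goroutine per section and waits for all of them.
// Each goroutine writes only its own index, so no mutex is needed.
func summarizeAll(sections []string, summarizeSection func(string) string) []string {
	out := make([]string, len(sections))
	var wg sync.WaitGroup
	for i, s := range sections {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			out[i] = summarizeSection(s)
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	got := summarizeAll([]string{"alpha", "beta"}, func(s string) string {
		return s[:1] // toy "summary": first byte of the section
	})
	fmt.Println(got) // → [a b]
}
```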

Idempotent Operation

Multiple consecutive calls do not modify already summarized content
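Idempotence can be implemented by tagging summarized content and skipping it on later passes; the marker string and helper below are purely illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

const summaryMarker = "[summary] " // illustrative marker, not PentAGI's actual format

// summarizeOnce sketches the idempotency guard: content that already carries
// the summary marker is returned unchanged, so repeated passes are no-ops.
func summarizeOnce(section string, summarize func(string) string) string {
	if strings.HasPrefix(section, summaryMarker) {
		return section
	}
	return summaryMarker + summarize(section)
}

func main() {
	sum := func(s string) string { return s[:4] }
	first := summarizeOnce("nmap found 14 open ports on the target host", sum)
	second := summarizeOnce(first, sum)
	fmt.Println(first == second) // → true: the second pass is a no-op
}
```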

Best Practices

  • For extended penetration testing sessions, increase SUMMARIZER_LAST_SEC_BYTES to 76800 (75KB) to preserve more recent context.
  • Never reduce SUMMARIZER_KEEP_QA_SECTIONS below 1, so that tool call context is maintained for multi-step operations.
  • Monitor your LLM provider’s token limits and adjust SUMMARIZER_MAX_QA_BYTES accordingly; most providers support 128K+ tokens.
  • Use the separate assistant summarizer settings for interactive workflows that require more context retention than automated agents.

Reasoning Signature Handling

When summarizing content that originally contained reasoning, the algorithm:
  1. Detects ToolCall Reasoning: checks whether messages contain reasoning in ToolCall parts
  2. Extracts TextContent Reasoning: preserves reasoning TextContent for providers like Kimi/Moonshot
  3. Adds Fake Signatures: adds a fake signature to the summarized ToolCall if reasoning was present
  4. Preserves the Reasoning Message: prepends the reasoning TextContent before the ToolCall
  5. Ensures Provider Compatibility: keeps the summarized chain compatible with all provider APIs
Provider-Specific Requirements:
  • Gemini: Requires thought_signature for function calls. Fake signatures satisfy API validation.
  • Kimi/Moonshot: Requires reasoning_content in TextContent before ToolCall when thinking is enabled.
  • Anthropic: extended thinking uses cryptographically signed thinking blocks, which are automatically removed from previous turns.
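The fake-signature step (3) above can be sketched as follows; the ToolCall type and its field names are illustrative assumptions, not PentAGI's actual API.

```go
package main

import "fmt"

// ToolCall is a stand-in for a provider function call in the chain.
type ToolCall struct {
	Name             string
	ThoughtSignature string
}

// ensureSignature attaches a placeholder signature to a summarized ToolCall
// when the original pair contained reasoning, so providers that validate
// thought signatures (e.g. Gemini) accept the rewritten history.
func ensureSignature(tc ToolCall, hadReasoning bool) ToolCall {
	if hadReasoning && tc.ThoughtSignature == "" {
		tc.ThoughtSignature = "fake-signature" // satisfies API validation only
	}
	return tc
}

func main() {
	tc := ensureSignature(ToolCall{Name: "nmap_scan"}, true)
	fmt.Println(tc.ThoughtSignature) // → fake-signature
}
```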

Performance Impact

  • Token Efficiency: Summarization reduces overall token count by 40-60% in long conversations
  • Latency: Adds 1-3 seconds per summarization call to the LLM
  • Memory: Minimal memory overhead with concurrent processing
  • Coherence: Maintains conversation quality with intelligent section preservation

Related Pages

Context Management: learn about managing LLM context windows and token limits
Performance Tuning: optimize PentAGI performance and resource management
