Overview
The chain summarization system manages conversation context growth by selectively summarizing older messages while maintaining conversation coherence. This is critical for preventing token limits from being exceeded during long penetration testing sessions.

How It Works
The algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types, including tool calls and their responses. All summarization operations maintain critical conversation flow while reducing context size.

Summarization Strategies
PentAGI uses three types of summarization:

1. Section Summarization
Ensures that every section except the last N consists of a header and a single body pair. This reduces the number of message pairs while preserving critical context.

2. Last Section Rotation
Manages the size of the last (active) section by summarizing the oldest pairs when size limits are exceeded. The most recent body pair is always preserved to maintain the reasoning signatures required by providers such as Gemini and Anthropic.

3. QA Pair Summarization
Creates a summary section containing essential question-answer exchanges when enabled. This focuses on preserving the core interaction patterns while reducing token usage.

Configuration Options
Global Summarizer Configuration
| Parameter | Environment Variable | Default | Description |
|---|---|---|---|
| Preserve Last | SUMMARIZER_PRESERVE_LAST | true | Whether to keep all messages in the last section intact |
| Use QA Pairs | SUMMARIZER_USE_QA | true | Whether to use QA pair summarization strategy |
| Summarize Human in QA | SUMMARIZER_SUM_MSG_HUMAN_IN_QA | false | Whether to summarize human messages in QA pairs |
| Last Section Size | SUMMARIZER_LAST_SEC_BYTES | 51200 | Maximum byte size for last section (50KB) |
| Max Body Pair Size | SUMMARIZER_MAX_BP_BYTES | 16384 | Maximum byte size for a single body pair (16KB) |
| Max QA Sections | SUMMARIZER_MAX_QA_SECTIONS | 10 | Maximum QA pair sections to preserve |
| Max QA Size | SUMMARIZER_MAX_QA_BYTES | 65536 | Maximum byte size for QA pair sections (64KB) |
| Keep QA Sections | SUMMARIZER_KEEP_QA_SECTIONS | 1 | Number of recent QA sections to keep without summarization |
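To illustrate how the last-section limits above interact, here is a sketch of a rotation pass. All type and function names (`BodyPair`, `rotateLastSection`) are assumptions for illustration, not PentAGI's actual internals:

```go
package main

import "fmt"

// BodyPair is one request/response pair (e.g. a tool call and its
// output) with its byte size tracked for size-aware decisions.
type BodyPair struct {
	Content   string
	SizeBytes int
}

const (
	lastSectionBytes = 51200 // SUMMARIZER_LAST_SEC_BYTES default (50KB)
	maxBodyPairBytes = 16384 // SUMMARIZER_MAX_BP_BYTES default (16KB)
)

// rotateLastSection summarizes the oldest oversized pairs until the
// section fits its byte budget. The most recent pair is never touched,
// so provider-required reasoning signatures survive.
func rotateLastSection(pairs []BodyPair, summarize func(BodyPair) BodyPair) []BodyPair {
	total := 0
	for _, p := range pairs {
		total += p.SizeBytes
	}
	out := append([]BodyPair(nil), pairs...)
	// Walk oldest-first, stopping before the final (preserved) pair.
	for i := 0; i < len(out)-1 && total > lastSectionBytes; i++ {
		if out[i].SizeBytes <= maxBodyPairBytes {
			continue // already small enough to keep verbatim
		}
		s := summarize(out[i])
		total += s.SizeBytes - out[i].SizeBytes
		out[i] = s
	}
	return out
}

func main() {
	summarize := func(p BodyPair) BodyPair {
		return BodyPair{Content: "[summary]", SizeBytes: 512}
	}
	pairs := []BodyPair{
		{"old nmap output", 40000},
		{"older gobuster output", 30000},
		{"newest exploit attempt", 8000},
	}
	rotated := rotateLastSection(pairs, summarize)
	fmt.Println(rotated[0].Content, rotated[2].SizeBytes)
	// → [summary] 8000
}
```

Note that the newest pair keeps its original 8000 bytes: only older pairs that exceed the per-pair budget are condensed, which matches the "most recent body pair is always preserved" rule above.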
Assistant Summarizer Configuration
Assistant instances use customized settings for enhanced context retention:

| Parameter | Environment Variable | Default | Description |
|---|---|---|---|
| Preserve Last | ASSISTANT_SUMMARIZER_PRESERVE_LAST | true | Whether to preserve all messages in the assistant’s last section |
| Last Section Size | ASSISTANT_SUMMARIZER_LAST_SEC_BYTES | 76800 | Maximum byte size for assistant’s last section (75KB) |
| Max Body Pair Size | ASSISTANT_SUMMARIZER_MAX_BP_BYTES | 16384 | Maximum byte size for a single body pair in assistant context (16KB) |
| Max QA Sections | ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS | 7 | Maximum QA sections to preserve in assistant context |
| Max QA Size | ASSISTANT_SUMMARIZER_MAX_QA_BYTES | 76800 | Maximum byte size for assistant’s QA sections (75KB) |
| Keep QA Sections | ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS | 3 | Number of recent QA sections to preserve without summarization |
The assistant summarizer provides more memory for context retention compared to global settings, preserving more recent conversation history for better task continuity.
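The QA-section caps in both tables can be read as a trimming pass: keep the newest sections verbatim, summarize older ones, and drop the oldest past the hard cap. This sketch is an assumption about the mechanics, not PentAGI's actual implementation:

```go
package main

import "fmt"

const (
	maxQASections  = 10 // SUMMARIZER_MAX_QA_SECTIONS default
	keepQASections = 1  // SUMMARIZER_KEEP_QA_SECTIONS default
)

// trimQASections keeps the newest keepQASections untouched, summarizes
// older ones, and drops the oldest once maxQASections is exceeded.
func trimQASections(sections []string) []string {
	// Drop the oldest sections beyond the hard cap.
	if len(sections) > maxQASections {
		sections = sections[len(sections)-maxQASections:]
	}
	out := make([]string, len(sections))
	for i, s := range sections {
		if i < len(sections)-keepQASections {
			out[i] = "[summary] " + s // condensed QA exchange
		} else {
			out[i] = s // recent sections kept verbatim
		}
	}
	return out
}

func main() {
	sections := make([]string, 12)
	for i := range sections {
		sections[i] = fmt.Sprintf("qa-%d", i)
	}
	trimmed := trimQASections(sections)
	fmt.Println(len(trimmed), trimmed[len(trimmed)-1])
	// → 10 qa-11
}
```

Under the assistant defaults (7 sections, keep 3), the same pass would retain three recent QA sections verbatim instead of one, which is why assistants get better task continuity.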
Environment Configuration
Add these settings to your .env file:
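For example, using the documented defaults from the tables above:

```shell
# Global summarizer settings (documented defaults)
SUMMARIZER_PRESERVE_LAST=true
SUMMARIZER_USE_QA=true
SUMMARIZER_SUM_MSG_HUMAN_IN_QA=false
SUMMARIZER_LAST_SEC_BYTES=51200
SUMMARIZER_MAX_BP_BYTES=16384
SUMMARIZER_MAX_QA_SECTIONS=10
SUMMARIZER_MAX_QA_BYTES=65536
SUMMARIZER_KEEP_QA_SECTIONS=1

# Assistant summarizer settings (documented defaults)
ASSISTANT_SUMMARIZER_PRESERVE_LAST=true
ASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800
ASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384
ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7
ASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800
ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3
```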
Key Features
Size-Aware Processing
Tracks byte size of all content to make optimal retention decisions
Reasoning Preservation
Maintains reasoning signatures required by Gemini and Anthropic providers
Concurrent Processing
Uses goroutines for efficient parallel summarization of sections
Idempotent Operation
Multiple consecutive calls do not modify already summarized content
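Idempotence can be achieved by marking already-summarized content so later passes skip it. A minimal sketch of that property (the `Pair` type and `Summarized` flag are illustrative assumptions):

```go
package main

import "fmt"

// Pair is a body pair; Summarized marks content that has already been
// condensed so repeated passes leave it untouched.
type Pair struct {
	Content    string
	Summarized bool
}

// summarizePass condenses unsummarized pairs and is idempotent:
// running it again over its own output changes nothing.
func summarizePass(pairs []Pair) []Pair {
	out := make([]Pair, len(pairs))
	for i, p := range pairs {
		if p.Summarized {
			out[i] = p // already condensed: leave as-is
			continue
		}
		out[i] = Pair{Content: "[summary]", Summarized: true}
	}
	return out
}

func main() {
	once := summarizePass([]Pair{{"long tool output", false}})
	twice := summarizePass(once)
	fmt.Println(once[0] == twice[0])
	// → true
}
```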
Best Practices
Adjust for Long Sessions
For extended penetration testing sessions, increase SUMMARIZER_LAST_SEC_BYTES to 76800 (75KB) to preserve more recent context.
Preserve Tool Calls
Never reduce SUMMARIZER_KEEP_QA_SECTIONS below 1, to ensure tool call context is maintained for multi-step operations.
Balance Token Usage
Monitor your LLM provider's token limits and adjust SUMMARIZER_MAX_QA_BYTES accordingly. Most providers support 128K+ tokens.
Assistant Configuration
Use the separate assistant summarizer settings for interactive workflows that require more context retention than automated agents.
Reasoning Signature Handling
When summarizing content that originally contained reasoning, the algorithm:

- Detects ToolCall Reasoning: Checks if messages contain reasoning in ToolCall parts
- Extracts TextContent Reasoning: Preserves reasoning TextContent for providers like Kimi/Moonshot
- Adds Fake Signatures: Adds a fake signature to summarized ToolCall if reasoning was present
- Preserves Reasoning Message: Prepends reasoning TextContent before ToolCall
- Provider Compatibility: Ensures summarized chain remains compatible with all provider APIs
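The steps above can be sketched as follows. The `Part` type, field names, and the literal signature value are illustrative assumptions, not PentAGI's actual types or the providers' wire formats:

```go
package main

import "fmt"

// Part is a simplified message part; real provider SDKs use richer types.
type Part struct {
	Kind      string // "text", "reasoning", or "tool_call"
	Text      string
	Signature string // providers like Anthropic/Gemini validate this on reasoning
}

// summarizeWithSignatures condenses tool-call parts while keeping the
// chain acceptable to providers that validate reasoning signatures.
func summarizeWithSignatures(parts []Part) []Part {
	hadReasoning := false
	var reasoningText string
	for _, p := range parts {
		if p.Kind == "reasoning" { // detect reasoning in the original parts
			hadReasoning = true
			reasoningText = p.Text
		}
	}
	out := []Part{}
	if hadReasoning {
		// Prepend the preserved reasoning text (needed by providers such
		// as Kimi/Moonshot that expect reasoning as TextContent).
		out = append(out, Part{Kind: "reasoning", Text: reasoningText, Signature: "fake-signature"})
	}
	// Replace the original tool calls with one summarized call; the fake
	// signature keeps signature-validating providers from rejecting it.
	out = append(out, Part{Kind: "tool_call", Text: "[summarized tool activity]", Signature: "fake-signature"})
	return out
}

func main() {
	in := []Part{
		{Kind: "reasoning", Text: "need to enumerate ports first", Signature: "sig-abc"},
		{Kind: "tool_call", Text: "nmap -p- 10.0.0.5"},
	}
	out := summarizeWithSignatures(in)
	fmt.Println(len(out), out[0].Kind)
	// → 2 reasoning
}
```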
Performance Impact
- Token Efficiency: Summarization reduces overall token count by 40-60% in long conversations
- Latency: Adds 1-3 seconds for summarization calls to LLM
- Memory: Minimal memory overhead with concurrent processing
- Coherence: Maintains conversation quality with intelligent section preservation
Related Resources
Context Management
Learn about managing LLM context windows and token limits
Performance Tuning
Optimize PentAGI performance and resource management