Overview
PentAGI implements a sophisticated memory system that enables agents to learn from past experiences, maintain context across conversations, and retrieve relevant information for decision-making. The system combines vector-based semantic search with graph-based relationship tracking.

Memory Architecture
The memory system is organized into three distinct layers:

Long-term Memory
- Vector Store: PostgreSQL with the pgvector extension storing semantic embeddings for similarity-based retrieval.
- Knowledge Base: Structured domain expertise including vulnerability databases, tool capabilities, and security techniques.
- Tools Knowledge: Historical patterns of tool usage, success rates, and optimal parameter configurations.

Working Memory
- Current Context: Active conversation state, recent messages, and immediate task information.
- Active Goals: Objectives being pursued in the current penetration testing session.
- System State: Available resources, loaded tools, and environmental constraints.

Episodic Memory
- Past Actions: Complete history of commands executed, searches performed, and analyses conducted.
- Action Results: Outputs, success/failure status, and outcome details from past operations.
- Success Patterns: Learned strategies and techniques that have proven effective in similar scenarios.

Vector Storage Implementation
PentAGI uses PostgreSQL with the pgvector extension for efficient semantic search.

Storage Schema
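The exact schema is internal to PentAGI; a minimal sketch of a pgvector-backed memory table, with hypothetical table and column names, might look like:

```python
# Hypothetical pgvector schema for memory entries; the table and column
# names are illustrative, not PentAGI's actual internal schema.
MEMORY_SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS memories (
    id         BIGSERIAL PRIMARY KEY,
    content    TEXT NOT NULL,           -- raw memory text
    embedding  vector(1536) NOT NULL,   -- dimension depends on the embedding model
    metadata   JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
```

The `metadata` column is what enables the filtered searches described below; the embedding dimension must match whichever embedding model is configured.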
Metadata Structure
Each memory entry includes rich metadata for filtering and context.

Memory Operations
Storing Memories
When agents perform actions, results are automatically stored as memory entries.

Retrieving Memories
Agents query memory using natural language questions that are converted to embedding vectors.

Similarity Threshold
The system uses a similarity threshold of 0.2 (configurable) to filter out irrelevant memories. Scores closer to 1.0 indicate higher similarity.

Result Limits
By default, the system returns the top 3 most similar memories to avoid overwhelming the agent with context.

Memory Search Patterns
Hierarchical Fallback
Memory searches follow a hierarchical pattern:

- Specific Search: Query memories from the current subtask
- Task-Level Search: If no results, expand to current task
- Flow-Level Search: If still no results, search entire flow
- Global Search: Optionally search across all flows (planned)
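The fallback pattern above, combined with the similarity threshold and result limit, can be sketched as follows. The function, field names, and scores are illustrative only; real similarity scores come from the vector store:

```python
# Sketch of hierarchical memory search with a similarity threshold and a
# result limit. The data layout is hypothetical, not PentAGI's actual API.
THRESHOLD = 0.2   # minimum similarity score (configurable)
LIMIT = 3         # top-N results returned (default)

def search(memories, scope_filters):
    """Return up to LIMIT memories above THRESHOLD, widening scope on misses."""
    for scope in scope_filters:  # subtask -> task -> flow
        hits = [m for m in memories
                if all(m["meta"].get(k) == v for k, v in scope.items())
                and m["score"] >= THRESHOLD]
        hits.sort(key=lambda m: m["score"], reverse=True)
        if hits:
            return hits[:LIMIT]
    return []

memories = [
    {"text": "nmap found port 22 open", "score": 0.81,
     "meta": {"flow": 1, "task": 4, "subtask": 9}},
    {"text": "hydra brute force failed", "score": 0.15,  # below threshold
     "meta": {"flow": 1, "task": 4, "subtask": 9}},
    {"text": "web app runs WordPress 6.1", "score": 0.55,
     "meta": {"flow": 1, "task": 4}},
]

# Nothing matches subtask 10, so the search falls back to the task level.
scopes = [{"flow": 1, "task": 4, "subtask": 10},
          {"flow": 1, "task": 4},
          {"flow": 1}]
results = search(memories, scopes)
```

Note how the below-threshold memory is dropped even though it matches the scope filter, and how results are returned in descending similarity order.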
Agent-Specific Filtering
Agents can filter memories by their agent type to retrieve role-specific experiences.

Embedding Models
PentAGI supports multiple embedding providers for generating vector representations.

Supported Providers
Embedding Configuration
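A configuration for the embedding layer ties the provider and model choice to the retrieval parameters described earlier. The keys and values below are hypothetical placeholders, not PentAGI's actual settings:

```python
# Illustrative embedding configuration; key names are hypothetical.
embedding_config = {
    "provider": "openai",                # or "ollama", "jina", ...
    "model": "text-embedding-3-small",   # balanced cost/quality
    "dimensions": 1536,                  # must match the vector column size
    "similarity_threshold": 0.2,         # minimum score for retrieval
    "result_limit": 3,                   # top-N memories returned
}
```

Changing the model usually changes the embedding dimension, which must stay consistent with the stored vectors.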
Context Management
As conversations grow longer, PentAGI implements intelligent context management to stay within model token limits.

Chain Summarization
The system automatically summarizes older messages while preserving critical information.

Summarization Configuration
- Global Settings
- Assistant Settings
What Gets Preserved
Always Kept:

- System messages and initial prompts
- Recent messages (configurable count)
- The last N QA pairs (configurable)
- Messages in the last section (configurable size)
- Tool call structures and identifiers
- Tool call and response pairs (structure preserved)
- Critical error messages

Summarized:

- Older human questions
- Previous assistant responses
- Tool outputs (while preserving key findings)
- Redundant information from early conversation
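The keep-vs-summarize split above can be sketched with a toy compressor; the message shape and the `KEEP_LAST` rule are purely illustrative, not PentAGI's actual summarization logic:

```python
# Toy chain summarization: keep system messages and the most recent
# messages verbatim, collapse everything older into one summary entry.
KEEP_LAST = 4  # recent messages kept verbatim (configurable)

def compress(messages):
    kept_head = [m for m in messages if m["role"] == "system"]
    body = [m for m in messages if m["role"] != "system"]
    old, recent = body[:-KEEP_LAST], body[-KEEP_LAST:]
    if not old:
        return kept_head + recent
    summary = {"role": "system",
               "content": "Summary of %d earlier messages" % len(old)}
    return kept_head + [summary] + recent

msgs = [{"role": "system", "content": "You are a pentest agent"}] + \
       [{"role": "user" if i % 2 == 0 else "assistant",
         "content": "msg %d" % i} for i in range(10)]
compressed = compress(msgs)
```

A real implementation would generate the summary with an LLM and keep tool call/response pairs structurally intact rather than concatenating counts.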
Memory Types
Observation Memories
Raw factual information gathered during operations.

Conclusion Memories
Higher-level insights derived from observations.

Success Pattern Memories
Recorded techniques that achieved objectives.

Integration with Knowledge Graph
While the vector store provides semantic search, the knowledge graph adds structured relationships. See Knowledge Graph for details on how these systems complement each other:

- Vector Store: “What memories are semantically similar to this query?”
- Knowledge Graph: “How are these entities related? What patterns connect them?”

Performance Considerations
Indexing Strategy
PentAGI uses IVFFlat indexing for approximate nearest neighbor search:

- Faster queries at the cost of slight accuracy loss
- Lists parameter tuned for typical pentest memory sizes
- Cosine similarity for normalized vectors
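An IVFFlat index over the embedding column might be declared like this; the table name, column name, and `lists` value are illustrative:

```python
# Hypothetical IVFFlat index definition for pgvector. The lists value
# trades recall for speed and should be tuned to the table size (a common
# rule of thumb is roughly rows / 1000).
CREATE_INDEX = """
CREATE INDEX IF NOT EXISTS memories_embedding_idx
ON memories
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
"""
```

`vector_cosine_ops` matches the cosine similarity used for normalized vectors; pgvector also offers L2 and inner-product operator classes.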
Query Optimization
- Metadata Filtering: Apply filters before vector search to reduce the candidate set.
- Result Limiting: The default limit of 3 results balances context richness with token usage.
- Threshold Tuning: The 0.2 similarity threshold filters out noise while retaining relevant memories.

Memory Cleanup
Long-running flows may accumulate large memory stores. Consider:

- Periodic archiving of old memories
- Consolidating similar memories through clustering
- Removing low-value observations after success patterns are extracted
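A periodic archiving job could be sketched as follows; the table names and the 30-day retention window are hypothetical:

```python
# Hypothetical archiving of stale memories into a companion table using
# PostgreSQL's DELETE ... RETURNING inside a CTE.
ARCHIVE_SQL = """
WITH moved AS (
    DELETE FROM memories
    WHERE created_at < now() - INTERVAL '30 days'
    RETURNING *
)
INSERT INTO memories_archive SELECT * FROM moved;
"""
```

Running this in a transaction keeps the move atomic: memories either land in the archive or stay in the live table.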
Best Practices
Choose appropriate embedding models
- OpenAI text-embedding-3-large: Best accuracy, higher cost
- OpenAI text-embedding-3-small: Balanced performance
- Ollama nomic-embed-text: Free, local, good for development
- Jina v2: Optimized for long documents
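The trade-offs above can be captured in a simple lookup table; the model identifiers are real, but the mapping itself is an illustrative sketch, not a PentAGI configuration:

```python
# Illustrative accuracy/cost trade-off table for embedding model selection.
EMBEDDING_MODELS = {
    "best_accuracy": ("openai", "text-embedding-3-large"),
    "balanced":      ("openai", "text-embedding-3-small"),
    "local_dev":     ("ollama", "nomic-embed-text"),
    "long_docs":     ("jina",   "jina-embeddings-v2-base-en"),
}

provider, model = EMBEDDING_MODELS["local_dev"]
```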
Structure memory content effectively
- Include context in memory text (target, tool used, outcome)
- Use descriptive metadata for filtering
- Store atomic facts rather than large text blocks
- Maintain consistent formatting for similar memory types
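A small formatting helper can enforce the conventions above: atomic facts, context embedded in the text, and metadata for filtering. The field names are illustrative:

```python
# Build an atomic, consistently formatted memory entry with searchable
# text and filterable metadata. Field names are hypothetical.
def make_memory(target, tool, outcome, agent_type, memory_type="observation"):
    text = "target=%s tool=%s outcome=%s" % (target, tool, outcome)
    meta = {"agent_type": agent_type, "memory_type": memory_type}
    return {"text": text, "meta": meta}

entry = make_memory("10.0.0.5", "nmap", "ports 22,80 open", "pentester")
```

Keeping one fact per entry makes similarity scores sharper than embedding large mixed text blocks.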
Tune search parameters
- Lower threshold (0.1-0.2) for broader recall
- Higher threshold (0.3-0.5) for precision
- Increase result limit for research agents
- Decrease result limit for executor agents
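The recall/precision effect of the threshold choice can be shown with a toy list of similarity scores (the scores are made up for illustration):

```python
# Toy demonstration: a lower threshold recalls more memories, a higher
# one keeps only close matches.
scores = [0.12, 0.18, 0.25, 0.41, 0.63]

broad   = [s for s in scores if s >= 0.1]   # broader recall
precise = [s for s in scores if s >= 0.4]   # precision-oriented
```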
Monitor memory usage
- Track vector store size growth
- Monitor query performance over time
- Review memory retrieval relevance
- Adjust summarization settings based on context window usage
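Vector store growth can be tracked with standard PostgreSQL introspection queries; the table name is hypothetical:

```python
# Hypothetical monitoring queries for the memory table, using standard
# PostgreSQL functions.
SIZE_SQL  = "SELECT pg_size_pretty(pg_total_relation_size('memories'));"
COUNT_SQL = "SELECT count(*) FROM memories;"
```

Sampling these periodically gives a growth trend that can inform archiving and summarization settings.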
Related Concepts
- Architecture - System-wide memory integration
- Agent System - How agents use memory for decision-making
- Knowledge Graph - Complementary structured knowledge storage