Overview
PentAGI allows you to create custom AI assistants tailored to specific penetration testing scenarios. Each assistant can be configured with different LLM models, agent behaviors, and specialized capabilities, giving you fine-grained control over AI behavior, model selection, and agent delegation.
Assistant Architecture
PentAGI uses a multi-agent system where different agents handle different aspects of penetration testing.

Creating Your First Assistant
Configure basic settings
Enter the assistant details:
- Name: Descriptive name (e.g., “Web App Specialist”)
- Description: Purpose and capabilities
- Use Agents: Toggle to enable/disable agent delegation
Select LLM provider
Choose from configured providers:
- OpenAI (GPT-4.1, o-series)
- Anthropic (Claude 4, Claude 3.7)
- Google Gemini (2.5 series)
- AWS Bedrock (multi-provider)
- Ollama (local models)
- Custom (OpenAI-compatible APIs)
Agent Delegation
Agent delegation allows the primary assistant to distribute work among specialized sub-agents.

When to Use Agents
Enable Agents
Best for:
- Complex, multi-step penetration tests
- Tasks requiring different specialized skills
- Long-running engagements
- Scenarios needing research + execution

Benefits:
- Better task decomposition
- Specialized expertise per task
- Parallel execution capability
- Improved context management

Disable Agents
Best for quick, focused vulnerability checks where a single model can handle the task without delegation overhead.
The default behavior is controlled by the ASSISTANT_USE_AGENTS environment variable, but it can be toggled per-assistant in the UI.

Provider-Specific Configuration
Different LLM providers have unique capabilities and configuration options.

Using Provider Config Files
For advanced control, create custom provider configuration files in YAML format.

Configuring Provider Paths
Set provider configuration paths in environment variables:

- Custom OpenAI
- Ollama Local
- OpenRouter
.env
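As an illustration, a .env fragment wiring these pieces together might look like the sketch below. Every variable name here except ASSISTANT_USE_AGENTS is an assumption about PentAGI's configuration surface, not a documented key; check your installation's .env template for the real names.

```
# Enable agent delegation by default (variable documented above; value format assumed)
ASSISTANT_USE_AGENTS=true

# Hypothetical settings for a custom OpenAI-compatible provider
LLM_SERVER_URL=https://api.example.com/v1
LLM_SERVER_KEY=sk-...
LLM_SERVER_MODEL=gpt-4.1

# Hypothetical path to a YAML provider configuration file
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom-provider.yml
```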
Agent Types and Roles
Understand the different agent types and their specializations:

Simple Agent
Use cases and configuration
Purpose: Fast, lightweight tasks requiring minimal reasoning

Typical tasks:
- JSON parsing and formatting
- Quick data lookups
- Simple transformations
- Status checks
Recommended models:
- GPT-4.1-mini (OpenAI)
- Claude 3.5 Haiku (Anthropic)
- Gemini 2.5 Flash (Google)
- Llama 3.1 8B (Ollama)
Configuration tips:
- Keep max_tokens low (2000-3000)
- Use moderate temperature (0.5-0.7)
- Prioritize speed over reasoning depth
Primary Agent
Use cases and configuration
Purpose: Main orchestration and decision-making

Typical tasks:
- Task planning and decomposition
- Agent delegation decisions
- Result synthesis
- Strategic thinking
Recommended models:
- o3-mini, o4-mini (OpenAI reasoning)
- Claude Sonnet 4 (Anthropic)
- Gemini 2.5 Pro Thinking (Google)
- Qwen3 32B (Ollama)
Configuration tips:
- Enable reasoning for complex decisions
- Use lower temperature (0.2-0.3) for consistency
- Allow moderate max_tokens (4000-6000)
Assistant Agent
Use cases and configuration
Purpose: Specialized sub-agent for delegated tasks

Typical tasks:
- Focused research
- Specific vulnerability testing
- Tool execution planning
- Result analysis
Recommended models:
- o3-mini with medium reasoning (OpenAI)
- Claude 3.7 Extended Thinking (Anthropic)
- Gemini 3 Flash Preview (Google)
- QwQ 32B (Ollama reasoning)
Configuration tips:
- Higher max_tokens (6000-8000) for detailed work
- Medium reasoning effort for balanced performance
- Adjust temperature based on task creativity needs
Researcher Agent
Use cases and configuration
Purpose: Information gathering and reconnaissance

Typical tasks:
- Target enumeration
- Technology stack identification
- Vulnerability research
- OSINT gathering
Recommended models:
- GPT-4.1-mini (OpenAI)
- Claude Haiku 4.5 (Anthropic)
- Gemini 2.5 Flash (Google)
Configuration tips:
- Higher temperature (0.7-0.8) for exploration
- Moderate max_tokens (4000)
- Enable web search capabilities
Developer/Coder Agent
Use cases and configuration
Purpose: Exploit development and payload creation

Typical tasks:
- Writing exploit code
- Creating custom payloads
- Tool script generation
- Bypass technique development
Recommended models:
- GPT-4.1 (OpenAI)
- Claude Sonnet 4 (Anthropic)
- Gemini 2.5 Pro (Google)
- Qwen3 32B (Ollama)
Configuration tips:
- Low temperature (0.2-0.3) for precision
- Low top_p (0.1-0.3) for deterministic output
- Higher max_tokens (6000-8000) for complete code
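Applied to a provider config file, the tips above for a coder agent could look like the following sketch. The key names (agents, coder, temperature, top_p, max_tokens) are assumptions about the YAML schema rather than documented fields:

```yaml
# Hypothetical provider config fragment for a deterministic coder agent
agents:
  coder:
    model: gpt-4.1      # or claude-sonnet-4 / gemini-2.5-pro / qwen3:32b
    temperature: 0.2    # low temperature for precise code generation
    top_p: 0.2          # narrow sampling for deterministic output
    max_tokens: 8000    # room for complete exploit code
```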
Pentester Agent
Use cases and configuration
Purpose: Active penetration testing and exploitation

Typical tasks:
- Running security tools
- Executing exploits
- Vulnerability validation
- Post-exploitation activities
Recommended models:
- o3-mini with low reasoning (OpenAI)
- Claude Sonnet 4 (Anthropic)
- Gemini 2.5 Flash Thinking (Google)
Configuration tips:
- Moderate max_tokens (4000)
- Low reasoning effort for faster execution
- Balance between speed and accuracy
Advanced Configuration Examples
Multi-Model Strategy
Use different models for different agents to optimize cost and performance.

Creating Extended Context Ollama Models
PentAGI requires models with larger context windows (110K tokens) for complex penetration testing scenarios.

Qwen3 32B with Extended Context
QwQ 32B Reasoning Model
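Ollama models ship with a default context window well below the 110K-token requirement, so both models above typically need a custom Modelfile that raises num_ctx (a standard Ollama Modelfile parameter). A sketch; the model tags and the chosen tag name are assumptions about your local setup:

```
# Modelfile: extend the qwen3 context window for PentAGI
FROM qwen3:32b
PARAMETER num_ctx 110000
```

Build it with `ollama create qwen3-110k -f Modelfile` and point the agent at the qwen3-110k tag. The same pattern applies to the QwQ reasoning model (FROM qwq:32b).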
Assistant Specialization Examples
Web Application Specialist
- Use Agents: Enabled
- Primary Model: Claude Sonnet 4 or o3-mini
- Tools Focus: sqlmap, commix, nikto, burpsuite
Network Infrastructure Specialist
- Use Agents: Enabled
- Primary Model: o3-mini or Gemini 2.5 Pro
- Tools Focus: nmap, metasploit, masscan, enum4linux
API Security Specialist
- Use Agents: Disabled (focused tasks)
- Primary Model: GPT-4.1 or Claude Haiku 4.5
- Tools Focus: curl, jwt_tool, graphql-playground
Testing Assistant Configuration
Verify your assistant works correctly:

Monitor agent behavior
Observe:
- Whether agents are delegated (if enabled)
- Tool selection and execution
- Response quality and accuracy
- Token usage and cost
Best Practices
Model selection
- Use reasoning models (o3, Claude 3.7, QwQ) for complex strategic tasks
- Use fast models (GPT-4.1-mini, Haiku, Flash) for simple operations
- Balance cost vs performance based on testing requirements
- Consider local models (Ollama) for privacy-sensitive engagements
Agent delegation
- Enable agents for multi-step, complex engagements
- Disable agents for quick, focused vulnerability checks
- Use specialized agents to leverage different model strengths
- Monitor token usage to optimize delegation strategy
Temperature tuning
- Low (0.2-0.3): Code generation, exploit development, precise tasks
- Medium (0.5-0.7): General pentesting, balanced exploration
- High (0.7-0.9): Research, creative bypass techniques, OSINT
Context management
- Use provider configs to set per-agent token limits
- Leverage summarization for long engagements
- Enable Graphiti knowledge graph for semantic memory
- Monitor memory usage in long-running tests
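For example, per-agent token limits of the kind mentioned above might be expressed in a provider config file like the sketch below; the agent names and key layout are assumptions for illustration, not a documented schema:

```yaml
# Hypothetical per-agent token limits in a provider config file
agents:
  simple:
    max_tokens: 2000       # lightweight tasks: parsing, lookups
  primary_agent:
    max_tokens: 6000       # orchestration and planning
  researcher:
    max_tokens: 4000       # enumeration and OSINT
```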
Next Steps
- Advanced Techniques: Learn advanced pentesting workflows
- Best Practices: Security and ethical guidelines
- First Pentest: Put your assistant to work
- Provider Configuration: Deep dive into provider settings