Overview
The TargetedPentestAgent is a pentest-focused specialization of the OffensiveSecurityAgent. It performs targeted security assessments against specific objectives, documents vulnerabilities with proof-of-concept exploits, and provides detailed remediation guidance.
Unlike reconnaissance agents that perform broad discovery, the TargetedPentestAgent focuses deeply on testing specific endpoints or objectives that have already been identified.
Key Features
- Objective-Driven Testing: Tests specific objectives (e.g., “Test for SQL injection on /api/users”)
- Blackbox Methodology: No access to source code — tests purely through external interaction
- Automatic PoC Generation: Creates proof-of-concept scripts for confirmed vulnerabilities
- Findings Deduplication: Integrates with FindingsRegistry to prevent duplicate reporting
- Session Management: Supports authenticated testing with session cookies/headers
- Structured Results: Returns typed results with all findings and file paths
Constructor
new TargetedPentestAgent(opts: PentestAgentInput)
opts (PentestAgentInput, required): Configuration object for the pentest agent. Its fields include:
- model: AI model identifier (e.g., "claude-sonnet-4-20250514")
- session: Session providing paths for findings, PoCs, logs, etc.
- objectives: One or more testing objectives (e.g., “Test for SQL injection on /api/users”)
- Optional per-provider API key overrides
- onStepFinish (StreamTextOnStepFinishCallback<ToolSet>): Callback fired after each agent step
- AbortSignal to cancel the agent mid-run
- sandbox: When set, tools execute inside this sandbox instead of locally
- findingsRegistry: Shared findings registry for cross-agent deduplication
- Callbacks for stream events
- In-memory credential store for secret-free agent prompts
- stopWhen: Override the default stop condition
Result Type
The consume() method returns a PentestResult:
- findings: All findings discovered during the run
- findingsPath: Absolute path to the session’s findings directory
- pocsPath: Absolute path to the session’s PoC scripts directory
Finding Object
Each entry in findings includes:
- title: Short descriptive title for the vulnerability
- severity ('CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW'): Severity level of the vulnerability
- Detailed description of the vulnerability
- Explanation of the potential impact if exploited
- Evidence demonstrating the vulnerability (request/response, screenshots, etc.)
- endpoint: The affected endpoint or URL
- pocPath: Path to the proof-of-concept script
- Step-by-step remediation guidance
- External references (OWASP, CVEs, documentation)
- Description of why this vulnerability was documented
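Because severity is a fixed four-value union, returned findings are straightforward to triage. The following is a minimal sketch, not part of the library: FindingSummary is a hypothetical local type that contains only the fields used in this page's examples (title, severity, endpoint, pocPath), and sortBySeverity and countBySeverity are hypothetical helpers.

```typescript
// Hypothetical local types: a subset of the finding fields used in this page's examples.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface FindingSummary {
  title: string;
  severity: Severity;
  endpoint: string;
  pocPath: string;
}

// Lower rank sorts first, so CRITICAL findings lead the list.
const SEVERITY_RANK: Record<Severity, number> = {
  CRITICAL: 0,
  HIGH: 1,
  MEDIUM: 2,
  LOW: 3,
};

// Returns a new array ordered from most to least severe; the input is not mutated.
function sortBySeverity(findings: readonly FindingSummary[]): FindingSummary[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}

// Counts findings per severity level.
function countBySeverity(
  findings: readonly FindingSummary[],
): Record<Severity, number> {
  const counts: Record<Severity, number> = { CRITICAL: 0, HIGH: 0, MEDIUM: 0, LOW: 0 };
  for (const finding of findings) counts[finding.severity] += 1;
  return counts;
}
```

A caller could pass the findings array returned by consume() to these helpers, provided the real finding objects expose the same four fields.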
Tools
The TargetedPentestAgent uses the following tools:
execute_command - Run exploit scripts and security testing tools
http_request - Send targeted HTTP requests to test endpoints
document_vulnerability - Document confirmed vulnerabilities
create_poc - Create proof-of-concept exploit scripts
response - Submit final testing summary (auto-injected)
browser_navigate - Navigate to pages for evidence collection
browser_snapshot - Capture DOM snapshots
browser_screenshot - Take screenshots for evidence
browser_click - Interact with page elements for interactive testing
browser_fill - Fill form fields for form-based attacks (XSS, injection)
email_list_inboxes - List available email inboxes (if configured)
email_list_messages - List messages (for testing email-based flows)
email_search_messages - Search for specific messages
email_get_message - Retrieve full message content
Usage Examples
Basic Vulnerability Testing
import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
const session = await createSession({
name: "SQL Injection Test",
targets: ["https://api.example.com"],
});
const agent = new TargetedPentestAgent({
target: "https://api.example.com",
objectives: [
"Test for SQL injection on /api/users endpoint",
"Test for authentication bypass",
],
model: "claude-sonnet-4-20250514",
session,
});
const { findings, findingsPath, pocsPath } = await agent.consume({
onTextDelta: (delta) => {
process.stdout.write(delta.text);
},
onToolCall: (delta) => {
console.log(`→ ${delta.toolName}`);
},
});
console.log(`\nFound ${findings.length} vulnerabilities`);
console.log(`Findings: ${findingsPath}`);
console.log(`POCs: ${pocsPath}`);
// Print findings summary
for (const finding of findings) {
console.log(`\n[${finding.severity}] ${finding.title}`);
console.log(`Endpoint: ${finding.endpoint}`);
console.log(`PoC: ${finding.pocPath}`);
}
Authenticated Testing
import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
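import { mkdirSync, writeFileSync } from "fs";
import { dirname, join } from "path";
const session = await createSession({
name: "Authenticated API Test",
targets: ["https://api.example.com"],
});
// Create auth-data.json with authenticated session
const authDataPath = join(session.rootPath, "auth", "auth-data.json");
// Make sure the auth directory exists before writing (a no-op if the session already created it)
mkdirSync(dirname(authDataPath), { recursive: true });
writeFileSync(authDataPath, JSON.stringify({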
authenticated: true,
cookies: "session_id=abc123; token=xyz789",
headers: {
"Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"X-API-Key": "secret_key_123",
},
strategy: "bearer_token",
}));
const agent = new TargetedPentestAgent({
target: "https://api.example.com",
objectives: [
"Test for IDOR on /api/users/{id} endpoint",
"Test for privilege escalation",
],
model: "claude-sonnet-4-20250514",
session,
});
const { findings } = await agent.consume({
onTextDelta: (delta) => process.stdout.write(delta.text),
});
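To make the auth-data.json shape concrete, here is a minimal sketch of how such a file could be turned into request headers. It is an illustration under assumptions: the AuthData interface mirrors the fields written in the example above, buildAuthHeaders is a hypothetical helper, and how the agent itself consumes the file is not described on this page.

```typescript
// Shape mirrors the auth-data.json object written in the example above.
interface AuthData {
  authenticated: boolean;
  cookies?: string;
  headers?: Record<string, string>;
  strategy?: string;
}

// Builds the header map an HTTP client would send for an authenticated request.
// Assumption: cookies travel in a single Cookie header alongside any custom headers.
function buildAuthHeaders(auth: AuthData): Record<string, string> {
  // Unauthenticated sessions contribute no headers.
  if (!auth.authenticated) return {};
  const headers: Record<string, string> = { ...(auth.headers ?? {}) };
  if (auth.cookies) headers["Cookie"] = auth.cookies;
  return headers;
}
```

Keeping cookies and headers in one serializable object means the same credentials can be replayed by any tool that issues HTTP requests.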
With Findings Registry (Deduplication)
import { TargetedPentestAgent } from "@pensar/apex";
import { FindingsRegistry } from "@pensar/apex/findings";
import { createSession } from "@pensar/apex/session";
const session = await createSession({
name: "Multi-Agent Test",
targets: ["https://example.com"],
});
const registry = new FindingsRegistry();
// First agent
const agent1 = new TargetedPentestAgent({
target: "https://example.com",
objectives: ["Test /api/login for SQL injection"],
model: "claude-sonnet-4-20250514",
session,
findingsRegistry: registry,
});
const result1 = await agent1.consume();
console.log(`Agent 1 found ${result1.findings.length} vulnerabilities`);
// Second agent with same registry - won't duplicate findings
const agent2 = new TargetedPentestAgent({
target: "https://example.com",
objectives: ["Test /api/users for injection vulnerabilities"],
model: "claude-sonnet-4-20250514",
session,
findingsRegistry: registry,
});
const result2 = await agent2.consume();
console.log(`Agent 2 found ${result2.findings.length} new vulnerabilities`);
console.log(`Total unique findings: ${registry.size}`);
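The idea behind a shared registry can be sketched in a few lines. This is not the @pensar/apex FindingsRegistry: SketchFindingsRegistry, ReportedFinding, and the endpoint-plus-title dedup key are all assumptions made for illustration, and the real registry's matching rules may differ.

```typescript
// Hypothetical stand-in for a shared registry; not the @pensar/apex FindingsRegistry.
interface ReportedFinding {
  title: string;
  endpoint: string;
}

class SketchFindingsRegistry {
  private readonly seen = new Map<string, ReportedFinding>();

  // Assumed dedup key: trimmed, case-insensitive endpoint and title.
  private static keyOf(finding: ReportedFinding): string {
    const endpoint = finding.endpoint.trim().toLowerCase();
    const title = finding.title.trim().toLowerCase();
    return `${endpoint}::${title}`;
  }

  // Returns true if the finding was new and recorded, false if it was a duplicate.
  register(finding: ReportedFinding): boolean {
    const key = SketchFindingsRegistry.keyOf(finding);
    if (this.seen.has(key)) return false;
    this.seen.set(key, finding);
    return true;
  }

  // Number of unique findings recorded so far.
  get size(): number {
    return this.seen.size;
  }
}
```

Because every agent calls register on the same instance, a finding already recorded by the first agent is rejected when the second agent reports it again.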
With Sandbox Execution
import { TargetedPentestAgent } from "@pensar/apex";
import { createSandbox } from "@pensar/apex/sandbox";
import { createSession } from "@pensar/apex/session";
const session = await createSession({
name: "Sandboxed Test",
targets: ["https://example.com"],
});
const sandbox = await createSandbox({
type: "docker",
image: "pentest-tools:latest",
});
const agent = new TargetedPentestAgent({
target: "https://example.com",
objectives: ["Test for RCE on /api/upload"],
model: "claude-sonnet-4-20250514",
session,
sandbox, // All execute_command and create_poc calls run in sandbox
});
const { findings } = await agent.consume({
onTextDelta: (delta) => process.stdout.write(delta.text),
});
await sandbox.cleanup();
Exfil Mode (CTF/Red Team)
import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
const session = await createSession({
name: "CTF Challenge",
targets: ["https://ctf.example.com"],
config: {
exfilMode: true, // Enables data extraction focus
outcomeGuidance: "Find and extract the flag in format FLAG{...}",
},
});
const agent = new TargetedPentestAgent({
target: "https://ctf.example.com",
objectives: [
"Find SSRF vulnerability",
"Pivot through SSRF to access internal services",
"Extract the flag",
],
model: "claude-sonnet-4-20250514",
session,
});
const { findings } = await agent.consume({
onTextDelta: (delta) => process.stdout.write(delta.text),
});
Custom Stop Condition
import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
import { stepCountIs } from "ai";
const session = await createSession({
name: "Limited Test",
targets: ["https://example.com"],
});
const agent = new TargetedPentestAgent({
target: "https://example.com",
objectives: ["Quick XSS scan"],
model: "claude-sonnet-4-20250514",
session,
stopWhen: stepCountIs(50), // Stop after 50 steps regardless
});
const { findings } = await agent.consume();
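Conceptually, a stop condition is a predicate evaluated against the steps taken so far. The sketch below models that idea with local stand-in types: SketchStep, SketchStopCondition, stopAfterSteps, stopOnTool, and anyOf are hypothetical and simplified, and they are not the types or helpers exported by the ai package.

```typescript
// Simplified stand-ins for illustration; not the types exported by the `ai` package.
interface SketchStep {
  toolNames: string[];
}

type SketchStopCondition = (steps: readonly SketchStep[]) => boolean;

// Stops once at least `limit` steps have run.
const stopAfterSteps = (limit: number): SketchStopCondition =>
  (steps) => steps.length >= limit;

// Stops once the most recent step called the named tool.
const stopOnTool = (toolName: string): SketchStopCondition =>
  (steps) => {
    const last = steps[steps.length - 1];
    return last !== undefined && last.toolNames.some((name) => name === toolName);
  };

// Stops as soon as any one of the given conditions is satisfied.
const anyOf = (...conditions: SketchStopCondition[]): SketchStopCondition =>
  (steps) => conditions.some((condition) => condition(steps));
```

In this model, anyOf(stopAfterSteps(50), stopOnTool("response")) would end the run either at the step cap or once the final summary tool is called, whichever comes first.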
Testing Methodology
The TargetedPentestAgent follows a structured testing methodology:
PLAN
States objectives and outlines testing plan before executing any tools. Describes which attack techniques, payloads, and tools will be used.
VERIFY
Confirms the target endpoint exists and is reachable. Understands basic behavior (response format, parameters, auth requirements).
PREPARE
Researches applicable payloads and attack techniques. Crafts payloads tailored to the target’s technology and behavior.
TEST
Executes targeted attacks methodically, one payload/technique at a time. Observes responses carefully for vulnerability indicators.
EXPLOIT
When a vulnerability is confirmed, creates a proof-of-concept script that reliably demonstrates it.
DOCUMENT
Documents every confirmed finding with evidence, impact assessment, and remediation steps.
FINISH
After testing ALL objectives, calls the response tool to submit final summary.
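The phase order above can be written down as a small ordered list. This is only a sketch of the sequence as documented: PHASES and nextPhase are hypothetical names, and the linear order is a simplification, since the middle phases presumably repeat for each objective before FINISH is reached.

```typescript
// Phase names taken from the methodology above, in the order listed.
const PHASES = [
  "PLAN",
  "VERIFY",
  "PREPARE",
  "TEST",
  "EXPLOIT",
  "DOCUMENT",
  "FINISH",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once FINISH is reached.
function nextPhase(current: Phase): Phase | null {
  const index = PHASES.indexOf(current);
  return PHASES[index + 1] ?? null;
}
```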
Testing Modes
Standard mode: blackbox penetration testing.
- Test for common vulnerabilities
- Document confirmed findings
- Provide remediation guidance
- No data extraction focus
Exfil mode: red team / CTF testing with a data extraction focus.
- Find and exploit vulnerabilities
- Pivot through confirmed vulnerabilities
- Discover internal services and resources
- Extract sensitive data (flags, secrets, etc.)
- Demonstrate full impact
Enable exfil mode by setting exfilMode: true in the session config (session.config.exfilMode).
Rate Limiting Handling
The agent automatically handles rate limiting:
- Detects HTTP 429 responses
- Retries with escalating backoff delays: 5 seconds before the first retry, 30 seconds before the second, and 120 seconds before the third
- After 3 attempts, notes the rate limiting in the summary and moves to the next objective
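The retry schedule above can be sketched as a small wrapper. This is an illustration, not the agent's implementation: sendWithBackoff, StatusResponse, and RetryOutcome are hypothetical names, and the sketch assumes "3 attempts" means three retries after the initial request. The sleep function is injected so the caller controls how waiting happens.

```typescript
// Delays from the schedule above: 5 s, 30 s, then 120 s.
const RETRY_DELAYS_MS: readonly number[] = [5000, 30000, 120000];

// Minimal response shape for this sketch; only the status code matters here.
interface StatusResponse {
  status: number;
}

type RetryOutcome<T> =
  | { kind: "ok"; response: T }
  | { kind: "rate_limited"; retries: number };

// Calls `send`, retrying on HTTP 429 according to RETRY_DELAYS_MS.
// Gives up with "rate_limited" if the last retry is still answered with 429.
async function sendWithBackoff<T extends StatusResponse>(
  send: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
): Promise<RetryOutcome<T>> {
  let response = await send();
  for (const delay of RETRY_DELAYS_MS) {
    if (response.status !== 429) break;
    await sleep(delay);
    response = await send();
  }
  return response.status === 429
    ? { kind: "rate_limited", retries: RETRY_DELAYS_MS.length }
    : { kind: "ok", response };
}
```

In real use, sleep could be a timer-based promise; a caller receiving "rate_limited" would record that in its summary and move on to the next objective.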
Best Practices
Blackbox Only: The agent must NEVER read, view, or analyze source code. It performs blackbox testing through external interaction only.
PoC Required: Always create a working proof-of-concept script BEFORE calling document_vulnerability. The tool is only for confirmed, exploitable vulnerabilities.
Browser Tools: Use browser_screenshot to capture evidence of successful exploits (XSS alerts, error pages, authentication bypass, etc.)
No False Positives: Never document a vulnerability without confirming it. If unable to confirm, describe it in the final response summary instead.
Convenience Runner
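import { runPentestAgent } from "@pensar/apex/agents/specialized/pentest";
import { createSession } from "@pensar/apex/session";
// The runner needs a session, created the same way as in the examples above
const session = await createSession({
name: "Convenience Runner Test",
targets: ["https://example.com"],
});
const result = await runPentestAgent({
target: "https://example.com",
objectives: ["Test for XSS on /search"],
model: "claude-sonnet-4-20250514",
session,
});
// Automatically logs progress and results to console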