TargetedPentestAgent

Overview

The TargetedPentestAgent is a pentest-focused specialization of the OffensiveSecurityAgent. It performs targeted security assessments against specific objectives, documents vulnerabilities with proof-of-concept exploits, and provides detailed remediation guidance. Unlike reconnaissance agents that perform broad discovery, the TargetedPentestAgent focuses deeply on testing specific endpoints or objectives that have already been identified.

Key Features

Objective-Driven Testing: Tests specific objectives (e.g., “Test for SQL injection on /api/users”)
Blackbox Methodology: No access to source code — tests purely through external interaction
Automatic PoC Generation: Creates proof-of-concept scripts for confirmed vulnerabilities
Findings Deduplication: Integrates with FindingsRegistry to prevent duplicate reporting
Session Management: Supports authenticated testing with session cookies/headers
Structured Results: Returns typed results with all findings and file paths

Constructor

new TargetedPentestAgent(opts: PentestAgentInput)

opts: PentestAgentInput (required) - Configuration object for the pentest agent

PentestAgentInput

model: AIModel (required) - AI model identifier (e.g., "claude-sonnet-4-20250514")
session: SessionInfo (required) - Session providing paths for findings, POCs, logs, etc.
target: string (required) - The URL / host to test
objectives: string[] (required) - One or more testing objectives (e.g., “Test for SQL injection on /api/users”)
authConfig: AIAuthConfig - Optional per-provider API key overrides
onStepFinish: StreamTextOnStepFinishCallback<ToolSet> - Callback fired after each agent step
abortSignal: AbortSignal - AbortSignal to cancel the agent mid-run
sandbox: UnifiedSandbox - When set, tools execute inside this sandbox instead of locally
findingsRegistry: FindingsRegistry - Shared findings registry for cross-agent dedup
callbacks: ConsumeCallbacks - Callbacks for stream events
credentialManager: CredentialManager - In-memory credential store for secret-free agent prompts
stopWhen: StopCondition<ToolSet> - Override the default stop condition
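
The abortSignal option takes a standard AbortSignal, so any way of producing one should work. As a minimal sketch, the helper below builds a signal that aborts on its own after a deadline and can also be aborted by hand. createDeadlineSignal and its return shape are hypothetical names for illustration, not part of @pensar/apex.

```typescript
// Hypothetical helper (not part of @pensar/apex): an AbortSignal that fires
// after `timeoutMs`, plus handles to abort early or to clear the timer.
interface DeadlineSignal {
  signal: AbortSignal;
  cancel: (reason?: string) => void;
  dispose: () => void;
}

function createDeadlineSignal(timeoutMs: number): DeadlineSignal {
  const controller = new AbortController();
  // Abort automatically once the deadline passes.
  const timer = setTimeout(
    () => controller.abort(new Error(`deadline of ${timeoutMs} ms exceeded`)),
    timeoutMs,
  );
  return {
    signal: controller.signal,
    // Abort right now and stop the pending timer.
    cancel: (reason = "cancelled by caller") => {
      clearTimeout(timer);
      controller.abort(new Error(reason));
    },
    // Clear the timer without aborting (call after the run finishes).
    dispose: () => clearTimeout(timer),
  };
}
```

In use, signal would be passed as abortSignal in the constructor options, and dispose() called once consume() resolves so the timer does not keep the process alive.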

Result Type

The consume() method returns a PentestResult:

findings: Finding[] - All findings discovered during the run
findingsPath: string - Absolute path to the session’s findings directory
pocsPath: string - Absolute path to the session’s POC scripts directory

Finding Object

title: string - Short descriptive title for the vulnerability
severity: 'CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW' - Severity level of the vulnerability
description: string - Detailed description of the vulnerability
impact: string - Explanation of the potential impact if exploited
evidence: string - Evidence demonstrating the vulnerability (request/response, screenshots, etc.)
endpoint: string - The affected endpoint or URL
pocPath: string - Path to the proof-of-concept script
remediation: string - Step-by-step remediation guidance
references: string - External references (OWASP, CVEs, documentation)
toolCallDescription: string - Description of why this vulnerability was documented
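
Because severity is a closed set of four values, results are easy to triage after a run. The sketch below mirrors only the three Finding fields it needs in a local FindingLike type; the real types ship with the package. sortBySeverity and countBySeverity are hypothetical helpers.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// Local subset of the documented Finding fields used by this sketch.
interface FindingLike {
  title: string;
  severity: Severity;
  endpoint: string;
}

// Lower rank sorts first, so CRITICAL findings lead the report.
const SEVERITY_RANK: Record<Severity, number> = {
  CRITICAL: 0,
  HIGH: 1,
  MEDIUM: 2,
  LOW: 3,
};

// Returns a sorted copy; the input array is left untouched.
function sortBySeverity<T extends FindingLike>(findings: readonly T[]): T[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}

// Tallies findings per severity level, including levels with zero hits.
function countBySeverity(
  findings: readonly FindingLike[],
): Record<Severity, number> {
  const counts: Record<Severity, number> = { CRITICAL: 0, HIGH: 0, MEDIUM: 0, LOW: 0 };
  for (const finding of findings) counts[finding.severity] += 1;
  return counts;
}
```

Both helpers accept the findings array from a PentestResult as-is, since a full Finding carries the three fields used here.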

Active Tools

The TargetedPentestAgent uses the following tools:

execute_command - Run exploit scripts and security testing tools
http_request - Send targeted HTTP requests to test endpoints
document_vulnerability - Document confirmed vulnerabilities
create_poc - Create proof-of-concept exploit scripts
response - Submit final testing summary (auto-injected)
browser_navigate - Navigate to pages for evidence collection
browser_snapshot - Capture DOM snapshots
browser_screenshot - Take screenshots for evidence
browser_click - Interact with page elements for interactive testing
browser_fill - Fill form fields for form-based attacks (XSS, injection)
email_list_inboxes - List available email inboxes (if configured)
email_list_messages - List messages (for testing email-based flows)
email_search_messages - Search for specific messages
email_get_message - Retrieve full message content

Usage Examples

Basic Vulnerability Testing

import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";

const session = await createSession({
  name: "SQL Injection Test",
  targets: ["https://api.example.com"],
});

const agent = new TargetedPentestAgent({
  target: "https://api.example.com",
  objectives: [
    "Test for SQL injection on /api/users endpoint",
    "Test for authentication bypass",
  ],
  model: "claude-sonnet-4-20250514",
  session,
});

const { findings, findingsPath, pocsPath } = await agent.consume({
  onTextDelta: (delta) => {
    process.stdout.write(delta.text);
  },
  onToolCall: (delta) => {
    console.log(`→ ${delta.toolName}`);
  },
});

console.log(`\nFound ${findings.length} vulnerabilities`);
console.log(`Findings: ${findingsPath}`);
console.log(`POCs: ${pocsPath}`);

// Print findings summary
for (const finding of findings) {
  console.log(`\n[${finding.severity}] ${finding.title}`);
  console.log(`Endpoint: ${finding.endpoint}`);
  console.log(`PoC: ${finding.pocPath}`);
}

Authenticated Testing

import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
import { mkdirSync, writeFileSync } from "fs";
import { dirname, join } from "path";

const session = await createSession({
  name: "Authenticated API Test",
  targets: ["https://api.example.com"],
});

// Create auth-data.json with authenticated session
const authDataPath = join(session.rootPath, "auth", "auth-data.json");
// Make sure the auth directory exists before writing (no-op if it already does)
mkdirSync(dirname(authDataPath), { recursive: true });
writeFileSync(authDataPath, JSON.stringify({
  authenticated: true,
  cookies: "session_id=abc123; token=xyz789",
  headers: {
    "Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
    "X-API-Key": "secret_key_123",
  },
  strategy: "bearer_token",
}));

const agent = new TargetedPentestAgent({
  target: "https://api.example.com",
  objectives: [
    "Test for IDOR on /api/users/{id} endpoint",
    "Test for privilege escalation",
  ],
  model: "claude-sonnet-4-20250514",
  session,
});

const { findings } = await agent.consume({
  onTextDelta: (delta) => process.stdout.write(delta.text),
});
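
The auth-data.json written above carries a cookie string and a header map. How the agent consumes the file is not described on this page. The sketch below only illustrates one plausible way such data maps onto request headers, which can be handy for sanity-checking the file before a run. The AuthData interface is inferred from the example (the optional markers are an assumption), and buildAuthHeaders is a hypothetical helper.

```typescript
// Shape inferred from the auth-data.json example; optional markers are assumed.
interface AuthData {
  authenticated: boolean;
  cookies?: string;
  headers?: Record<string, string>;
  strategy?: string;
}

// Hypothetical helper: merges the stored headers and cookie string into one
// header map, or returns nothing when the session is not authenticated.
function buildAuthHeaders(auth: AuthData): Record<string, string> {
  if (!auth.authenticated) return {};
  const merged: Record<string, string> = { ...(auth.headers ?? {}) };
  if (auth.cookies !== undefined && auth.cookies.trim() !== "") {
    merged["Cookie"] = auth.cookies;
  }
  return merged;
}
```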

With Findings Registry (Deduplication)

import { TargetedPentestAgent } from "@pensar/apex";
import { FindingsRegistry } from "@pensar/apex/findings";
import { createSession } from "@pensar/apex/session";

const session = await createSession({
  name: "Multi-Agent Test",
  targets: ["https://example.com"],
});

const registry = new FindingsRegistry();

// First agent
const agent1 = new TargetedPentestAgent({
  target: "https://example.com",
  objectives: ["Test /api/login for SQL injection"],
  model: "claude-sonnet-4-20250514",
  session,
  findingsRegistry: registry,
});

const result1 = await agent1.consume();
console.log(`Agent 1 found ${result1.findings.length} vulnerabilities`);

// Second agent with same registry - won't duplicate findings
const agent2 = new TargetedPentestAgent({
  target: "https://example.com",
  objectives: ["Test /api/users for injection vulnerabilities"],
  model: "claude-sonnet-4-20250514",
  session,
  findingsRegistry: registry,
});

const result2 = await agent2.consume();
console.log(`Agent 2 found ${result2.findings.length} new vulnerabilities`);
console.log(`Total unique findings: ${registry.size}`);
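
To make the deduplication idea concrete, here is a minimal sketch of a registry that keys findings on a normalized endpoint and title. It is a hypothetical stand-in for illustration only. The real FindingsRegistry may match duplicates in a different way, and only its size property is confirmed by the example above.

```typescript
// Hypothetical stand-in; NOT the real FindingsRegistry implementation.
interface DedupFinding {
  title: string;
  endpoint: string;
}

class SketchFindingsRegistry {
  private readonly seen = new Map<string, DedupFinding>();

  // Normalizes cosmetic differences: trailing slashes on the endpoint,
  // letter case and repeated whitespace in the title.
  private static keyOf(finding: DedupFinding): string {
    const endpoint = finding.endpoint.trim().replace(/\/+$/, "");
    const title = finding.title.trim().toLowerCase().replace(/\s+/g, " ");
    return `${endpoint}::${title}`;
  }

  // Returns true when the finding is new, false when it is a duplicate.
  register(finding: DedupFinding): boolean {
    const key = SketchFindingsRegistry.keyOf(finding);
    if (this.seen.has(key)) return false;
    this.seen.set(key, finding);
    return true;
  }

  get size(): number {
    return this.seen.size;
  }
}
```

Sharing one instance across agents is what makes the second agent's report skip anything the first agent already recorded.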

With Sandbox Execution

import { TargetedPentestAgent } from "@pensar/apex";
import { createSandbox } from "@pensar/apex/sandbox";
import { createSession } from "@pensar/apex/session";

const session = await createSession({
  name: "Sandboxed Test",
  targets: ["https://example.com"],
});

const sandbox = await createSandbox({
  type: "docker",
  image: "pentest-tools:latest",
});

const agent = new TargetedPentestAgent({
  target: "https://example.com",
  objectives: ["Test for RCE on /api/upload"],
  model: "claude-sonnet-4-20250514",
  session,
  sandbox, // All execute_command and create_poc calls run in sandbox
});

const { findings } = await agent.consume({
  onTextDelta: (delta) => process.stdout.write(delta.text),
});

await sandbox.cleanup();

Exfil Mode (CTF/Red Team)

import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";

const session = await createSession({
  name: "CTF Challenge",
  targets: ["https://ctf.example.com"],
  config: {
    exfilMode: true, // Enables data extraction focus
    outcomeGuidance: "Find and extract the flag in format FLAG{...}",
  },
});

const agent = new TargetedPentestAgent({
  target: "https://ctf.example.com",
  objectives: [
    "Find SSRF vulnerability",
    "Pivot through SSRF to access internal services",
    "Extract the flag",
  ],
  model: "claude-sonnet-4-20250514",
  session,
});

const { findings } = await agent.consume({
  onTextDelta: (delta) => process.stdout.write(delta.text),
});

Custom Stop Condition

import { TargetedPentestAgent } from "@pensar/apex";
import { createSession } from "@pensar/apex/session";
import { stepCountIs } from "ai";

const session = await createSession({
  name: "Limited Test",
  targets: ["https://example.com"],
});

const agent = new TargetedPentestAgent({
  target: "https://example.com",
  objectives: ["Quick XSS scan"],
  model: "claude-sonnet-4-20250514",
  session,
  stopWhen: stepCountIs(50), // Stop after 50 steps regardless
});

const { findings } = await agent.consume();
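
Since stopWhen overrides the default stop condition, a custom condition may want to combine a step budget with stopping once the agent submits its summary through the response tool. The sketch below shows that combination using local stand-in types. It assumes a stop condition is a predicate over the steps taken so far and that each step lists its tool calls by toolName, as in the ai package; StepLike, StopConditionLike, and the three helpers are hypothetical names.

```typescript
// Local stand-ins for the step and stop-condition shapes (assumed, see above).
interface StepLike {
  toolCalls: { toolName: string }[];
}
type StopConditionLike = (options: { steps: StepLike[] }) => boolean;

// Stops once the run has taken at least `max` steps.
function stopAfterSteps(max: number): StopConditionLike {
  return ({ steps }) => steps.length >= max;
}

// Stops when the most recent step called the named tool.
function stopOnTool(toolName: string): StopConditionLike {
  return ({ steps }) => {
    const last = steps[steps.length - 1];
    return last !== undefined && last.toolCalls.some((call) => call.toolName === toolName);
  };
}

// Stops as soon as any of the given conditions holds.
function anyOf(...conditions: StopConditionLike[]): StopConditionLike {
  return (options) => conditions.some((condition) => condition(options));
}

// Example: a 50-step budget, or earlier if the summary has been submitted.
const budgetOrSummary = anyOf(stopAfterSteps(50), stopOnTool("response"));
```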

Testing Methodology

The TargetedPentestAgent follows a structured testing methodology:

PLAN

States the objectives and outlines a testing plan before executing any tools. Describes which attack techniques, payloads, and tools will be used.

VERIFY

Confirms the target endpoint exists and is reachable. Understands basic behavior (response format, parameters, auth requirements).

PREPARE

Researches applicable payloads and attack techniques. Crafts payloads tailored to the target’s technology and behavior.

TEST

Executes targeted attacks methodically, one payload/technique at a time. Observes responses carefully for vulnerability indicators.

EXPLOIT

When a vulnerability is confirmed, creates a proof-of-concept script that reliably demonstrates it.

DOCUMENT

Documents every confirmed finding with evidence, impact assessment, and remediation steps.

FINISH

After testing ALL objectives, calls the response tool to submit the final summary.

Testing Modes

Standard Mode

Standard blackbox penetration testing:

Test for common vulnerabilities
Document confirmed findings
Provide remediation guidance
No data extraction focus

Exfil Mode

Red team / CTF mode with data extraction focus:

Find and exploit vulnerabilities
Pivot through confirmed vulnerabilities
Discover internal services and resources
Extract sensitive data (flags, secrets, etc.)
Demonstrate full impact

Enable with session.config.exfilMode = true

Rate Limiting Handling

The agent automatically handles rate limiting:

Detects HTTP 429 responses
Backs off with increasing delays between retries:
- First retry: 5 seconds
- Second retry: 30 seconds
- Third retry: 120 seconds
After 3 attempts, notes rate limiting in summary and moves to next objective
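
The retry schedule above can be sketched as a small wrapper around any request function. This is an illustration of the described behavior, not the agent's actual code. It reads "3 attempts" as three retries after the initial request, which is an assumption, and sendWithBackoff, StatusResponse, and the injectable sleep parameter are hypothetical names.

```typescript
// Delays from the list above: 5 s, 30 s, then 120 s.
const RETRY_DELAYS_MS = [5_000, 30_000, 120_000] as const;

// Anything with an HTTP status code works as a response here.
interface StatusResponse {
  status: number;
}

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Sends a request and retries on HTTP 429 using the schedule above.
// `rateLimited` is true when the final response is still a 429, which is
// the point at which the caller would note it and move to the next objective.
async function sendWithBackoff<T extends StatusResponse>(
  send: () => Promise<T>,
  sleep: Sleep = realSleep, // injectable so tests do not have to wait
): Promise<{ response: T; rateLimited: boolean; retries: number }> {
  let response = await send();
  let retries = 0;
  for (const delayMs of RETRY_DELAYS_MS) {
    if (response.status !== 429) break;
    await sleep(delayMs);
    response = await send();
    retries += 1;
  }
  return { response, rateLimited: response.status === 429, retries };
}
```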

Best Practices

Blackbox Only: The agent must NEVER read, view, or analyze source code. It performs blackbox testing through external interaction only.

PoC Required: Always create a working proof-of-concept script BEFORE calling document_vulnerability. That tool is only for confirmed, exploitable vulnerabilities.

Browser Tools: Use browser_screenshot to capture evidence of successful exploits (XSS alerts, error pages, authentication bypass, etc.).

No False Positives: Never document a vulnerability without confirming it. If unable to confirm, describe it in the final response summary instead.

Convenience Runner

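import { runPentestAgent } from "@pensar/apex/agents/specialized/pentest";
import { createSession } from "@pensar/apex/session";

// The runner needs a session, created the same way as in the examples above
const session = await createSession({
  name: "XSS Test",
  targets: ["https://example.com"],
});

const result = await runPentestAgent({
  target: "https://example.com",
  objectives: ["Test for XSS on /search"],
  model: "claude-sonnet-4-20250514",
  session,
});

// Automatically logs progress and results to console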
