Overview

Attack surface discovery is the reconnaissance phase where Pensar Apex identifies all entry points, endpoints, services, and authentication flows in your application. This phase is critical because it determines what gets tested in subsequent penetration testing phases. Pensar Apex supports two discovery modes:

Blackbox Mode

Probes a live target from the outside with no source code access. Mimics how an external attacker would discover your application.

Whitebox Mode

Analyzes source code directly to extract routes, endpoints, and authentication flows. Provides complete coverage of your API surface.

Blackbox Attack Surface Discovery

In blackbox mode, the agent treats your application as a completely opaque system and discovers its attack surface through external observation.

Discovery Phases

The blackbox attack surface agent follows a systematic methodology:
1. Authentication (if credentials provided)

If you provide credentials, the agent authenticates first to discover protected endpoints and authenticated functionality.
const session = await sessions.create({
  name: "Blackbox Test",
  targets: ["https://example.com"],
  config: {
    authCredentials: {
      username: "testuser",
      password: "testpass",
      loginUrl: "https://example.com/login",
    },
  },
});
2. Subdomain Enumeration (optional)

When enabled, the agent discovers subdomains using:
  • DNS brute-forcing with wordlists
  • Certificate Transparency logs
  • DNS zone transfers (if misconfigured)
config: {
  enumerateSubdomains: true,
}
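To make the wordlist approach concrete, here is a minimal sketch of how candidate hostnames could be generated before any DNS lookups are made. The function name `candidateSubdomains` and the sample wordlist are hypothetical and are not part of the Pensar Apex API; how the agent builds its own candidates is not documented here.

```typescript
/**
 * Build candidate hostnames by prefixing each wordlist entry to a domain.
 * Hypothetical helper for illustration only.
 */
function candidateSubdomains(domain: string, words: string[]): string[] {
  const seen = new Set<string>();
  for (const word of words) {
    const label = word.trim().toLowerCase();
    // Keep only syntactically valid DNS labels: letters, digits, inner hyphens.
    if (/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/.test(label)) {
      seen.add(`${label}.${domain}`);
    }
  }
  return Array.from(seen);
}

// Duplicates and invalid labels are dropped before any lookup happens.
const candidates = candidateSubdomains("example.com", ["api", "API ", "bad_label", "dev"]);
console.log(candidates);
```

Each candidate would then be resolved (for example with `dig`) and kept only if it answers and falls within the configured scope.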
3. Service Discovery

The agent probes for running services using:
  • Port scanning (nmap)
  • HTTP/HTTPS probing
  • Service fingerprinting
  • Technology detection
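As an illustration of turning port-scan output into structured data, the sketch below parses nmap's grepable format (produced with `-oG -`). It assumes that output format is used; the flags the agent actually passes to nmap and how it reads the results are not documented here, and `parseGrepableNmap` is a hypothetical name.

```typescript
interface OpenPort {
  host: string;
  port: number;
  protocol: string;
  service: string;
}

/** Parse `nmap -oG -` output and return every port reported as open. */
function parseGrepableNmap(output: string): OpenPort[] {
  const found: OpenPort[] = [];
  for (const line of output.split("\n")) {
    // Port lines look like: "Host: 10.0.0.5 (name)<TAB>Ports: 22/open/tcp//ssh///, ..."
    const hostMatch = /^Host:\s+(\S+)/.exec(line);
    const portsMatch = /Ports:\s+([^\t]+)/.exec(line);
    if (!hostMatch || !portsMatch) continue;

    for (const entry of portsMatch[1].split(",")) {
      // Fields: port/state/protocol/owner/service/rpcinfo/version/
      const [port, state, protocol, , service] = entry.trim().split("/");
      if (state === "open") {
        found.push({
          host: hostMatch[1],
          port: Number(port),
          protocol,
          service: service || "unknown",
        });
      }
    }
  }
  return found;
}
```

This simple split does not handle version strings that contain commas, so treat it as a starting point rather than a complete parser.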
4. Web Crawling

For web applications, the agent:
  • Crawls HTML pages and follows links
  • Executes JavaScript to discover SPA routes
  • Extracts API endpoints from JavaScript bundles
  • Maps authentication and form flows
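Extracting endpoints from a JavaScript bundle can be approximated with a pattern scan over string literals. The sketch below is an assumption about one way to do this, not the agent's actual technique; the function name `extractApiPaths` and the set of path prefixes it looks for (`/api`, `/graphql`, `/v<N>`) are illustrative choices.

```typescript
/** Collect string literals in bundled JavaScript that look like API paths. */
function extractApiPaths(bundle: string): string[] {
  // A quote or backtick followed by a path that starts with /api, /graphql or /v<N>.
  // The path ends at the closing quote, whitespace, "?" or "#".
  const pattern = /["'`](\/(?:api|graphql|v\d+)\b[^"'`\s?#]*)/g;
  const paths = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(bundle)) !== null) {
    // Normalize template placeholders such as ${id} to a generic :param segment.
    paths.add(match[1].replace(/\$\{[^}]*\}/g, ":param"));
  }
  return Array.from(paths).sort();
}
```

A scan like this produces candidates only. Each path still needs to be requested against the live target to confirm that it exists.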
5. API Endpoint Discovery

The agent discovers API endpoints through:
  • JavaScript source analysis
  • Common path enumeration
  • OpenAPI/Swagger discovery
  • GraphQL introspection
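Once an OpenAPI or Swagger document has been found, its `paths` object can be flattened into a list of operations. The sketch below shows that step under the assumption that the document has already been fetched and parsed as JSON; `listOperations` and the trimmed-down `OpenApiDocument` type are hypothetical names for illustration.

```typescript
const HTTP_METHODS = ["get", "put", "post", "delete", "options", "head", "patch", "trace"];

interface OpenApiDocument {
  paths?: { [path: string]: { [key: string]: unknown } };
}

interface Operation {
  method: string;
  path: string;
}

/** Flatten an OpenAPI `paths` object into method + path pairs. */
function listOperations(spec: OpenApiDocument): Operation[] {
  const operations: Operation[] = [];
  const paths = spec.paths || {};
  for (const path of Object.keys(paths)) {
    for (const key of Object.keys(paths[path])) {
      // Path items also hold non-operation keys such as "parameters" or "summary".
      if (HTTP_METHODS.indexOf(key.toLowerCase()) !== -1) {
        operations.push({ method: key.toUpperCase(), path });
      }
    }
  }
  return operations;
}
```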
6. Asset Documentation

All discoveries are documented using the document_asset tool:
  • Domains and subdomains
  • Open ports and services
  • Web pages and routes
  • API endpoints
  • Authentication mechanisms

Blackbox Agent Configuration

import { BlackboxAttackSurfaceAgent } from "@/core/agents/specialized/attackSurface";
import { sessions } from "@/core/session";

const session = await sessions.create({
  name: "Attack Surface Analysis",
  targets: ["https://example.com"],
  config: {
    // Optional: Provide credentials for authenticated discovery
    authCredentials: {
      username: "testuser",
      password: "testpass",
      loginUrl: "https://example.com/login",
    },
    // Optional: Enable subdomain enumeration
    enumerateSubdomains: true,
    // Optional: Restrict scope to specific hosts
    scopeConstraints: {
      strictScope: true,
      allowedHosts: ["example.com", "*.example.com"],
    },
  },
});

const agent = new BlackboxAttackSurfaceAgent({
  target: "https://example.com",
  model: "claude-sonnet-4-20250514",
  session,
});

const result = await agent.consume({
  onTextDelta: (d) => process.stdout.write(d.text),
  onToolCall: (d) => console.log(`→ ${d.toolName}`),
});

console.log(`Discovered ${result.results?.discoveredAssets.length ?? 0} assets`);
console.log(`Identified ${result.targets.length} high-priority targets`);

Blackbox Discovery Tools

The blackbox agent uses tools for the following tasks:
Runs reconnaissance commands:
  • nmap for port scanning
  • dig for DNS queries
  • curl for HTTP probing
  • subfinder for subdomain enumeration
Loads web pages in a headless browser to:
  • Execute JavaScript and discover SPA routes
  • Capture rendered content
  • Follow navigation flows
Captures the DOM to:
  • Extract links and forms
  • Identify authentication mechanisms
  • Map page structure
Records discovered assets:
{
  tool: "document_asset",
  args: {
    assetType: "api-endpoint",
    identifier: "https://api.example.com/v1/users",
    description: "User management API endpoint",
    details: {
      method: "GET",
      requiresAuth: true,
      technology: "REST",
    },
  },
}
Generates the final report when discovery is complete. This tool triggers the stop condition.

Whitebox Attack Surface Discovery

In whitebox mode, the agent analyzes your application’s source code and reads the attack surface directly from route definitions instead of inferring it from observed responses.

How It Works

import { runAttackSurfaceAgent } from "@/core/api/attackSurface";

const result = await runAttackSurfaceAgent({
  target: "https://example.com",
  cwd: "/path/to/source",  // Presence of cwd enables whitebox mode
  model: "claude-sonnet-4-20250514",
  session,
});
When cwd is provided, Pensar Apex:
  1. Detects the framework (Express, FastAPI, Rails, Django, etc.)
  2. Extracts routes from framework-specific routing files
  3. Maps endpoints to their HTTP methods and parameters
  4. Identifies authentication requirements and middleware
  5. Cross-references with the live target to verify accessibility
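To give a feel for step 2, the sketch below pulls route registrations out of Express-style source with a regular expression. This is a deliberately crude stand-in: it only recognizes calls on objects named `app` or `router`, it would also match commented-out code, and the agent's real extraction logic is not documented here. `extractExpressRoutes` is a hypothetical name.

```typescript
interface ExtractedRoute {
  method: string;
  path: string;
  line: number;
}

/** Find `app.get("/x", ...)` and `router.post('/y', ...)` style registrations. */
function extractExpressRoutes(source: string): ExtractedRoute[] {
  const pattern = /\b(?:app|router)\.(get|post|put|patch|delete)\(\s*["'`]([^"'`]+)["'`]/;
  const routes: ExtractedRoute[] = [];
  source.split("\n").forEach((text, index) => {
    const match = pattern.exec(text);
    if (match) {
      // Line numbers are 1-based, in the style of "src/routes/users.js:12".
      routes.push({ method: match[1].toUpperCase(), path: match[2], line: index + 1 });
    }
  });
  return routes;
}
```

A production-grade extractor would more likely walk the syntax tree so that it can follow mounted routers and middleware chains.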

Supported Frameworks

Express.js

  • Route definitions
  • Middleware chains
  • REST and GraphQL

FastAPI

  • Path operations
  • Pydantic schemas
  • OAuth2 flows

Django

  • URL patterns
  • Class-based views
  • Django REST Framework

Ruby on Rails

  • routes.rb definitions
  • Controller actions
  • API mode endpoints

Spring Boot

  • @RequestMapping
  • @RestController
  • Spring Security

Next.js

  • App Router routes
  • API routes
  • Server Actions

Whitebox Output Example

{
  "summary": {
    "totalApps": 1,
    "totalApiEndpoints": 42,
    "totalPages": 15,
    "analysisComplete": true
  },
  "applications": [
    {
      "name": "Main API",
      "framework": "Express.js",
      "endpoints": [
        {
          "path": "/api/users",
          "method": "GET",
          "authentication": "JWT",
          "parameters": ["page", "limit"],
          "sourceLocation": "src/routes/users.js:12"
        }
      ]
    }
  ]
}

Attack Surface Output

Both modes produce an AttackSurfaceResult:
export interface AttackSurfaceResult {
  /** The full analysis results */
  results: AttackSurfaceAnalysisResults | null;
  
  /** High-priority targets for penetration testing */
  targets: PentestTarget[];
  
  /** Path to the attack-surface-results.json file */
  resultsPath: string;
  
  /** Path to the session's assets directory */
  assetsPath: string;
}

export interface AttackSurfaceAnalysisResults {
  summary: AttackSurfaceSummary;
  discoveredAssets: string[];
  targets: PentestTarget[];
  keyFindings: string[];
}

export interface PentestTarget {
  /** The target URL or endpoint */
  target: string;
  
  /** Testing objective (e.g., "Test for SQL injection") */
  objective: string;
  
  /** Why this target is high-priority */
  rationale: string;
  
  /** Authentication info (if available) */
  authenticationInfo?: {
    method: string;
    details: string;
    cookies?: string;
    headers?: string;
  };
}
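When reviewing a long `targets` array, it can help to group entries by host. The sketch below redeclares only the `PentestTarget` fields it needs so that it runs on its own; `groupTargetsByHost` is a hypothetical helper, not part of the Pensar Apex API.

```typescript
interface PentestTarget {
  target: string;
  objective: string;
  rationale: string;
}

/** Group targets by host so each host's objectives can be reviewed together. */
function groupTargetsByHost(targets: PentestTarget[]): Map<string, PentestTarget[]> {
  const groups = new Map<string, PentestTarget[]>();
  for (const t of targets) {
    let host: string;
    try {
      host = new URL(t.target).host;
    } catch {
      // `target` may be a bare path or endpoint name rather than an absolute URL.
      host = "(non-URL target)";
    }
    const bucket = groups.get(host) || [];
    bucket.push(t);
    groups.set(host, bucket);
  }
  return groups;
}
```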

What Gets Mapped

  • REST API endpoints
  • GraphQL endpoints
  • WebSocket connections
  • gRPC services
  • HTTP methods (GET, POST, PUT, DELETE, etc.)
  • Query parameters and request bodies

Scope Control

Control what the agent discovers with scope constraints:
config: {
  scopeConstraints: {
    // Strict mode: only test URLs within allowed hosts
    strictScope: true,
    
    // Allowed hosts (supports wildcards)
    allowedHosts: [
      "example.com",
      "*.example.com",
      "api.partner.com",
    ],
    
    // Allowed ports
    allowedPorts: [80, 443, 8080],
    
    // Excluded paths
    excludedPaths: [
      "/admin/delete",
      "/api/payments/charge",
    ],
  },
}
Strict scope mode prevents the agent from scanning outside allowed hosts. This is essential for bug bounty programs and production testing where scope violations can have serious consequences.
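The following sketch shows one plausible way a strict scope check could evaluate a URL against these constraints. It rests on several assumptions that this page does not confirm: that `*.example.com` matches subdomains but not the apex domain, that `excludedPaths` are prefix matches, and that a missing port means the scheme default. `isAllowed` and `hostMatches` are hypothetical names.

```typescript
interface ScopeConstraints {
  allowedHosts: string[];
  allowedPorts?: number[];
  excludedPaths?: string[];
}

/** Assumption: "*.example.com" matches subdomains only, not "example.com" itself. */
function hostMatches(host: string, pattern: string): boolean {
  if (pattern.startsWith("*.")) {
    return host.endsWith(pattern.slice(1)); // compares against ".example.com"
  }
  return host === pattern;
}

/** Return true when the URL satisfies every configured constraint. */
function isAllowed(rawUrl: string, scope: ScopeConstraints): boolean {
  const url = new URL(rawUrl);
  const host = url.hostname.toLowerCase();
  // URL.port is "" when the scheme's default port is in use.
  const port = url.port !== "" ? Number(url.port) : url.protocol === "https:" ? 443 : 80;

  if (!scope.allowedHosts.some((p) => hostMatches(host, p.toLowerCase()))) return false;
  if (scope.allowedPorts && scope.allowedPorts.indexOf(port) === -1) return false;
  // Assumption: excluded paths are treated as prefixes.
  if ((scope.excludedPaths || []).some((p) => url.pathname.startsWith(p))) return false;
  return true;
}
```

Running a handful of known in-scope and out-of-scope URLs through a check like this is a cheap way to sanity-test a scope configuration before a real run.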

Authentication During Discovery

If you provide credentials, the agent will authenticate before discovering the attack surface:
config: {
  authCredentials: {
    username: "testuser",
    password: "testpass",
    loginUrl: "https://example.com/login",
    
    // Optional: Additional credential fields
    credentialType: "username-password",
    additionalFields: {
      apiKey: "your-api-key",
      mfaToken: "123456",
    },
  },
  
  // Optional: Custom authentication instructions
  authenticationInstructions: `
    1. Navigate to the login page
    2. Fill in the username and password
    3. Click the "Sign In" button
    4. Wait for the dashboard to load
  `,
}
The agent will:
  1. Authenticate using the provided credentials
  2. Export the authenticated session (cookies, tokens)
  3. Use the session for all subsequent discovery
  4. Include authentication info with high-priority targets
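To illustrate how an exported session could be replayed on later requests, the sketch below turns an `authenticationInfo` object into request headers. It assumes the `cookies` field holds a `Cookie`-header style string such as `sessionId=abc123; userId=42`; the format of the `headers` field is not specified on this page, so the sketch leaves it alone. `parseCookies` and `sessionHeaders` are hypothetical helpers.

```typescript
interface AuthenticationInfo {
  method: string;
  details: string;
  cookies?: string;
  headers?: string;
}

/** Split a Cookie-header style string into name/value pairs. */
function parseCookies(cookieString: string): Map<string, string> {
  const jar = new Map<string, string>();
  for (const part of cookieString.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip fragments without a value
    const name = part.slice(0, eq).trim();
    if (name) jar.set(name, part.slice(eq + 1).trim());
  }
  return jar;
}

/** Build headers for a follow-up request that reuses the exported session. */
function sessionHeaders(auth: AuthenticationInfo | undefined): Record<string, string> {
  if (!auth || !auth.cookies) return {};
  const pairs: string[] = [];
  parseCookies(auth.cookies).forEach((value, name) => pairs.push(`${name}=${value}`));
  return { Cookie: pairs.join("; ") };
}
```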

Prioritization

The agent automatically prioritizes targets based on:
  • Risk factors: Admin panels, API endpoints, file uploads, authentication flows
  • Technology indicators: Outdated frameworks, known vulnerable libraries
  • Complexity: Endpoints with many parameters, complex authentication
  • Exposure: Publicly accessible vs. authenticated-only
// Example prioritized target
{
  target: "https://example.com/api/admin/users",
  objective: "Test for authorization bypass and privilege escalation",
  rationale: "Admin endpoint accessible after authentication. May be vulnerable to IDOR or missing authorization checks.",
  authenticationInfo: {
    method: "Cookie-based",
    details: "Session cookie from login flow",
    cookies: "sessionId=abc123; userId=42",
  },
}
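The factors listed above can be pictured as a simple additive score. The sketch below is purely illustrative: the weights, keywords, and the `Candidate` shape are invented for this example, and the agent's real prioritization is not described as a fixed formula anywhere on this page.

```typescript
interface Candidate {
  url: string;
  requiresAuth: boolean;
  parameterCount: number;
}

// Hypothetical risk keywords and weights, chosen only for illustration.
const RISK_KEYWORDS: Array<[RegExp, number]> = [
  [/\/admin\b/i, 5],
  [/\/(upload|files?)\b/i, 4],
  [/\/(login|auth|oauth|token)\b/i, 3],
  [/\/api\b/i, 2],
];

function scoreCandidate(c: Candidate): number {
  let score = 0;
  for (const [pattern, weight] of RISK_KEYWORDS) {
    if (pattern.test(c.url)) score += weight; // risk factors
  }
  score += Math.min(c.parameterCount, 5); // complexity, capped at 5
  if (!c.requiresAuth) score += 2; // public exposure
  return score;
}

/** Return a new array sorted from highest to lowest score. */
function prioritize(candidates: Candidate[]): Candidate[] {
  return candidates.slice().sort((a, b) => scoreCandidate(b) - scoreCandidate(a));
}
```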

Best Practices

  • Use blackbox for external security assessments and bug bounties
  • Use whitebox for internal testing and pre-deployment validation
  • Consider running both modes to compare coverage
  • Always provide credentials if the application has authentication
  • Authenticated discovery finds 3-5x more endpoints than unauthenticated
  • Include all user roles to discover role-specific endpoints
  • Enable strictScope for production environments
  • Test scope configuration with a dry run first
  • Document excluded paths and rationale
  • Manually inspect the attack surface report
  • Verify that critical endpoints are discovered
  • Check for false positives in asset list
  • Adjust scope and re-run if needed
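Verifying that critical endpoints were discovered can be partly automated. The sketch below compares a hand-maintained list of expected paths against `discoveredAssets`. It assumes each asset string contains the URL or path of the asset, which this page does not specify, and `missingCriticalEndpoints` is a hypothetical helper.

```typescript
/** Return the expected paths that do not appear in any discovered asset string. */
function missingCriticalEndpoints(discoveredAssets: string[], expected: string[]): string[] {
  // Compare case-insensitively and ignore trailing slashes.
  const normalize = (s: string): string => s.trim().toLowerCase().replace(/\/+$/, "");
  const haystack = discoveredAssets.map(normalize);
  return expected.filter((e) => {
    const needle = normalize(e);
    return !haystack.some((asset) => asset.indexOf(needle) !== -1);
  });
}
```

Because this is a substring match, `/api/users` is also satisfied by `/api/users/1`. Tighten the comparison if exact paths matter.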

Example: Complete Discovery Flow

import { sessions } from "@/core/session";
import { runAttackSurfaceAgent } from "@/core/api/attackSurface";

// Create session with authentication
const session = await sessions.create({
  name: "E-commerce Attack Surface",
  targets: ["https://shop.example.com"],
  config: {
    authCredentials: {
      username: "[email protected]",
      password: "SecurePass123!",
      loginUrl: "https://shop.example.com/login",
    },
    scopeConstraints: {
      strictScope: true,
      allowedHosts: ["shop.example.com", "api.shop.example.com"],
    },
    enumerateSubdomains: true,
  },
});

// Run discovery
const result = await runAttackSurfaceAgent({
  target: "https://shop.example.com",
  model: "claude-sonnet-4-20250514",
  session,
  callbacks: {
    onTextDelta: (d) => process.stdout.write(d.text),
  },
});

// Review results
console.log("\n=== Attack Surface Analysis ===");
console.log(`Total assets: ${result.results?.summary.totalAssets}`);
console.log(`High-priority targets: ${result.targets.length}\n`);

for (const target of result.targets) {
  console.log(`Target: ${target.target}`);
  console.log(`Objective: ${target.objective}`);
  console.log(`Rationale: ${target.rationale}\n`);
}

// Save for later testing
console.log(`Results saved to: ${result.resultsPath}`);
console.log(`Assets saved to: ${result.assetsPath}`);

Agent Architecture

Learn about the agent system that powers attack surface discovery

Penetration Testing

Use discovered targets for vulnerability testing

Session Management

Understand how sessions store discovery results

API Reference

Complete API documentation for attack surface agents
