How It Works

Strix is an autonomous penetration testing framework that combines AI agents, specialized security knowledge, and runtime sandboxing to discover vulnerabilities in your applications.

Architecture Overview

At its core, Strix orchestrates multiple components working together:

AI Agents

LLM-powered security experts that reason about targets and execute tests

Tools

Specialized capabilities for terminal, browser, proxy, file manipulation

Skills

Domain-specific security knowledge injected into agent context

Runtime

Isolated sandboxes providing safe execution environments

Execution Flow

When you start a scan, Strix follows this workflow:

1. Scan Initialization

Strix creates a root agent and initializes a sandbox environment:

# From strix/agents/StrixAgent/strix_agent.py
async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:
    user_instructions = scan_config.get("user_instructions", "")
    targets = scan_config.get("targets", [])
    
    # Process different target types
    for target in targets:
        target_type = target["type"]
        if target_type == "repository":
            # Clone and analyze repositories
        elif target_type == "web_application":
            # Test web endpoints

The root agent receives:

Target information (URLs, repositories, IP addresses)
User instructions (custom testing requirements)
Sandbox workspace (isolated environment with tools)

2. Agent Loop

Each agent operates in a continuous reasoning loop: The agent loop handles:

Iteration Management

Each agent has a maximum iteration limit (default: 300) to prevent infinite loops. As the agent approaches the limit, it receives warnings to prioritize task completion.

# From strix/agents/base_agent.py
if self.state.is_approaching_max_iterations():
    warning_msg = (
        f"URGENT: You are approaching the maximum iteration limit. "
        f"Current: {self.state.iteration}/{self.state.max_iterations}"
    )
    self.state.add_message("user", warning_msg)

State Management

Agent state tracks the full execution context:

agent_id: Unique identifier
messages: Conversation history with LLM
actions_taken: Tool invocations performed
context: Custom data storage
errors: Any failures encountered
iteration: Current loop iteration

Tool Execution

Tools execute either locally (in the Strix CLI) or remotely (in the sandbox):

# From strix/tools/executor.py
async def execute_tool(tool_name: str, agent_state, **kwargs):
    execute_in_sandbox = should_execute_in_sandbox(tool_name)
    
    if execute_in_sandbox and not sandbox_mode:
        return await _execute_tool_in_sandbox(tool_name, agent_state, **kwargs)
    
    return await _execute_tool_locally(tool_name, agent_state, **kwargs)

3. Multi-Agent Coordination

Strix can spawn specialized sub-agents for complex tasks:

# From strix/tools/agents_graph/agents_graph_actions.py
@register_tool
def create_agent(
    task: str,
    name: str,
    skills: str | None = None,
    inherit_messages: bool = False,
) -> dict[str, Any]:
    # Create new agent with specialized skills
    new_agent = StrixAgent(config={
        "state": new_state,
        "llm_config": llm_config,
    })

Sub-agents share the same workspace and proxy history but maintain independent conversation contexts. This enables parallel testing while building on previous discoveries.

4. Vulnerability Reporting

When agents discover security issues, they create structured reports:

# From strix/tools/reporting/reporting_actions.py
@register_tool
def create_vulnerability_report(
    title: str,
    description: str,
    impact: str,
    target: str,
    technical_analysis: str,
    poc_description: str,
    poc_script_code: str,  # Required: actual exploit code
    remediation_steps: str,
    cvss_breakdown: str,
    endpoint: str | None = None,
    cve: str | None = None,
    cwe: str | None = None,
):
    # Validates CVSS metrics and creates report
    # Automatically checks for duplicate findings

Reports include:

CVSS scoring (automatic calculation from metrics)
Proof-of-concept code (executable exploit)
Duplicate detection (prevents redundant findings)
Code locations (vulnerable files and line numbers)

Sandbox Architecture

Strix sandboxes provide isolated environments where agents can safely execute commands, browse applications, and test for vulnerabilities without affecting your local system.

Each sandbox includes:

Tool server: HTTP API for executing tools (terminal, browser, file operations)
Caido proxy: Intercepts and logs all HTTP/HTTPS traffic
Workspace: Shared /workspace directory for code analysis
Isolated network: Contained environment with controlled internet access

# Sandbox initialization from strix/agents/base_agent.py
runtime = get_runtime()
sandbox_info = await runtime.create_sandbox(
    self.state.agent_id,
    self.state.sandbox_token,
    self.local_sources  # Upload local code to sandbox
)

self.state.sandbox_id = sandbox_info["workspace_id"]
self.state.sandbox_info = sandbox_info

Agent Communication

Agents can send messages to each other for coordination:

<inter_agent_message>
  <sender>
    <agent_name>Auth Specialist</agent_name>
    <agent_id>agent_abc123</agent_id>
  </sender>
  <message_metadata>
    <type>information</type>
    <priority>high</priority>
  </message_metadata>
  <content>
    Found JWT token in localStorage: eyJhbGc...
    You can use this for authenticated endpoint testing.
  </content>
</inter_agent_message>

LLM Integration

Strix supports multiple LLM providers with structured output:

Anthropic Claude: Extended thinking, tool use
OpenAI GPT-4: Function calling, structured responses
Google Gemini: Multi-modal analysis
OpenRouter: Access to multiple models

# From strix/agents/base_agent.py
async for response in self.llm.generate(self.state.get_conversation_history()):
    if response.tool_invocations:
        # Execute tools requested by LLM
        await process_tool_invocations(
            response.tool_invocations,
            conversation_history,
            self.state
        )

State Persistence

Agent state is tracked throughout execution:

# From strix/agents/state.py
class AgentState(BaseModel):
    agent_id: str
    agent_name: str
    parent_id: str | None = None
    sandbox_id: str | None = None
    
    task: str
    iteration: int = 0
    max_iterations: int = 300
    completed: bool = False
    
    messages: list[dict[str, Any]]  # Full conversation
    actions_taken: list[dict[str, Any]]  # Tool executions
    errors: list[str]  # Failures encountered

This enables:

Resume from interruptions: Continue scans after pauses
Debugging: Review full execution history
Analytics: Track agent performance and behavior

Next Steps

Agents

Learn about agent types and capabilities

Tools

Explore available tools and their usage

Skills

Understand the skills system

Vulnerability Detection

See how Strix finds security issues

Get Started

Core Concepts

Usage

Skills

Integrations

Configuration

How It Works

Architecture Overview

AI Agents

Tools

Skills

Runtime

Execution Flow

1. Scan Initialization

2. Agent Loop

3. Multi-Agent Coordination

4. Vulnerability Reporting

Sandbox Architecture

Agent Communication

LLM Integration

State Persistence

Next Steps

Agents

Tools

Skills

Vulnerability Detection

Build docs developers (and LLMs) love

Get Started

Core Concepts

Usage

Skills

Integrations

Configuration

​Architecture Overview

AI Agents

Tools

Skills

Runtime

​Execution Flow

​1. Scan Initialization

​2. Agent Loop

​3. Multi-Agent Coordination

​4. Vulnerability Reporting

​Sandbox Architecture

​Agent Communication

​LLM Integration

​State Persistence

​Next Steps

Agents

Tools

Skills

Vulnerability Detection

Build docs developers (and LLMs) love

Architecture Overview

Execution Flow

1. Scan Initialization

2. Agent Loop

3. Multi-Agent Coordination

4. Vulnerability Reporting

Sandbox Architecture

Agent Communication

LLM Integration

State Persistence

Next Steps