Architecture Overview
Stagehand’s V3 architecture orchestrates multiple components that work together:
V3 Core
The main orchestrator that manages browser lifecycle, handles method routing, and coordinates between all components.
Handlers
Specialized classes (ActHandler, ExtractHandler, ObserveHandler) that translate user instructions into browser actions.
Context & Pages
Manages CDP connections, frame trees, and page lifecycle across both local Chrome and Browserbase.
Cache System
Self-healing cache that replays successful actions without LLM calls.
The AI + Code Pipeline
When you call a Stagehand method, here’s what happens:
1. Instruction Processing
You provide a natural language instruction, for example "click the login button".
2. DOM Snapshot Capture
Stagehand captures a hybrid accessibility tree that combines:
- Semantic structure from the accessibility tree
- Interactive elements from the DOM
- Shadow DOM piercing to access elements inside web components
v3.ts:155-158
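One way to picture the hybrid tree is as a merge of two node lists keyed by element id. The sketch below is only an illustration under assumed shapes: `AxNode`, `DomNode`, `HybridNode`, and both functions are hypothetical names, not Stagehand's internal types or API.

```typescript
// Hypothetical shapes for illustration; not Stagehand's internal types.
interface AxNode {
  id: string;
  role: string;
  name: string;
}

interface DomNode {
  id: string;
  tag: string;
  interactive: boolean;
  children: DomNode[];
  shadowRoot?: DomNode[]; // nodes that live inside a shadow root
}

interface HybridNode {
  id: string;
  role: string;
  name: string;
  tag?: string;
}

// Walk the DOM depth-first, descending into shadow roots so that
// elements inside web components are visited too.
function collectInteractive(nodes: DomNode[], out: DomNode[] = []): DomNode[] {
  for (const node of nodes) {
    if (node.interactive) out.push(node);
    collectInteractive(node.children, out);
    if (node.shadowRoot) collectInteractive(node.shadowRoot, out);
  }
  return out;
}

// Combine semantic data (role, name) with DOM data (tag), keyed by node id.
function buildHybridTree(ax: AxNode[], dom: DomNode[]): HybridNode[] {
  const byId = new Map<string, HybridNode>();
  for (const a of ax) byId.set(a.id, { id: a.id, role: a.role, name: a.name });
  for (const d of collectInteractive(dom)) {
    const existing = byId.get(d.id);
    if (existing) existing.tag = d.tag;
    else byId.set(d.id, { id: d.id, role: "generic", name: "", tag: d.tag });
  }
  return Array.from(byId.values());
}
```

Interactive elements that have no accessibility entry still appear in the result, with a placeholder role, so the model can see them.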
3. LLM Inference
The instruction and DOM snapshot are sent to the LLM with a carefully crafted prompt:
prompt.ts:150-169
The LLM responds with:
- Element selector (XPath)
- Action method (click, type, scroll, etc.)
- Arguments (text to type, keys to press, etc.)
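Because model output is untrusted text, it helps to validate it before anything touches the browser. The following is a minimal sketch built on the three fields listed above; the `InferredAction` shape, the set of allowed methods, and `parseInferredAction` are assumptions made for illustration, not Stagehand's actual schema.

```typescript
// Hypothetical response shape based on the three fields listed above.
type ActionMethod = "click" | "type" | "scroll" | "press";

interface InferredAction {
  selector: string; // XPath to the target element
  method: ActionMethod; // what to do with it
  args: string[]; // e.g. text to type or keys to press
}

const KNOWN_METHODS: ActionMethod[] = ["click", "type", "scroll", "press"];

// Validate raw LLM output before handing it to the executor.
function parseInferredAction(raw: string): InferredAction {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const { selector, method, args } = data as Record<string, unknown>;
  if (typeof selector !== "string" || selector.length === 0) {
    throw new Error("selector must be a non-empty string");
  }
  const known = KNOWN_METHODS.find((m) => m === method);
  if (known === undefined) {
    throw new Error(`unsupported method: ${String(method)}`);
  }
  // Missing or malformed arguments fall back to an empty list.
  const list: string[] = Array.isArray(args) ? args.map((a) => String(a)) : [];
  return { selector, method: known, args: list };
}
```

Rejecting unknown methods at this boundary keeps the execution step free of any guessing about what the model meant.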
4. Deterministic Execution
Once the LLM identifies the action, Stagehand executes it using deterministic browser control:
actHandler.ts:191-197
AI makes decisions about what to do
Code executes actions deterministically via CDP
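The split between deciding and doing can be sketched as a plain dispatch function. Everything here is hypothetical (`BrowserDriver`, `PlannedAction`, `executeAction`, `RecordingDriver`), and the calls are synchronous for brevity, whereas a real driver would send CDP commands asynchronously.

```typescript
// Hypothetical driver interface; a real implementation would issue CDP commands.
interface BrowserDriver {
  click(selector: string): void;
  type(selector: string, text: string): void;
  press(selector: string, key: string): void;
}

interface PlannedAction {
  selector: string;
  method: "click" | "type" | "press";
  args: string[];
}

// Map a planned action onto exactly one driver call. No model is consulted
// here, so the same plan always produces the same browser command.
function executeAction(driver: BrowserDriver, action: PlannedAction): void {
  const firstArg = action.args[0] ?? "";
  switch (action.method) {
    case "click":
      driver.click(action.selector);
      break;
    case "type":
      driver.type(action.selector, firstArg);
      break;
    case "press":
      driver.press(action.selector, firstArg);
      break;
  }
}

// Test double that records calls instead of touching a browser.
class RecordingDriver implements BrowserDriver {
  readonly calls: string[] = [];
  click(selector: string): void {
    this.calls.push(`click ${selector}`);
  }
  type(selector: string, text: string): void {
    this.calls.push(`type ${selector} "${text}"`);
  }
  press(selector: string, key: string): void {
    this.calls.push(`press ${selector} ${key}`);
  }
}
```

Passing a `RecordingDriver` makes the executor easy to check without a browser, since the recorded calls can be compared directly.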
5. Self-Healing
If an element has moved or changed, Stagehand can automatically adapt:
- Initial attempt using the cached selector fails
- Re-capture the current DOM state
- Diff the trees to find where the element moved
- Update the selector and retry
- Update the cache with the new selector
actCache.ts:261-269
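The retry steps above can be sketched as a small replay function. This is an assumption-laden illustration: `CachedAction`, `replayWithHealing`, and both callbacks are hypothetical, the `relocate` callback stands in for re-capturing and diffing the DOM, and dropping the entry on a failed repair is a design choice of this sketch rather than documented behavior.

```typescript
// Hypothetical cache entry and callbacks; names are illustrative only.
interface CachedAction {
  instruction: string;
  selector: string;
}

type TryAction = (selector: string) => boolean; // true when the action succeeded
type Relocate = (instruction: string) => string | null; // re-snapshot and find the element again

function replayWithHealing(
  cache: Map<string, CachedAction>,
  instruction: string,
  tryAction: TryAction,
  relocate: Relocate
): "hit" | "healed" | "miss" {
  const entry = cache.get(instruction);
  if (!entry) return "miss";

  // Step 1: try the cached selector first.
  if (tryAction(entry.selector)) return "hit";

  // Steps 2-3: re-capture the page state and locate the element again.
  const fresh = relocate(instruction);

  // Step 4: retry with the new selector.
  if (fresh === null || !tryAction(fresh)) {
    cache.delete(instruction); // assumption: a stale entry is dropped
    return "miss";
  }

  // Step 5: persist the repaired selector for future replays.
  cache.set(instruction, { instruction, selector: fresh });
  return "healed";
}
```

A "miss" result is where a caller would fall back to a full LLM inference.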
Browser Connection Modes
Stagehand supports two execution environments:
- Local Chrome
- Browserbase
Local Chrome is for development and testing. Stagehand launches and controls a local Chrome instance via chrome-launcher. Perfect for:
- Local development
- Debugging with visible browser
- Testing on your machine
v3.ts:717-873
CDP Connection Management
All browser control flows through a single CdpConnection managed by V3Context:
context.ts:153-172
V3Context handles target lifecycle, frame events, and OOPIF (out-of-process iframes) automatically, so you don’t have to think about it.
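To illustrate how one connection can serve many pages and frames, here is a rough model of message routing. The message fields (`id`, `method`, `params`, `result`, `sessionId`) follow the Chrome DevTools Protocol wire format; the `ConnectionRouter` class is hypothetical and is not the actual CdpConnection implementation. The `outbox` array stands in for the WebSocket.

```typescript
// Message fields follow the CDP wire format; the router class is hypothetical.
interface CdpMessage {
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
  sessionId?: string;
}

class ConnectionRouter {
  private nextId = 1;
  private readonly pending = new Map<number, (result: unknown) => void>();
  private readonly listeners = new Map<string, (msg: CdpMessage) => void>();
  readonly outbox: CdpMessage[] = []; // stands in for the WebSocket

  // Queue a command for one session and remember who is waiting for the reply.
  send(method: string, sessionId: string, onResult: (result: unknown) => void): number {
    const id = this.nextId++;
    this.pending.set(id, onResult);
    this.outbox.push({ id, method, sessionId });
    return id;
  }

  onSessionEvent(sessionId: string, listener: (msg: CdpMessage) => void): void {
    this.listeners.set(sessionId, listener);
  }

  // Responses carry an id; session events carry a method and a sessionId.
  receive(msg: CdpMessage): void {
    if (msg.id !== undefined) {
      const resolve = this.pending.get(msg.id);
      this.pending.delete(msg.id);
      if (resolve) resolve(msg.result);
    } else if (msg.sessionId !== undefined) {
      const listener = this.listeners.get(msg.sessionId);
      if (listener) listener(msg);
    }
  }
}
```

Events for sessions with no registered listener are silently ignored in this sketch; a fuller model would also handle error responses and browser-level messages that carry no `sessionId`.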
Handler Architecture
Each Stagehand method is backed by a specialized handler:

| Handler | Purpose | Returns |
|---|---|---|
| ActHandler | Performs actions (click, type, etc.) | ActResult with success status and executed actions |
| ExtractHandler | Extracts data from the page | Structured data matching your schema |
| ObserveHandler | Finds actionable elements | Array of Action objects |
| V3AgentHandler | Multi-step autonomous execution | AgentResult with full execution history |

Each handler follows the same pattern:
- Accepts high-level instructions
- Captures DOM snapshots
- Queries the LLM
- Executes deterministic actions
- Reports metrics and results
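The shared flow in the list above resembles a template method: a base class fixes the order of the steps and each handler fills them in. The sketch below is a hypothetical skeleton (`BaseHandler`, `EchoObserveHandler`, and `HandlerMetrics` are invented names), with a keyword filter standing in for the LLM query.

```typescript
// Hypothetical skeleton of the shared handler flow; not Stagehand's classes.
interface HandlerMetrics {
  llmCalls: number;
  steps: string[];
}

abstract class BaseHandler<TPlan, TResult> {
  readonly metrics: HandlerMetrics = { llmCalls: 0, steps: [] };

  protected abstract snapshot(): string;
  protected abstract infer(instruction: string, snapshot: string): TPlan;
  protected abstract apply(plan: TPlan): TResult;

  // Fixed order: capture the page, ask the model, then act deterministically.
  handle(instruction: string): TResult {
    this.metrics.steps.push("snapshot");
    const snap = this.snapshot();
    this.metrics.steps.push("infer");
    this.metrics.llmCalls += 1;
    const plan = this.infer(instruction, snap);
    this.metrics.steps.push("apply");
    return this.apply(plan);
  }
}

class EchoObserveHandler extends BaseHandler<string[], string[]> {
  protected snapshot(): string {
    return "button#save\nlink#help";
  }
  protected infer(instruction: string, snapshot: string): string[] {
    // A real handler would query the LLM here; this stub filters by keyword.
    return snapshot.split("\n").filter((line) => line.indexOf(instruction) !== -1);
  }
  protected apply(plan: string[]): string[] {
    return plan;
  }
}
```

Keeping the step order in one place means metrics are recorded the same way regardless of which handler runs.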
Event Bus System
Stagehand uses an EventEmitter for internal communication:
v3.ts:155
The event bus carries:
- Screenshot capture events during agent execution
- Page lifecycle notifications
- Error propagation across components
- Plugin hooks (future feature)
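As a rough illustration of this kind of internal messaging, the sketch below uses a tiny typed bus that stands in for Node's EventEmitter so the example stays self-contained. The event names and payload shapes (`screenshot`, `error`) are assumptions, not Stagehand's actual events.

```typescript
// Hypothetical event names and payloads.
interface BusEvents {
  screenshot: { step: number; bytes: number };
  error: { source: string; message: string };
}

type Listener<T> = (payload: T) => void;

// Minimal stand-in for an EventEmitter with typed event names.
class TypedBus<E> {
  private readonly listeners = new Map<keyof E, Listener<unknown>[]>();

  on<K extends keyof E>(event: K, listener: Listener<E[K]>): void {
    const list: Listener<unknown>[] = this.listeners.get(event) ?? [];
    list.push(listener as Listener<unknown>);
    this.listeners.set(event, list);
  }

  // Deliver the payload to every listener; returns how many were notified.
  emit<K extends keyof E>(event: K, payload: E[K]): number {
    const list: Listener<unknown>[] = this.listeners.get(event) ?? [];
    for (const listener of list) listener(payload);
    return list.length;
  }
}
```

A component would subscribe with something like `bus.on("screenshot", handler)` while the producer calls `bus.emit("screenshot", payload)`, so neither side needs a direct reference to the other.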
Metrics & Observability
Stagehand tracks detailed metrics for every LLM call:
v3.ts:241-267
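Per-call tracking of this kind usually amounts to accumulating usage numbers by method. The sketch below assumes hypothetical field names (`promptTokens`, `completionTokens`, `inferenceTimeMs`) and a hypothetical `MetricsTracker` class; the real metric names may differ.

```typescript
// Hypothetical metric fields; real field names may differ.
type StagehandMethod = "act" | "extract" | "observe";

interface LlmUsage {
  promptTokens: number;
  completionTokens: number;
  inferenceTimeMs: number;
}

function zeroUsage(): LlmUsage {
  return { promptTokens: 0, completionTokens: 0, inferenceTimeMs: 0 };
}

function addUsage(a: LlmUsage, b: LlmUsage): LlmUsage {
  return {
    promptTokens: a.promptTokens + b.promptTokens,
    completionTokens: a.completionTokens + b.completionTokens,
    inferenceTimeMs: a.inferenceTimeMs + b.inferenceTimeMs,
  };
}

class MetricsTracker {
  private readonly perMethod = new Map<StagehandMethod, LlmUsage>();

  // Add one LLM call's usage to the running totals for its method.
  record(method: StagehandMethod, usage: LlmUsage): void {
    const previous = this.perMethod.get(method) ?? zeroUsage();
    this.perMethod.set(method, addUsage(previous, usage));
  }

  forMethod(method: StagehandMethod): LlmUsage {
    return this.perMethod.get(method) ?? zeroUsage();
  }

  // Sum across all methods.
  total(): LlmUsage {
    let sum = zeroUsage();
    this.perMethod.forEach((usage) => {
      sum = addUsage(sum, usage);
    });
    return sum;
  }
}
```

Splitting totals by method makes it easier to see which kind of call dominates token spend.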
Key Design Principles
AI for Intelligence, Code for Reliability
The LLM identifies elements and plans actions, but all browser control uses deterministic CDP commands. This gives you the best of both worlds: adaptability from AI, reliability from code.
Cache-First Execution
Successful actions are cached and replayed without LLM calls. The cache self-heals when pages change, providing speed and reliability.
Unified API Surface
Whether you’re running locally or on Browserbase, the API stays the same. The V3 class abstracts away all environment differences.
Observable & Debuggable
Every action is logged, every metric is tracked, and session recordings are available. You always know what Stagehand is doing.
Next Steps
Write Effective AI Rules
Learn how to guide the AI with clear instructions
Understand Browser Contexts
Master pages, frames, and context management
Leverage Caching
Speed up execution with self-healing cache
See Examples
Explore real-world usage patterns