Overview
The agent() method creates an autonomous agent that can perform multi-step browser automation tasks. Agents can navigate websites, interact with elements, extract data, and make decisions to complete complex workflows.
Method Signature
Parameters
Optional configuration for the agent.
Agent Instance Methods
execute()
Executes the agent with a given instruction. The argument is either the task instruction (a string) or a full options object.
Return Value
Returns a Promise<AgentResult>.
Usage Examples
Basic Agent Task
With Custom Model
Streaming Mode
With Custom Output Schema
Conversation Continuation
With Variables
With Tool Exclusions
With Abort Signal
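How the signal is passed to execute() is not shown in this section, so the following is only a minimal sketch of the cooperative-cancellation pattern that the standard AbortController API enables. The function runUntilAborted and its onStep callback are hypothetical stand-ins for a long-running agent task, not part of the library.

```typescript
// Hypothetical stand-in for a multi-step agent run (not the library's implementation).
// It checks the signal before each step and stops as soon as the signal is aborted.
function runUntilAborted(
  totalSteps: number,
  signal: AbortSignal,
  onStep: (stepIndex: number) => void,
): number {
  let stepsDone = 0;
  for (let i = 0; i < totalSteps; i++) {
    if (signal.aborted) break; // cancellation was requested, stop cleanly
    onStep(i);
    stepsDone++;
  }
  return stepsDone;
}

// AbortController is the standard web/Node.js API for requesting cancellation.
const controller = new AbortController();

// The caller aborts while step index 3 is running; later steps never start.
const completedSteps = runUntilAborted(10, controller.signal, (i) => {
  if (i === 3) controller.abort();
});
```

In real use the abort would typically come from outside the task, for example from a timeout or a user action, rather than from inside a step.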
Hybrid Mode (Coordinate-Based)
CUA Mode (Computer Use Agent)
With Callbacks
Agent Modes
DOM Mode (Default)
Best for structured page interactions. Available tools:
- act - Semantic actions (click, type)
- fillForm - Fill form fields
- ariaTree - Get accessibility tree
- extract - Extract data
- goto - Navigate to URL
- scroll - Scroll with semantic directions
- keys - Press keyboard keys
- navback - Navigate back
- screenshot - Take screenshot
- think - Agent reasoning
- wait - Wait for time/condition
- done - Mark task complete
- search - Web search (requires BRAVE_API_KEY)
Hybrid Mode
Best for visual/screenshot-based interactions. Available tools:
- click - Click at coordinates
- type - Type at coordinates
- dragAndDrop - Drag between points
- clickAndHold - Click and hold
- fillFormVision - Fill forms using vision
- Plus all DOM mode tools
CUA Mode
Uses the provider’s native computer use capabilities. Supported models:
- openai/computer-use-preview
- anthropic/claude-sonnet-4-5-20250929
- google/gemini-2.5-computer-use-preview-10-2025
- And more (see documentation)
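The relationship between the modes can be summarized as data. The sketch below uses only the tool names listed in this section; the AgentMode type, the mode keys, and the toolsForMode helper are illustrative assumptions and are not part of the library. CUA mode is given an empty list because its actions come from the provider rather than from the tools named here.

```typescript
// Illustrative only: names of types and helpers here are assumptions, not library API.
type AgentMode = "dom" | "hybrid" | "cua";

// Tools listed for DOM mode in this section.
const DOM_TOOLS = [
  "act", "fillForm", "ariaTree", "extract", "goto", "scroll", "keys",
  "navback", "screenshot", "think", "wait", "done", "search",
] as const;

// Coordinate-based tools that hybrid mode adds on top of the DOM tools.
const HYBRID_ONLY_TOOLS = [
  "click", "type", "dragAndDrop", "clickAndHold", "fillFormVision",
] as const;

const TOOLS_BY_MODE: Record<AgentMode, readonly string[]> = {
  dom: DOM_TOOLS,
  hybrid: [...HYBRID_ONLY_TOOLS, ...DOM_TOOLS],
  cua: [], // provider-native computer use; no tools from the lists above
};

// Returns the tool names for a mode, minus any the caller wants to exclude.
function toolsForMode(mode: AgentMode, exclude: readonly string[] = []): string[] {
  return TOOLS_BY_MODE[mode].filter((tool) => exclude.indexOf(tool) === -1);
}

// Example: hybrid mode without web search.
const hybridWithoutSearch = toolsForMode("hybrid", ["search"]);
```

Keeping the tool sets as plain data makes it easy to see that hybrid mode is a superset of DOM mode, and what an exclusion list removes.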
Best Practices
- Clear instructions - Be specific about the goal
- Set appropriate maxSteps - Prevent runaway executions
- Use output schemas - Get structured data
- Handle errors gracefully
- Use variables for sensitive data
- Monitor with callbacks
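To illustrate why a step limit prevents runaway executions, here is a minimal sketch of a bounded loop. It is not how the library implements maxSteps; the Step type, BudgetResult, and runWithStepBudget are hypothetical names introduced for this example.

```typescript
// Hypothetical step function: returns true once the task is finished.
type Step = (stepIndex: number) => boolean;

interface BudgetResult {
  completed: boolean; // true if the task finished within the budget
  stepsUsed: number;  // how many steps actually ran
}

// Runs steps until one reports completion or the budget is exhausted.
function runWithStepBudget(step: Step, maxSteps: number): BudgetResult {
  for (let i = 0; i < maxSteps; i++) {
    if (step(i)) {
      return { completed: true, stepsUsed: i + 1 };
    }
  }
  return { completed: false, stepsUsed: maxSteps };
}

// A task that finishes on its third step stays well within a budget of 10.
const finishing = runWithStepBudget((i) => i === 2, 10);

// A task that never finishes is cut off after 5 steps instead of looping forever.
const runaway = runWithStepBudget(() => false, 5);
```

A budget that is too low cuts off legitimate work, so the limit is best chosen from the number of steps a typical successful run needs, plus some headroom.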
Error Handling
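One common way to handle failures is to wrap the call in try/catch and convert any thrown error into a value the caller can inspect. The sketch below assumes only that execute() returns a promise that may reject; the Outcome type and runSafely helper are hypothetical, and the real AgentResult fields are not shown in this section.

```typescript
// Hypothetical wrapper type: either a successful value or an error message.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Runs any promise-returning task and never rejects; failures become { ok: false }.
async function runSafely<T>(task: () => Promise<T>): Promise<Outcome<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (err) {
    // Thrown values are not guaranteed to be Error instances, so normalize them.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Intended use, with `agent` standing in for an agent instance:
//   const outcome = await runSafely(() => agent.execute("..."));
//   if (!outcome.ok) console.error("Agent failed:", outcome.error);
```

Returning a tagged result instead of rethrowing keeps the failure path explicit at the call site, which helps when several agent runs are chained.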
Performance Tips
- Use faster models for execution
- Exclude unnecessary tools
- Set reasonable maxSteps
- Use conversation continuation - Reuse context