Configuration
Configure browser behavior in agent.toml:
agent.toml
Run Chrome in headless mode (default: true)
Allow JavaScript evaluation via the evaluate action (default: false)
Path to Chrome/Chromium binary. Auto-detected if not set.
Browser Actions
The browser tool supports these actions:
launch
Start the browser. Must be called before any other action.
navigate
open
Open a new tab. Returns the new tab’s target ID.
tabs
List all open tabs. Returns tab metadata (target ID, title, URL, active state).
focus
Switch to a different tab.
close_tab
Close a tab. Omit target_id to close the active tab.
snapshot
Get an accessibility tree with element refs. Returns up to 200 interactive elements with refs like e1, e2, etc.
act
screenshot
Capture the page or an element. Saves to screenshot_dir with a timestamped filename.
evaluate
Run JavaScript (requires evaluate_enabled = true). Returns the script’s result as JSON.
content
Get the page’s HTML. Returns HTML, truncated to 100KB if needed.
close
Shut down the browser.
Element Interactions
Elements are addressed by refs (e1, e2, …) from the accessibility tree:
- click
- type
- press_key
- hover
- scroll_into_view
- focus
Accessibility Tree Snapshot
The snapshot action returns interactive elements:
Short identifier for use in act calls (e.g., e1, e2)
ARIA role: button, link, textbox, checkbox, etc.
Accessible name (usually visible text or aria-label)
Accessible description (aria-description or title)
Current value for inputs, sliders, etc.
src/tools/browser.rs
Workflow Example
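A typical session could follow the order described above: launch, navigate, snapshot, act on a discovered ref, screenshot, close. The sketch below only builds the sequence of tool calls as JSON payloads. The "action" key and the action names come from this page; the url, act_kind, ref, and text fields are hypothetical placeholders, not the tool's confirmed schema.

```python
import json


def build_workflow(url: str, ref: str, text: str) -> list[dict]:
    """Return the ordered browser tool calls for a simple type-into-field task.

    Field names other than "action" (url, act_kind, ref, text) are
    hypothetical placeholders for illustration only.
    """
    return [
        {"action": "launch"},                    # must come before any other action
        {"action": "navigate", "url": url},
        {"action": "snapshot"},                  # discover element refs first
        {"action": "act", "act_kind": "type", "ref": ref, "text": text},
        {"action": "screenshot"},
        {"action": "close"},                     # free resources at the end
    ]


# In a real session the ref would be read from the snapshot result,
# not hard-coded as it is here.
calls = build_workflow("https://example.com/login", "e3", "hello")
for call in calls:
    print(json.dumps(call))
```

In practice each call would be issued one at a time, with the ref passed to act taken from the most recent snapshot.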
Security
This prevents SSRF attacks. Only http and https schemes are allowed.
src/tools/browser.rs
JavaScript evaluation is off by default. Enable only for trusted tasks.
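The scheme restriction can be illustrated with a small check. This is a minimal sketch in Python rather than the code in src/tools/browser.rs, and is_allowed_url is a hypothetical helper name. It covers only the http/https rule stated above; any host or address filtering the real tool performs is not shown.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str) -> bool:
    """Return True only for http(s) URLs that include a host.

    Hypothetical helper: it mirrors the documented scheme rule,
    not the actual implementation.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises on malformed input such as an unclosed IPv6 bracket
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


for candidate in ("https://example.com", "file:///etc/passwd", "javascript:alert(1)"):
    print(candidate, is_allowed_url(candidate))
```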
Screenshots
Screenshots are saved to screenshot_dir with timestamped names:
- Viewport
- Full Page
- Element
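As an illustration of timestamped naming, the sketch below builds a path inside screenshot_dir. The exact filename pattern is not given on this page, so the screenshot_ prefix, the timestamp layout, and the .png extension are all assumptions, and screenshot_path is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(screenshot_dir: str, now: datetime) -> Path:
    """Build a timestamped file path under screenshot_dir.

    The prefix, timestamp layout, and extension are assumed for
    illustration; the real tool may name files differently.
    """
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(screenshot_dir) / f"screenshot_{stamp}.png"


path = screenshot_path("shots", datetime(2024, 1, 2, 3, 4, 5))
print(path.name)
```

Taking the clock value as a parameter keeps the helper deterministic, so the same input always yields the same name.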
Performance Notes
Browser state persists
The browser stays open across multiple tool calls within a worker. Launch once, reuse.
Snapshots are fast
Accessibility tree extraction takes ~100ms. Use liberally to understand the page.
Element refs expire on navigation
No sandbox escape
The browser runs in a separate process. Even if compromised, it’s isolated from Spacebot.
Debugging
Enable headed mode to watch the browser:
Best Practices
Snapshot Before Interact
Always take a snapshot to discover elements. Don’t guess element refs.
Use Names, Not Positions
Find elements by name/role, not by ref number. Refs change between snapshots.
Handle Missing Elements
Check snapshot results before acting. Elements may not exist (dynamic content, slow load).
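The last two practices can be combined in one lookup: search the snapshot by role and name, and treat a missing match as a normal outcome. The sketch below assumes each snapshot element is a dict with ref, role, and name keys; those key names and the find_ref helper are hypothetical, inferred from the field descriptions on this page.

```python
from typing import Optional


def find_ref(elements: list[dict], role: str, name: str) -> Optional[str]:
    """Return the ref of the first element matching role and name, else None.

    Name matching ignores case and surrounding whitespace. The
    "ref", "role", and "name" keys are assumed field names.
    """
    wanted = name.strip().lower()
    for element in elements:
        element_name = (element.get("name") or "").strip().lower()
        if element.get("role") == role and element_name == wanted:
            return element.get("ref")
    return None


# Example snapshot data, made up for illustration.
snapshot = [
    {"ref": "e1", "role": "textbox", "name": "Email"},
    {"ref": "e2", "role": "button", "name": "Sign in"},
]

ref = find_ref(snapshot, "button", "sign in")
if ref is None:
    # Element not present yet: take a fresh snapshot before acting.
    print("element not found")
else:
    print("use ref", ref)
```

Because refs change between snapshots, the lookup should be repeated against each new snapshot rather than caching the returned ref.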
Close When Done
Call {"action": "close"} at the end to free resources.