BrowserEnv
Unified browser automation environment supporting both DOM-based (natural language) and CUA-based (vision + coordinates) control.
Overview
BrowserEnv provides two distinct browser automation modes:
- DOM mode: Natural language operations via the Stagehand SDK (act, observe, extract, navigate)
- CUA mode: Vision-based primitives (click, scroll, type_text, screenshot)
Both modes integrate with Browserbase for cloud browser management and support local execution.
Installation
Install with browser support:
uv add 'verifiers[browser]'
Or when developing in the verifiers repo:
See Browser Examples for complete setup and usage.
Inheritance
Environment
└── MultiTurnEnv
    └── ToolEnv
        └── StatefulToolEnv
            └── BrowserEnv
Constructor
BrowserEnv(
    mode: Literal["dom", "cua"] = "dom",
    # Shared config
    project_id: str | None = None,
    browserbase_api_key_var: str = "BROWSERBASE_API_KEY",
    # DOM mode specific
    model_api_key_var: str = "MODEL_API_KEY",
    stagehand_model: str = "openai/gpt-4o-mini",
    proxy_model_to_stagehand: bool = False,
    # CUA mode specific
    use_sandbox: bool = True,
    server_url: str = "http://localhost:3000",
    env: Literal["LOCAL", "BROWSERBASE"] = "BROWSERBASE",
    viewport_width: int = 1024,
    viewport_height: int = 768,
    save_screenshots: bool = True,
    keep_recent_screenshots: int | None = 2,
    proxies: bool = False,
    advanced_stealth: bool = False,
    # CUA sandbox mode specific
    server_port: int = 3000,
    server_ready_timeout: int = 120,
    server_ready_poll_interval: float = 2.0,
    docker_image: str = "node:18-slim",
    cpu_cores: int = 2,
    memory_gb: int = 4,
    disk_size_gb: int = 10,
    sandbox_timeout_minutes: int = 60,
    sandbox_timeout_per_command_seconds: int = 60,
    use_binary: bool = True,
    # Pre-built image configuration
    use_prebuilt_image: bool = True,
    prebuilt_image: str = "deepdream19/cua-server:latest",
    # Error handling
    stop_errors: list[type[Exception]] | None = None,
    **kwargs
)
Parameters
Mode Selection
mode
Literal['dom', 'cua']
default:"dom"
Operating mode:
"dom": Natural language browser control via Stagehand SDK
"cua": Vision-based control using coordinate primitives
Shared Configuration
project_id
str | None
default:"None"
Browserbase project ID. Required when using Browserbase.
browserbase_api_key_var
str
default:"BROWSERBASE_API_KEY"
Environment variable name for Browserbase API key.
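Note that this parameter holds the name of an environment variable, not the key itself. A minimal sketch of such a lookup follows; the helper name resolve_api_key is hypothetical and is not part of verifiers, so treat it as an illustration of the pattern rather than the library's implementation.

```python
import os


def resolve_api_key(var_name: str = "BROWSERBASE_API_KEY") -> str:
    """Look up an API key by environment variable name (hypothetical helper)."""
    value = os.environ.get(var_name)
    if not value:
        # Fail early with a message that names the missing variable.
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return value
```

Keeping the key in the environment and passing only its name means the secret never appears in the environment's constructor arguments.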
DOM Mode Parameters
model_api_key_var
str
default:"MODEL_API_KEY"
Environment variable name for model API key (OpenAI, Anthropic, etc.).
stagehand_model
str
default:"openai/gpt-4o-mini"
Model used by Stagehand for DOM understanding and action planning.
proxy_model_to_stagehand
bool
default:"False"
Whether to proxy model API calls through Stagehand.
CUA Mode Parameters
use_sandbox
bool
default:"True"
Auto-deploy the CUA server to a sandbox. If False, connects to server_url.
server_url
str
default:"http://localhost:3000"
CUA server URL when use_sandbox=False.
env
Literal['LOCAL', 'BROWSERBASE']
default:"BROWSERBASE"
Browser execution environment:
"BROWSERBASE": Cloud browsers via Browserbase
"LOCAL": Local browser execution
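viewport_width
int
default:"1024"
Browser viewport width in pixels.
viewport_height
int
default:"768"
Browser viewport height in pixels.
save_screenshots
bool
default:"True"
Save screenshots to disk during execution.
keep_recent_screenshots
int | None
default:"2"
Number of recent screenshots to keep in message context. Set to None to keep all.
proxies
bool
default:"False"
Enable Browserbase proxies for IP rotation.
advanced_stealth
bool
default:"False"
Enable Browserbase Advanced Stealth mode to reduce anti-bot detection.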
CUA Sandbox Configuration
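server_port
int
default:"3000"
Port for the CUA server in the sandbox.
server_ready_timeout
int
default:"120"
Timeout in seconds to wait for the sandbox server to become ready.
server_ready_poll_interval
float
default:"2.0"
Poll interval in seconds for sandbox server health checks.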
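The two readiness parameters describe a poll-until-deadline loop. The sketch below shows that general pattern under stated assumptions: wait_until_ready and the is_ready callback are hypothetical names, and the actual health check used by the sandbox server is not described here.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout: float = 120,
    poll_interval: float = 2.0,
) -> bool:
    """Poll is_ready() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        # Give up if the next poll would land past the deadline.
        if time.monotonic() + poll_interval > deadline:
            return False
        time.sleep(poll_interval)
```

A shorter poll interval detects readiness sooner at the cost of more health-check requests; a longer timeout tolerates slower sandbox startup.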
docker_image
str
default:"node:18-slim"
Docker image for sandbox (only used when use_prebuilt_image=False).
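cpu_cores
int
default:"2"
CPU cores allocated to the sandbox.
memory_gb
int
default:"4"
Memory in GB allocated to the sandbox.
disk_size_gb
int
default:"10"
Disk size in GB for the sandbox.
sandbox_timeout_minutes
int
default:"60"
Sandbox timeout in minutes.
sandbox_timeout_per_command_seconds
int
default:"60"
Per-command timeout in seconds in the sandbox.
use_binary
bool
default:"True"
Use the pre-built SEA binary when use_prebuilt_image=False. If False, installs from npm.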
Pre-built Image Configuration
use_prebuilt_image
bool
default:"True"
Use the pre-built Docker image for fastest startup. Recommended for production.
prebuilt_image
str
default:"deepdream19/cua-server:latest"
Docker image to use when use_prebuilt_image=True.
Error Handling
stop_errors
list[type[Exception]] | None
default:"None"
Exception types that should trigger cleanup. Defaults to [vf.SandboxError].
navigate
def navigate(url: str) -> str
Navigate to a URL.
act
def act(instruction: str) -> str
Perform an action described in natural language (e.g., “click the login button”, “fill in the email field with an email address”).
observe
def observe(instruction: str) -> str
Find elements or information matching the instruction (e.g., “find all product cards”, “locate the search bar”).
extract
def extract(instruction: str, schema_json: str) -> str
Extract structured data from the page according to a JSON schema.
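Because schema_json is typed as str, the schema has to be serialized before it is passed in. The following is a minimal sketch of building such a string; the schema fields (name, price) are purely illustrative, and which JSON Schema features extract accepts is not specified here.

```python
import json

# Hypothetical schema: the field names are illustrative only.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}

# extract takes the schema as a string, so serialize the dict first.
schema_json = json.dumps(product_schema)
```

The resulting string would then be supplied as the schema_json argument alongside an instruction such as “extract the product name and price”.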
click
def click(x: int, y: int) -> str
Click at coordinates (x, y).
type_text
def type_text(text: str) -> str
Type text at the current cursor position.
scroll
def scroll(direction: str, amount: int = 500) -> str
Scroll the page. Direction can be “up” or “down”.
screenshot
Capture a screenshot of the current page. Returns path to screenshot file.
Key Methods
setup_state
async def setup_state(
    state: vf.State,
    **kwargs
) -> vf.State
Initialize browser session for this rollout. Delegates to mode-specific implementation (DOM or CUA).
update_tool_args
def update_tool_args(
    tool_name: str,
    tool_args: dict[str, Any],
    messages: vf.Messages,
    state: vf.State,
    **kwargs
) -> dict[str, Any]
Inject session state into tool calls. Delegates to mode-specific implementation.
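As a rough illustration of what injecting session state can look like, the sketch below merges a per-rollout value into the arguments the model supplied. The function name inject_session and the session_id key are assumptions made for this example; the keys BrowserEnv actually stores in state are not documented here.

```python
from typing import Any


def inject_session(tool_args: dict[str, Any], state: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of tool_args with the rollout's session id added.

    Hypothetical helper: "session_id" is an assumed key name.
    """
    # Build a new dict so the model-provided arguments are not mutated.
    return {**tool_args, "session_id": state["session_id"]}
```

The point of this pattern is that the model only supplies user-facing arguments (such as a URL), while session handles stay hidden from the tool schema.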
get_prompt_messages
async def get_prompt_messages(
    state: vf.State
) -> vf.Messages
Get prompt messages. In CUA mode, filters screenshots to keep only recent ones based on keep_recent_screenshots.
cleanup_session
@vf.cleanup
async def cleanup_session(state: vf.State) -> None
Clean up browser session after rollout.
teardown
@vf.teardown
async def teardown() -> None
Clean up environment resources (e.g., sandbox servers in CUA mode).
Example Usage
DOM Mode
import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
    max_turns: int = 10,
):
    dataset = Dataset.from_dict({
        "question": ["What is the headline on primeintellect.ai?"],
        "answer": ["The Open Superintelligence Stack"],
        "start_url": ["https://primeintellect.ai"],
    })

    def check_answer(completion: vf.Messages, answer: str) -> float:
        # Case-insensitive substring match against the full completion
        text = str(completion).lower()
        return 1.0 if answer.lower() in text else 0.0

    return BrowserEnv(
        mode="dom",
        project_id=project_id,
        dataset=dataset,
        rubric=vf.Rubric(check_answer),
        max_turns=max_turns,
        system_prompt="Use browser tools to find information on websites.",
    )
CUA Mode with Sandbox (Default)
import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
    max_turns: int = 15,
):
    dataset = Dataset.from_dict({
        "question": ["Click the 'Get Started' button"],
        "start_url": ["https://example.com"],
    })
    return BrowserEnv(
        mode="cua",
        project_id=project_id,
        dataset=dataset,
        rubric=vf.Rubric(lambda completion: 1.0),
        max_turns=max_turns,
        # CUA sandbox is automatic by default
        use_sandbox=True,
        use_prebuilt_image=True,  # Fastest startup
        system_prompt="Use vision and coordinates to control the browser.",
    )
CUA Mode with Local Server
import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
):
    dataset = Dataset.from_dict({
        "question": ["Find and click the search button"],
        "start_url": ["https://example.com"],
    })
    return BrowserEnv(
        mode="cua",
        project_id=project_id,
        use_sandbox=False,  # Use local server
        server_url="http://localhost:3000",
        dataset=dataset,
        rubric=vf.Rubric(lambda completion: 1.0),
        system_prompt="Control browser using screenshots and coordinates.",
    )
CUA Mode Execution Options
CUA mode supports three execution strategies (from fastest to most flexible):
1. Pre-built Docker Image (Default, Recommended)
env = BrowserEnv(
    mode="cua",
    use_prebuilt_image=True,  # Default
    prebuilt_image="deepdream19/cua-server:latest",
)
- Fastest startup (no binary upload or npm install)
- Uses the pre-built deepdream19/cua-server:latest image
- Best for production and rapid iteration
2. Binary Upload
env = BrowserEnv(
    mode="cua",
    use_prebuilt_image=False,
    use_binary=True,  # Default when use_prebuilt_image=False
)
- Builds/uploads SEA binary to sandbox
- Useful for custom server versions
- Slower startup than pre-built image
3. Local Server
env = BrowserEnv(
    mode="cua",
    use_sandbox=False,
    server_url="http://localhost:3000",
)
- Connect to manually started CUA server
- Useful for local development and debugging
- Requires running npm start in assets/templates/browserbase/cua/
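How the flags combine can be summarized in a small decision function. This is a sketch inferred from the parameter descriptions on this page, not the library's code; the function name and the returned labels are hypothetical. It also covers the npm fallback mentioned under use_binary.

```python
def select_cua_strategy(
    use_sandbox: bool = True,
    use_prebuilt_image: bool = True,
    use_binary: bool = True,
) -> str:
    """Map the CUA flags to an execution strategy label (hypothetical labels)."""
    if not use_sandbox:
        return "local-server"    # connect to server_url
    if use_prebuilt_image:
        return "prebuilt-image"  # start from prebuilt_image
    if use_binary:
        return "binary-upload"   # upload the SEA binary into docker_image
    return "npm-install"         # install the server from npm in docker_image
```

Under this reading, use_binary and docker_image only matter once use_prebuilt_image is False, and none of the sandbox settings apply when use_sandbox is False.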
Screenshot Management (CUA Mode)
CUA mode automatically manages screenshots in the message history:
- save_screenshots=True: Screenshots are saved to disk
- keep_recent_screenshots=2: Only the 2 most recent screenshots are kept in context
- Older screenshots are filtered out via get_prompt_messages() to reduce token usage
env = BrowserEnv(
    mode="cua",
    save_screenshots=True,
    keep_recent_screenshots=3,  # Keep last 3 screenshots
)