BrowserEnv

Unified browser automation environment supporting both DOM-based (natural language) and CUA-based (vision + coordinates) control.

Overview

BrowserEnv provides two distinct browser automation modes:

DOM mode: Natural language operations via Stagehand SDK (act, observe, extract, navigate)
CUA mode: Vision-based primitives (click, scroll, type_text, screenshot)

Both modes integrate with Browserbase for cloud browser management and support local execution.

Installation

Install with browser support:

uv add 'verifiers[browser]'

Or when developing in the verifiers repo:

uv sync --extra browser

See Browser Examples for complete setup and usage.

Inheritance

Environment
└── MultiTurnEnv
    └── ToolEnv
        └── StatefulToolEnv
            └── BrowserEnv

Constructor

BrowserEnv(
    mode: Literal["dom", "cua"] = "dom",
    # Shared config
    project_id: str | None = None,
    browserbase_api_key_var: str = "BROWSERBASE_API_KEY",
    # DOM mode specific
    model_api_key_var: str = "MODEL_API_KEY",
    stagehand_model: str = "openai/gpt-4o-mini",
    proxy_model_to_stagehand: bool = False,
    # CUA mode specific
    use_sandbox: bool = True,
    server_url: str = "http://localhost:3000",
    env: Literal["LOCAL", "BROWSERBASE"] = "BROWSERBASE",
    viewport_width: int = 1024,
    viewport_height: int = 768,
    save_screenshots: bool = True,
    keep_recent_screenshots: int | None = 2,
    proxies: bool = False,
    advanced_stealth: bool = False,
    # CUA sandbox mode specific
    server_port: int = 3000,
    server_ready_timeout: int = 120,
    server_ready_poll_interval: float = 2.0,
    docker_image: str = "node:18-slim",
    cpu_cores: int = 2,
    memory_gb: int = 4,
    disk_size_gb: int = 10,
    sandbox_timeout_minutes: int = 60,
    sandbox_timeout_per_command_seconds: int = 60,
    use_binary: bool = True,
    # Pre-built image configuration
    use_prebuilt_image: bool = True,
    prebuilt_image: str = "deepdream19/cua-server:latest",
    # Error handling
    stop_errors: list[type[Exception]] | None = None,
    **kwargs
)

Parameters

Mode Selection

mode

Literal['dom', 'cua']

default:"dom"

Operating mode:

"dom": Natural language browser control via Stagehand SDK
"cua": Vision-based control using coordinate primitives

Shared Configuration

project_id

str | None

default:"None"

Browserbase project ID. Required when using Browserbase.

browserbase_api_key_var

str

default:"BROWSERBASE_API_KEY"

Environment variable name for Browserbase API key.

DOM Mode Parameters

model_api_key_var

str

default:"MODEL_API_KEY"

Environment variable name for model API key (OpenAI, Anthropic, etc.).

stagehand_model

str

default:"openai/gpt-4o-mini"

Model used by Stagehand for DOM understanding and action planning.

proxy_model_to_stagehand

bool

default:"False"

Whether to proxy model API calls through Stagehand.

CUA Mode Parameters

use_sandbox

bool

default:"True"

Auto-deploy CUA server to sandbox. If False, connects to server_url.

server_url

str

default:"http://localhost:3000"

CUA server URL when use_sandbox=False.

env

Literal['LOCAL', 'BROWSERBASE']

default:"BROWSERBASE"

Browser execution environment:

"BROWSERBASE": Cloud browsers via Browserbase
"LOCAL": Local browser execution

viewport_width

int

default:"1024"

Browser viewport width in pixels.

viewport_height

int

default:"768"

Browser viewport height in pixels.

save_screenshots

bool

default:"True"

Save screenshots to disk during execution.

keep_recent_screenshots

int | None

default:"2"

Number of recent screenshots to keep in message context. Set to None to keep all.

proxies

bool

default:"False"

Enable Browserbase proxies for IP rotation.

advanced_stealth

bool

default:"False"

Enable Browserbase Advanced Stealth mode for anti-bot detection.

CUA Sandbox Configuration

server_port

int

default:"3000"

Port for CUA server in sandbox.

server_ready_timeout

int

default:"120"

Timeout in seconds waiting for sandbox server to be ready.

server_ready_poll_interval

float

default:"2.0"

Poll interval in seconds for sandbox server health checks.

docker_image

str

default:"node:18-slim"

Docker image for sandbox (only used when use_prebuilt_image=False).

cpu_cores

int

default:"2"

CPU cores allocated to sandbox.

memory_gb

int

default:"4"

Memory in GB allocated to sandbox.

disk_size_gb

int

default:"10"

Disk size in GB for sandbox.

sandbox_timeout_minutes

int

default:"60"

Sandbox timeout in minutes.

sandbox_timeout_per_command_seconds

int

default:"60"

Per-command timeout in sandbox.

use_binary

bool

default:"True"

Use pre-built SEA binary when use_prebuilt_image=False. If False, installs from npm.

Pre-built Image Configuration

use_prebuilt_image

bool

default:"True"

Use pre-built Docker image for fastest startup. Recommended for production.

prebuilt_image

str

default:"deepdream19/cua-server:latest"

Docker image to use when use_prebuilt_image=True.

Error Handling

stop_errors

list[type[Exception]] | None

default:"None"

Exception types that should trigger cleanup. Defaults to [vf.SandboxError].

**kwargs

Any

Additional arguments passed to StatefulToolEnv.

DOM Mode Tools

navigate

def navigate(url: str) -> str

Navigate to a URL.

act

def act(instruction: str) -> str

Perform an action described in natural language (e.g., “click the login button”, “fill in the email field with [email protected]”).

observe

def observe(instruction: str) -> str

Find elements or information matching the instruction (e.g., “find all product cards”, “locate the search bar”).

extract

def extract(instruction: str, schema_json: str) -> str

Extract structured data from the page according to a JSON schema.

CUA Mode Tools

click

def click(x: int, y: int) -> str

Click at coordinates (x, y).

type_text

def type_text(text: str) -> str

Type text at the current cursor position.

scroll

def scroll(direction: str, amount: int = 500) -> str

Scroll the page. Direction can be “up” or “down”.

screenshot

def screenshot() -> str

Capture a screenshot of the current page. Returns path to screenshot file.

Key Methods

setup_state

async def setup_state(
    state: vf.State,
    **kwargs
) -> vf.State

Initialize browser session for this rollout. Delegates to mode-specific implementation (DOM or CUA).

update_tool_args

def update_tool_args(
    tool_name: str,
    tool_args: dict[str, Any],
    messages: vf.Messages,
    state: vf.State,
    **kwargs
) -> dict[str, Any]

Inject session state into tool calls. Delegates to mode-specific implementation.

get_prompt_messages

async def get_prompt_messages(
    state: vf.State
) -> vf.Messages

Get prompt messages. In CUA mode, filters screenshots to keep only recent ones based on keep_recent_screenshots.

cleanup_session

@vf.cleanup
async def cleanup_session(state: vf.State) -> None

Clean up browser session after rollout.

teardown

@vf.teardown
async def teardown() -> None

Clean up environment resources (e.g., sandbox servers in CUA mode).

Example Usage

DOM Mode

import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
    max_turns: int = 10,
):
    dataset = Dataset.from_dict({
        "question": ["What is the headline on primeintellect.ai?"],
        "answer": ["The Open Superintelligence Stack"],
        "start_url": ["https://primeintellect.ai"],
    })
    
    def check_answer(completion: vf.Messages, answer: str) -> float:
        text = str(completion).lower()
        return 1.0 if answer.lower() in text else 0.0
    
    return BrowserEnv(
        mode="dom",
        project_id=project_id,
        dataset=dataset,
        rubric=vf.Rubric(check_answer),
        max_turns=max_turns,
        system_prompt="Use browser tools to find information on websites.",
    )

CUA Mode with Sandbox (Default)

import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
    max_turns: int = 15,
):
    dataset = Dataset.from_dict({
        "question": ["Click the 'Get Started' button"],
        "start_url": ["https://example.com"],
    })
    
    return BrowserEnv(
        mode="cua",
        project_id=project_id,
        dataset=dataset,
        rubric=vf.Rubric(lambda completion: 1.0),
        max_turns=max_turns,
        # CUA sandbox is automatic by default
        use_sandbox=True,
        use_prebuilt_image=True,  # Fastest startup
        system_prompt="Use vision and coordinates to control the browser.",
    )

CUA Mode with Local Server

import verifiers as vf
from verifiers.envs.integrations.browser_env import BrowserEnv
from datasets import Dataset

def load_environment(
    project_id: str,
):
    dataset = Dataset.from_dict({
        "question": ["Find and click the search button"],
        "start_url": ["https://example.com"],
    })
    
    return BrowserEnv(
        mode="cua",
        project_id=project_id,
        use_sandbox=False,  # Use local server
        server_url="http://localhost:3000",
        dataset=dataset,
        rubric=vf.Rubric(lambda completion: 1.0),
        system_prompt="Control browser using screenshots and coordinates.",
    )

CUA Mode Execution Options

CUA mode supports three execution strategies (from fastest to most flexible):

1. Pre-built Docker Image (Default, Recommended)

env = BrowserEnv(
    mode="cua",
    use_prebuilt_image=True,  # Default
    prebuilt_image="deepdream19/cua-server:latest",
)

Fastest startup (no binary upload or npm install)
Uses pre-built deepdream19/cua-server:latest image
Best for production and rapid iteration

2. Binary Upload

env = BrowserEnv(
    mode="cua",
    use_prebuilt_image=False,
    use_binary=True,  # Default when use_prebuilt_image=False
)

Builds/uploads SEA binary to sandbox
Useful for custom server versions
Slower startup than pre-built image

3. Local Server

env = BrowserEnv(
    mode="cua",
    use_sandbox=False,
    server_url="http://localhost:3000",
)

Connect to manually started CUA server
Useful for local development and debugging
Requires running npm start in assets/templates/browserbase/cua/

Screenshot Management (CUA Mode)

CUA mode automatically manages screenshots in the message history:

save_screenshots=True: Screenshots saved to disk
keep_recent_screenshots=2: Only 2 most recent screenshots kept in context
Older screenshots filtered out via get_prompt_messages() to reduce token usage

env = BrowserEnv(
    mode="cua",
    save_screenshots=True,
    keep_recent_screenshots=3,  # Keep last 3 screenshots
)

Environment Classes

Rubrics & Parsers

Clients

Integration Classes

Experimental

Data Types

Utilities

​BrowserEnv

​Overview

​Installation

​Inheritance

​Constructor

​Parameters

​Mode Selection

​Shared Configuration

​DOM Mode Parameters

​CUA Mode Parameters

​CUA Sandbox Configuration

​Pre-built Image Configuration

​Error Handling

​DOM Mode Tools

​navigate

​act

​observe

​extract

​CUA Mode Tools

​click

​type_text

​scroll

​screenshot

​Key Methods

​setup_state

​update_tool_args

​get_prompt_messages

​cleanup_session

​teardown

​Example Usage

​DOM Mode

​CUA Mode with Sandbox (Default)

​CUA Mode with Local Server

​CUA Mode Execution Options

​1. Pre-built Docker Image (Default, Recommended)

​2. Binary Upload

​3. Local Server

​Screenshot Management (CUA Mode)

​See Also

Build docs developers (and LLMs) love

BrowserEnv

Overview

Installation

Inheritance

Constructor

Parameters

Mode Selection

Shared Configuration

DOM Mode Parameters

CUA Mode Parameters

CUA Sandbox Configuration

Pre-built Image Configuration

Error Handling

DOM Mode Tools

navigate

act

observe

extract

CUA Mode Tools

click

type_text

scroll

screenshot

Key Methods

setup_state

update_tool_args

get_prompt_messages

cleanup_session

teardown

Example Usage

DOM Mode

CUA Mode with Sandbox (Default)

CUA Mode with Local Server

CUA Mode Execution Options

1. Pre-built Docker Image (Default, Recommended)

2. Binary Upload

3. Local Server

Screenshot Management (CUA Mode)

See Also