Computer Use Example

Overview

Stagehand supports Computer Use Agents (CUA) that can interact with web pages using computer use APIs from providers like Anthropic Claude and Google Gemini. This enables more sophisticated automation that can understand and interact with complex UIs.

You must configure browser dimensions to use Computer Use. Check out stagehand.config.ts for configuration details.

Basic CUA Example with Google Gemini

Here’s a complete example using Google’s Gemini model:

import { Stagehand } from "@stagehand/core";
import chalk from "chalk";

async function main() {
  console.log(
    `\n${chalk.bold("Stagehand 🤘 Computer Use Agent (CUA) Demo")}\n`,
  );

  // Initialize Stagehand
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate to the Browserbase careers page first, so that page.url()
    // in the system prompt below reports the page the agent starts on
    await page.goto("https://www.browserbase.com/careers");

    // Create a computer use agent
    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "google/gemini-3-flash-preview",
        apiKey: process.env.GEMINI_API_KEY ?? process.env.GOOGLE_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    // Define the instruction for the CUA
    const instruction =
      "Apply for the first engineer position with mock data. Don't submit the form. You're on the right page";
    console.log(`Instruction: ${chalk.white(instruction)}`);

    // Execute the instruction
    const result = await agent.execute({
      instruction,
      maxSteps: 20,
    });
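    // Pause for 30 seconds (for example, to inspect the final page state)
    // before logging the result and closing the browser
    await new Promise((resolve) => setTimeout(resolve, 30000));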

    console.log(`${chalk.green("✓")} Execution complete`);
    console.log(`${chalk.yellow("⤷")} Result:`);
    console.log(chalk.white(JSON.stringify(result, null, 2)));
  } catch (error) {
    console.log(`${chalk.red("✗")} Error: ${error}`);
    if (error instanceof Error && error.stack) {
      console.log(chalk.dim(error.stack.split("\n").slice(1).join("\n")));
    }
  } finally {
    // Close the browser
    await stagehand.close();
  }
}

main().catch((error) => {
  console.log(`${chalk.red("✗")} Unhandled error in main function`);
  console.log(chalk.red(error));
});

CUA with Anthropic Claude

You can also use Anthropic’s Claude models with computer use:

import { Stagehand } from "@stagehand/core";

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
  });
  await stagehand.init();

  const page = stagehand.context.pages()[0];

  try {
    // Navigate first so that page.url() in the system prompt is accurate
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    const result = await agent.execute({
      instruction: "Search for 'Stagehand browser automation' and click the first result",
      maxSteps: 20,
    });

    console.log("Result:", result);
  } finally {
    // Close the browser even if navigation or the agent fails
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});

CUA with Custom Tools

You can combine Computer Use with custom tools:

import { z } from "zod";
import { tool } from "ai";
import { Stagehand } from "@stagehand/core";

const getWeather = tool({
  description: "Get the current weather in a location",
  inputSchema: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    // Your custom logic here
    return {
      location,
      temperature: 70,
      conditions: "sunny",
    };
  },
});

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
    model: "anthropic/claude-sonnet-4-5",
  });
  await stagehand.init();

  const page = stagehand.context.pages()[0];

  try {
    // Navigate first so that page.url() in the system prompt is accurate
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
      // Custom tools the agent may call alongside its browser actions
      tools: {
        getWeather,
      },
    });

    const result = await agent.execute({
      instruction: "What's the weather in San Francisco? Then search for outdoor activities there.",
      maxSteps: 20,
    });

    console.log("Result:", result);
  } finally {
    // Close the browser even if navigation or the agent fails
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});

Agent Configuration

mode: "cua"

Set the agent mode to "cua" to enable Computer Use:

const agent = stagehand.agent({
  mode: "cua", // Enable Computer Use
  // ... other options
});

Supported Models

CUA works with these models:

Anthropic: anthropic/claude-sonnet-4-5-20250929, anthropic/claude-sonnet-4-5
Google: google/gemini-3-flash-preview, google/gemini-2.0-flash-exp
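
The model names above all follow a provider/model pattern, so the matching API key can be chosen from the prefix. The following is a minimal sketch of that idea, not part of Stagehand: the helper name apiKeyFor and the EnvMap type are hypothetical, and the environment variable names are the ones used in the examples on this page.

```typescript
// Hypothetical helper (not a Stagehand API): pick the API key for a
// "provider/model" name from a map of environment variables.
type EnvMap = Record<string, string | undefined>;

function apiKeyFor(modelName: string, env: EnvMap): string {
  // "anthropic/claude-sonnet-4-5" -> "anthropic"
  const provider = modelName.split("/")[0];
  let key: string | undefined;
  switch (provider) {
    case "anthropic":
      key = env["ANTHROPIC_API_KEY"];
      break;
    case "google":
      // Same fallback order as the Gemini example above
      key = env["GEMINI_API_KEY"] ?? env["GOOGLE_API_KEY"];
      break;
    default:
      throw new Error(`Unsupported CUA provider: ${provider}`);
  }
  if (!key) {
    throw new Error(`Missing API key for provider: ${provider}`);
  }
  return key;
}
```

In a script this could be called as apiKeyFor(modelName, process.env) when building the model option, so a missing key fails early instead of partway through an agent run.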

System Prompt

Provide clear instructions to guide the agent’s behavior:

systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement.
Today's date is ${new Date().toLocaleDateString()}.`
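
Because the prompt is a template literal, page.url() and the date are captured once, at the moment the string is built. One way to make that explicit is to build the prompt in a small function and call it after navigation. This is only a sketch: buildSystemPrompt is a hypothetical helper, not a Stagehand API.

```typescript
// Hypothetical helper: builds the system prompt shown above from the
// current URL and a date (defaults to now).
function buildSystemPrompt(currentUrl: string, today: Date = new Date()): string {
  return [
    "You are a helpful assistant that can use a web browser.",
    `You are currently on the following page: ${currentUrl}.`,
    "Do not ask follow up questions, the user will trust your judgement.",
    `Today's date is ${today.toLocaleDateString()}.`,
  ].join("\n");
}
```

Passing buildSystemPrompt(page.url()) as systemPrompt after page.goto(...) has resolved keeps the URL in the prompt in line with the page the agent actually starts on.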

Max Steps

Control how many actions the agent can take:

const result = await agent.execute({
  instruction: "Your instruction here",
  maxSteps: 20, // Limit to 20 actions
});
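
After a run it can be useful to tell a finished task apart from one that ran out of steps. The sketch below assumes the result object exposes success, completed, message, and actions fields; that shape is an assumption for illustration, so check the result type exported by your Stagehand version. It also treats the number of recorded actions as a rough proxy for steps taken. summarizeRun and RunResultLike are hypothetical names.

```typescript
// Assumed result shape for illustration only; verify against the result
// type of your Stagehand version before relying on these fields.
interface RunResultLike {
  success: boolean;
  completed: boolean;
  message: string;
  actions: unknown[];
}

// Hypothetical helper: one-line summary of how a run ended.
function summarizeRun(result: RunResultLike, maxSteps: number): string {
  if (result.completed && result.success) {
    return `done in ${result.actions.length} action(s): ${result.message}`;
  }
  // Rough heuristic: an unfinished run with at least maxSteps actions
  // most likely hit the step limit.
  if (!result.completed && result.actions.length >= maxSteps) {
    return `stopped at the ${maxSteps}-step limit; consider raising maxSteps or narrowing the instruction`;
  }
  return `did not succeed: ${result.message}`;
}
```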

Key Concepts

Computer Use vs Standard Mode

Standard mode: Uses Stagehand’s action primitives (act, extract, observe)
CUA mode: Uses computer use APIs to directly control the browser

When to Use CUA

Use Computer Use for:

Complex UI interactions
Tasks requiring visual understanding
Multi-step workflows that benefit from visual context
Scenarios where standard actions are insufficient

Browser Configuration

CUA requires specific browser dimensions. Configure them in stagehand.config.ts:

export default {
  browserOptions: {
    width: 1920,
    height: 1080,
  },
};

Error Handling

Always include proper error handling for CUA:

try {
  const result = await agent.execute({
    instruction,
    maxSteps: 20,
  });
  console.log("Success:", result);
} catch (error) {
  console.error("Error:", error);
  if (error instanceof Error && error.stack) {
    console.log(error.stack);
  }
} finally {
  await stagehand.close();
}
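
maxSteps bounds the number of actions but not wall-clock time. A generic timeout wrapper can cover that gap. The following is a sketch in plain TypeScript, not a Stagehand feature: withTimeout is a hypothetical helper, and it only stops waiting for the promise; it does not cancel the underlying agent run.

```typescript
// Hypothetical helper: rejects if `task` has not settled within `ms`
// milliseconds. The task itself keeps running; only the wait is abandoned.
async function withTimeout<T>(
  task: Promise<T>,
  ms: number,
  label = "agent.execute",
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  try {
    // Whichever promise settles first decides the outcome
    return await Promise.race([task, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive
    clearTimeout(timer);
  }
}
```

Used inside the try block above, the call would look like await withTimeout(agent.execute({ instruction, maxSteps: 20 }), 120000), with the existing catch and finally handling the timeout error and closing the browser.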

Best Practices

Set maxSteps - Prevent infinite loops by limiting agent steps
Clear instructions - Provide specific, actionable instructions
System prompts - Guide agent behavior with detailed system prompts
Error handling - Always handle errors and close browser
Monitor execution - Use verbose logging to debug agent behavior
Configure browser - Ensure correct browser dimensions for CUA

Environment Setup

Local Environment

const stagehand = new Stagehand({
  env: "LOCAL",
  verbose: 2,
});

Browserbase Environment

const stagehand = new Stagehand({
  env: "BROWSERBASE",
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
});
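
To switch between the two environments without editing code, the constructor options can be derived from the environment variables. This is a minimal sketch under the assumption that only the options shown above are needed; stagehandOptionsFrom, EnvVars, and StagehandEnvOptions are hypothetical names, not part of Stagehand.

```typescript
// Hypothetical helper: choose between the two option sets shown above
// based on which credentials are present.
type EnvVars = Record<string, string | undefined>;

type StagehandEnvOptions =
  | { env: "LOCAL"; verbose: 2 }
  | { env: "BROWSERBASE"; apiKey: string; projectId: string };

function stagehandOptionsFrom(env: EnvVars): StagehandEnvOptions {
  const apiKey = env["BROWSERBASE_API_KEY"];
  const projectId = env["BROWSERBASE_PROJECT_ID"];
  // Use Browserbase only when both credentials are set
  if (apiKey && projectId) {
    return { env: "BROWSERBASE", apiKey, projectId };
  }
  // Otherwise fall back to a local browser with verbose logging
  return { env: "LOCAL", verbose: 2 };
}
```

The result could then be passed to the constructor, for example new Stagehand(stagehandOptionsFrom(process.env)).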

Next Steps

Learn about agent with custom tools to extend capabilities
See streaming agent for real-time feedback
Explore MCP integration for external services
