
Overview

Stagehand supports Computer Use Agents (CUA) that can interact with web pages using computer use APIs from providers like Anthropic Claude and Google Gemini. This enables more sophisticated automation that can understand and interact with complex UIs.
Computer Use requires explicit browser dimensions. See the Browser Configuration section for the stagehand.config.ts settings.

Basic CUA Example with Google Gemini

Here’s a complete example using Google’s Gemini model:
import { Stagehand } from "@stagehand/core";
import chalk from "chalk";

async function main() {
  console.log(
    `\n${chalk.bold("Stagehand 🤘 Computer Use Agent (CUA) Demo")}\n`,
  );

  // Initialize Stagehand
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate first: the system prompt below is built once, when the agent
    // is created, so page.url() must already point at the careers page
    await page.goto("https://www.browserbase.com/careers");

    // Create a computer use agent
    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "google/gemini-3-flash-preview",
        apiKey: process.env.GEMINI_API_KEY ?? process.env.GOOGLE_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    // Define the instruction for the CUA
    const instruction =
      "Apply for the first engineer position with mock data. Don't submit the form. You're on the right page";
    console.log(`Instruction: ${chalk.white(instruction)}`);

    // Execute the instruction
    const result = await agent.execute({
      instruction,
      maxSteps: 20,
    });
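    // Pause for 30 seconds before the browser closes, presumably so the
    // filled-in form can be inspected
    await new Promise((resolve) => setTimeout(resolve, 30000));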

    console.log(`${chalk.green("✓")} Execution complete`);
    console.log(`${chalk.yellow("⤷")} Result:`);
    console.log(chalk.white(JSON.stringify(result, null, 2)));
  } catch (error) {
    console.log(`${chalk.red("✗")} Error: ${error}`);
    if (error instanceof Error && error.stack) {
      console.log(chalk.dim(error.stack.split("\n").slice(1).join("\n")));
    }
  } finally {
    // Close the browser
    await stagehand.close();
  }
}

main().catch((error) => {
  console.log(`${chalk.red("✗")} Unhandled error in main function`);
  console.log(chalk.red(error));
});

CUA with Anthropic Claude

You can also use Anthropic’s Claude models with computer use:
import { Stagehand } from "@stagehand/core";

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate first so page.url() in the system prompt is the real page
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    const result = await agent.execute({
      instruction: "Search for 'Stagehand browser automation' and click the first result",
      maxSteps: 20,
    });

    console.log("Result:", result);
  } finally {
    // Close the browser even if the agent throws
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});

CUA with Custom Tools

You can combine Computer Use with custom tools:
import { z } from "zod";
import { tool } from "ai";
import { Stagehand } from "@stagehand/core";

const getWeather = tool({
  description: "Get the current weather in a location",
  inputSchema: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    // Your custom logic here
    return {
      location,
      temperature: 70,
      conditions: "sunny",
    };
  },
});

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
    model: "anthropic/claude-sonnet-4-5",
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate first so page.url() in the system prompt is the real page
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
      You are currently on the following page: ${page.url()}.
      Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
      // Custom tools the agent can call alongside its browser actions
      tools: {
        getWeather,
      },
    });

    const result = await agent.execute({
      instruction: "What's the weather in San Francisco? Then search for outdoor activities there.",
      maxSteps: 20,
    });

    console.log("Result:", result);
  } finally {
    // Close the browser even if the agent throws
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});

Agent Configuration

mode: "cua"

Set the agent mode to "cua" to enable Computer Use:
const agent = stagehand.agent({
  mode: "cua", // Enable Computer Use
  // ... other options
});

Supported Models

CUA works with these models:
  • Anthropic: anthropic/claude-sonnet-4-5-20250929, anthropic/claude-sonnet-4-5
  • Google: google/gemini-3-flash-preview, google/gemini-2.0-flash-exp
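The model names above follow a provider/model pattern, and the examples in this guide read the API key from a provider-specific environment variable. As a rough sketch, that lookup could be centralized as below. parseModelName and resolveApiKey are hypothetical helpers written for this guide, not Stagehand APIs, and the provider list only mirrors the two providers named here.

```typescript
// Hypothetical helpers (not part of Stagehand): split "provider/model" and
// find the API key using the environment variable names from this guide.
type Provider = "anthropic" | "google";

interface ParsedModel {
  provider: Provider;
  model: string;
}

// Variable names are taken from the examples in this guide, in lookup order.
const API_KEY_ENV_VARS: Record<Provider, string[]> = {
  anthropic: ["ANTHROPIC_API_KEY"],
  google: ["GEMINI_API_KEY", "GOOGLE_API_KEY"],
};

function isProvider(value: string): value is Provider {
  return value === "anthropic" || value === "google";
}

function parseModelName(modelName: string): ParsedModel {
  const slash = modelName.indexOf("/");
  if (slash <= 0 || slash === modelName.length - 1) {
    throw new Error(`Expected "provider/model", got "${modelName}"`);
  }
  const provider = modelName.slice(0, slash);
  if (!isProvider(provider)) {
    throw new Error(`Provider "${provider}" is not listed in this guide`);
  }
  return { provider, model: modelName.slice(slash + 1) };
}

// Returns the first non-empty key for the provider, or undefined if none is set.
// Note: unlike the `??` in the Gemini example, an empty string counts as missing.
function resolveApiKey(
  provider: Provider,
  env: Record<string, string | undefined>,
): string | undefined {
  for (const name of API_KEY_ENV_VARS[provider]) {
    const value = env[name];
    if (value) return value;
  }
  return undefined;
}
```

In a Node script you would pass process.env as the second argument and fail early when the result is undefined, rather than letting the first agent call fail with an authentication error.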

System Prompt

Provide clear instructions to guide the agent’s behavior:
systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement.
Today's date is ${new Date().toLocaleDateString()}.`
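A template literal like this one is evaluated once, at the moment the agent is created, so page.url() returns whatever page is open at that point. One way to make the dependency explicit is to build the prompt from plain inputs. buildSystemPrompt below is a hypothetical helper for illustration, not a Stagehand function.

```typescript
// Hypothetical helper: builds the system prompt shown above from explicit
// inputs, so the caller decides which URL and date are embedded.
function buildSystemPrompt(currentUrl: string, today: Date = new Date()): string {
  return [
    "You are a helpful assistant that can use a web browser.",
    `You are currently on the following page: ${currentUrl}.`,
    "Do not ask follow up questions, the user will trust your judgement.",
    `Today's date is ${today.toLocaleDateString()}.`,
  ].join("\n");
}
```

Calling buildSystemPrompt(page.url()) after page.goto(...) puts the navigated URL in the prompt instead of the initial blank tab.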

Max Steps

Control how many actions the agent can take:
const result = await agent.execute({
  instruction: "Your instruction here",
  maxSteps: 20, // Limit to 20 actions
});
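To make the idea of a step budget concrete, here is a conceptual sketch of a loop that stops either when the task reports completion or when the budget runs out. This is not Stagehand's implementation. runWithStepBudget and StepOutcome are hypothetical names used only to illustrate the behavior that maxSteps is meant to bound.

```typescript
// Conceptual sketch only: NOT Stagehand's internal agent loop.
type StepOutcome = { done: boolean };

async function runWithStepBudget(
  takeStep: (stepIndex: number) => Promise<StepOutcome>,
  maxSteps: number,
): Promise<{ completed: boolean; stepsUsed: number }> {
  let stepsUsed = 0;
  while (stepsUsed < maxSteps) {
    const outcome = await takeStep(stepsUsed);
    stepsUsed += 1;
    if (outcome.done) {
      // The task finished within the budget
      return { completed: true, stepsUsed };
    }
  }
  // Budget exhausted before the task reported completion
  return { completed: false, stepsUsed };
}
```

The point of the second return path is that hitting the limit is a normal outcome the caller should check for, not an exception.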

Key Concepts

Computer Use vs Standard Mode

  • Standard mode: Uses Stagehand’s action primitives (act, extract, observe)
  • CUA mode: Uses computer use APIs to directly control the browser

When to Use CUA

Use Computer Use for:
  • Complex UI interactions
  • Tasks requiring visual understanding
  • Multi-step workflows that benefit from visual context
  • Scenarios where standard actions are insufficient

Browser Configuration

CUA requires specific browser dimensions. Configure them in stagehand.config.ts:
export default {
  browserOptions: {
    width: 1920,
    height: 1080,
  },
};

Error Handling

Always include proper error handling for CUA:
try {
  const result = await agent.execute({
    instruction,
    maxSteps: 20,
  });
  console.log("Success:", result);
} catch (error) {
  console.error("Error:", error);
  if (error instanceof Error && error.stack) {
    console.log(error.stack);
  }
} finally {
  await stagehand.close();
}
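If several scripts repeat this try/catch/finally shape, it can be factored into a small wrapper. The sketch below is one possible way to do that. runAndClose and Outcome are hypothetical names, and the wrapper takes plain functions, so it makes no assumptions about Stagehand's types.

```typescript
// Hypothetical helper: run a task, always run cleanup afterwards, and report
// the outcome as a value instead of a thrown exception.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: Error };

async function runAndClose<T>(
  task: () => Promise<T>,
  close: () => Promise<void>,
): Promise<Outcome<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (error) {
    // Normalize non-Error throwables so callers can rely on .message and .stack
    return { ok: false, error: error instanceof Error ? error : new Error(String(error)) };
  } finally {
    // Runs on both paths; if close() itself throws, that error propagates
    await close();
  }
}
```

With the names from the examples above, usage would look like runAndClose(() => agent.execute({ instruction, maxSteps: 20 }), () => stagehand.close()), followed by a check of the ok field.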

Best Practices

  1. Set maxSteps - Prevent infinite loops by limiting agent steps
  2. Clear instructions - Provide specific, actionable instructions
  3. System prompts - Guide agent behavior with detailed system prompts
  4. Error handling - Always handle errors and close the browser
  5. Monitor execution - Use verbose logging to debug agent behavior
  6. Configure browser - Ensure correct browser dimensions for CUA

Environment Setup

Local Environment

const stagehand = new Stagehand({
  env: "LOCAL",
  verbose: 2,
});

Browserbase Environment

const stagehand = new Stagehand({
  env: "BROWSERBASE",
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
});
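The two configurations differ only in the options object, so a script can choose between them at startup. The following is a minimal sketch, assuming you want Browserbase whenever both credentials are present and a local browser otherwise. chooseEnvOptions and EnvOptions are hypothetical names, not Stagehand exports.

```typescript
// Hypothetical helper: pick the constructor options from this section based
// on which credentials are available.
type EnvOptions =
  | { env: "LOCAL"; verbose: number }
  | { env: "BROWSERBASE"; apiKey: string; projectId: string };

function chooseEnvOptions(env: Record<string, string | undefined>): EnvOptions {
  const apiKey = env["BROWSERBASE_API_KEY"];
  const projectId = env["BROWSERBASE_PROJECT_ID"];
  if (apiKey && projectId) {
    // Both credentials present: run in a Browserbase session
    return { env: "BROWSERBASE", apiKey, projectId };
  }
  // Otherwise fall back to a local browser with verbose logging
  return { env: "LOCAL", verbose: 2 };
}
```

In a Node script the result of chooseEnvOptions(process.env) could then be passed to the Stagehand constructor.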
