Overview
Stagehand supports Computer Use Agents (CUA), which interact with web pages through the computer use APIs of providers such as Anthropic (Claude) and Google (Gemini). In this mode the model controls the browser directly, which helps with complex UIs and tasks that need visual understanding.
Computer Use requires the browser dimensions to be configured. See stagehand.config.ts and the Browser Configuration section below for details.
Basic CUA Example with Google Gemini
Here’s a complete example using Google’s Gemini model:
import { Stagehand } from "@stagehand/core";
import chalk from "chalk";

async function main() {
  console.log(
    `\n${chalk.bold("Stagehand 🤘 Computer Use Agent (CUA) Demo")}\n`,
  );

  // Initialize Stagehand
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate to the Browserbase careers page first, so that page.url()
    // in the system prompt below is the page the agent starts on
    await page.goto("https://www.browserbase.com/careers");

    // Create a computer use agent
    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "google/gemini-3-flash-preview",
        apiKey: process.env.GEMINI_API_KEY ?? process.env.GOOGLE_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    // Define the instruction for the CUA
    const instruction =
      "Apply for the first engineer position with mock data. Don't submit the form. You're on the right page";
    console.log(`Instruction: ${chalk.white(instruction)}`);

    // Execute the instruction
    const result = await agent.execute({
      instruction,
      maxSteps: 20,
    });

    // Pause for 30 seconds before printing the result and closing the browser
    await new Promise((resolve) => setTimeout(resolve, 30000));

    console.log(`${chalk.green("✓")} Execution complete`);
    console.log(`${chalk.yellow("⤷")} Result:`);
    console.log(chalk.white(JSON.stringify(result, null, 2)));
  } catch (error) {
    console.log(`${chalk.red("✗")} Error: ${error}`);
    if (error instanceof Error && error.stack) {
      console.log(chalk.dim(error.stack.split("\n").slice(1).join("\n")));
    }
  } finally {
    // Close the browser
    await stagehand.close();
  }
}

main().catch((error) => {
  console.log(`${chalk.red("✗")} Unhandled error in main function`);
  console.log(chalk.red(error));
});
CUA with Anthropic Claude
You can also use Anthropic’s Claude models with computer use:
import { Stagehand } from "@stagehand/core";

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate first so that page.url() in the system prompt is accurate
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
    });

    const result = await agent.execute({
      instruction: "Search for 'Stagehand browser automation' and click the first result",
      maxSteps: 20,
    });
    console.log("Result:", result);
  } finally {
    // Close the browser even if the agent throws
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});
CUA with Custom Tools
You can combine Computer Use with custom tools:
import { z } from "zod";
import { tool } from "ai";
import { Stagehand } from "@stagehand/core";

// Custom tool the agent can call alongside its browser actions
const getWeather = tool({
  description: "Get the current weather in a location",
  inputSchema: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    // Your custom logic here
    return {
      location,
      temperature: 70,
      conditions: "sunny",
    };
  },
});

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    verbose: 2,
    experimental: true,
    model: "anthropic/claude-sonnet-4-5",
  });
  await stagehand.init();

  try {
    const page = stagehand.context.pages()[0];

    // Navigate first so that page.url() in the system prompt is accurate
    await page.goto("https://www.google.com");

    const agent = stagehand.agent({
      mode: "cua",
      model: {
        modelName: "anthropic/claude-sonnet-4-5-20250929",
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
      systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement. Today's date is ${new Date().toLocaleDateString()}.`,
      tools: {
        getWeather,
      },
    });

    const result = await agent.execute({
      instruction: "What's the weather in San Francisco? Then search for outdoor activities there.",
      maxSteps: 20,
    });
    console.log("Result:", result);
  } finally {
    // Close the browser even if the agent throws
    await stagehand.close();
  }
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});
Agent Configuration
mode: "cua"
Set the agent mode to "cua" to enable Computer Use:
const agent = stagehand.agent({
  mode: "cua", // Enable Computer Use
  // ... other options
});
Supported Models
CUA works with these models:
- Anthropic: anthropic/claude-sonnet-4-5-20250929, anthropic/claude-sonnet-4-5
- Google: google/gemini-3-flash-preview, google/gemini-2.0-flash-exp
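The provider prefix of the model name determines which API key is needed. The following is a minimal sketch, assuming the environment variable names used in the examples on this page. `resolveApiKey`, `PROVIDER_KEY_VARS`, and `Env` are hypothetical names introduced for illustration and are not part of Stagehand.

```typescript
// Hypothetical helper, not part of Stagehand's API.
type Env = Record<string, string | undefined>;

// Provider prefix -> environment variables used for that provider
// in the examples on this page, in order of preference.
const PROVIDER_KEY_VARS: Record<string, string[]> = {
  anthropic: ["ANTHROPIC_API_KEY"],
  google: ["GEMINI_API_KEY", "GOOGLE_API_KEY"],
};

function resolveApiKey(modelName: string, env: Env): string {
  // "anthropic/claude-sonnet-4-5" -> "anthropic"
  const slash = modelName.indexOf("/");
  const provider = slash === -1 ? modelName : modelName.slice(0, slash);

  const candidates = PROVIDER_KEY_VARS[provider];
  if (!candidates) {
    throw new Error(`No known API key variable for provider "${provider}"`);
  }

  // Return the first candidate variable that is set and non-empty.
  for (const name of candidates) {
    const value = env[name];
    if (value) return value;
  }
  throw new Error(`Set one of: ${candidates.join(", ")}`);
}
```

In a script this could be called as `resolveApiKey("anthropic/claude-sonnet-4-5", process.env)`, so that a missing key fails early with a readable message rather than during the first model call.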
System Prompt
Provide clear instructions to guide the agent’s behavior:
systemPrompt: `You are a helpful assistant that can use a web browser.
You are currently on the following page: ${page.url()}.
Do not ask follow up questions, the user will trust your judgement.
Today's date is ${new Date().toLocaleDateString()}.`
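A template literal is evaluated at the moment it is written, so `page.url()` in the prompt reflects whichever page is loaded when the agent is created. One way to make that explicit is to build the prompt from a URL passed in by the caller. The sketch below assumes a hypothetical helper named `buildSystemPrompt`, which is not part of Stagehand.

```typescript
// Hypothetical helper, not part of Stagehand's API.
// Builds the system prompt used on this page from an explicit URL and date.
function buildSystemPrompt(url: string, today: Date = new Date()): string {
  return [
    "You are a helpful assistant that can use a web browser.",
    `You are currently on the following page: ${url}.`,
    "Do not ask follow up questions, the user will trust your judgement.",
    `Today's date is ${today.toLocaleDateString()}.`,
  ].join("\n");
}
```

Calling `buildSystemPrompt(page.url())` after `page.goto(...)` has resolved gives the agent the URL it actually starts on, rather than the blank page a fresh browser opens with.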
Max Steps
Control how many actions the agent can take:
const result = await agent.execute({
  instruction: "Your instruction here",
  maxSteps: 20, // Limit to 20 actions
});
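Conceptually, a step limit turns an open-ended agent loop into a bounded one. The sketch below illustrates the idea only and is not Stagehand's internal loop: `runWithStepLimit`, `StepResult`, and the `nextStep` callback are hypothetical names.

```typescript
// Illustrative only: this is not how Stagehand implements maxSteps.
type StepResult = { done: boolean; action: string };

function runWithStepLimit(
  nextStep: (stepIndex: number) => StepResult,
  maxSteps: number,
): { completed: boolean; actions: string[] } {
  const actions: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = nextStep(i);
    actions.push(step.action);
    // Stop as soon as the agent reports that the task is finished.
    if (step.done) return { completed: true, actions };
  }
  // The budget ran out before the agent finished.
  return { completed: false, actions };
}
```

A task that never reports completion stops after exactly `maxSteps` actions, which is what guards against runaway loops and unbounded model usage.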
Key Concepts
Computer Use vs Standard Mode
- Standard mode: Uses Stagehand’s action primitives (act, extract, observe)
- CUA mode: Uses computer use APIs to directly control the browser
When to Use CUA
Use Computer Use for:
- Complex UI interactions
- Tasks requiring visual understanding
- Multi-step workflows that benefit from visual context
- Scenarios where standard actions are insufficient
Browser Configuration
CUA requires specific browser dimensions. Configure in stagehand.config.ts:
export default {
  browserOptions: {
    width: 1920,
    height: 1080,
  },
};
Error Handling
Always include proper error handling for CUA:
try {
  const result = await agent.execute({
    instruction,
    maxSteps: 20,
  });
  console.log("Success:", result);
} catch (error) {
  console.error("Error:", error);
  if (error instanceof Error && error.stack) {
    console.log(error.stack);
  }
} finally {
  await stagehand.close();
}
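If several scripts repeat this try/catch/finally pattern, it can be factored into a small wrapper. The following is a sketch under stated assumptions: `runAndClose`, `Closable`, and `Outcome` are hypothetical names, and `Closable` stands in for any object with an async `close()` method, such as the `stagehand` instance in the examples.

```typescript
// Hypothetical helper, not part of Stagehand's API.
interface Closable {
  close(): Promise<void>;
}

type Outcome<T> = { ok: true; value: T } | { ok: false; error: Error };

async function runAndClose<T>(
  resource: Closable,
  task: () => Promise<T>,
): Promise<Outcome<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (err) {
    // Normalize non-Error throwables so callers always get an Error.
    const error = err instanceof Error ? err : new Error(String(err));
    return { ok: false, error };
  } finally {
    // Runs on both the success path and the failure path.
    await resource.close();
  }
}
```

With this shape, a call such as `runAndClose(stagehand, () => agent.execute({ instruction, maxSteps: 20 }))` returns a value the caller can branch on, and the browser is closed in either case.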
Best Practices
- Set maxSteps - Prevent infinite loops by limiting agent steps
- Clear instructions - Provide specific, actionable instructions
- System prompts - Guide agent behavior with detailed system prompts
- Error handling - Always handle errors and close browser
- Monitor execution - Use verbose logging to debug agent behavior
- Configure browser - Ensure correct browser dimensions for CUA
Environment Setup
Local Environment
const stagehand = new Stagehand({
  env: "LOCAL",
  verbose: 2,
});
Browserbase Environment
const stagehand = new Stagehand({
  env: "BROWSERBASE",
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
});
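A script that should run in either environment can choose between the two option sets at startup. The sketch below assumes that the presence of both Browserbase variables is a reasonable signal for which environment to use. `pickEnvironment`, `EnvOptions`, and `Env` are hypothetical names and only mirror the fields shown above.

```typescript
// Hypothetical helper, not part of Stagehand's API.
type Env = Record<string, string | undefined>;

// Mirrors only the constructor fields shown in the two snippets above.
type EnvOptions =
  | { env: "LOCAL"; verbose: 2 }
  | { env: "BROWSERBASE"; apiKey: string; projectId: string };

function pickEnvironment(env: Env): EnvOptions {
  const apiKey = env["BROWSERBASE_API_KEY"];
  const projectId = env["BROWSERBASE_PROJECT_ID"];

  // Use Browserbase only when both credentials are present.
  if (apiKey && projectId) {
    return { env: "BROWSERBASE", apiKey, projectId };
  }
  return { env: "LOCAL", verbose: 2 };
}
```

The result could then be passed to the constructor, for example `new Stagehand(pickEnvironment(process.env))`, so the same script runs locally during development and on Browserbase when credentials are configured.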