Stagehand agents support custom tools using the AI SDK’s tool format. This allows you to extend agent capabilities with your own functions, API integrations, and business logic.

Overview

Custom tools enable agents to:
  • Call external APIs (weather, databases, etc.)
  • Execute custom business logic
  • Integrate with third-party services
  • Perform calculations or data transformations
  • Access internal systems
Tools are defined using the AI SDK’s tool() function and Zod schemas for type-safe inputs and outputs.

Creating a Custom Tool

Basic Tool Structure

import { tool } from "ai";
import { z } from "zod";

const myTool = tool({
  description: "Clear description of what the tool does",
  inputSchema: z.object({
    param1: z.string().describe("Description of param1"),
    param2: z.number().describe("Description of param2"),
  }),
  execute: async ({ param1, param2 }) => {
    // Your implementation here
    const result = await doSomething(param1, param2);
    return result;
  },
});

Weather API Example

From packages/core/examples/agent-custom-tools.ts:
import { tool } from "ai";
import { z } from "zod";

// Mock weather API call
const fetchWeatherAPI = async (location: string) => {
  return {
    temp: 70,
    conditions: "sunny",
  };
};

// Define the tool
const getWeather = tool({
  description: "Get the current weather in a location",
  inputSchema: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    const weather = await fetchWeatherAPI(location);
    return {
      location,
      temperature: weather.temp,
      conditions: weather.conditions,
    };
  },
});

Using Custom Tools with Agents

Standard Agent (Non-CUA)

import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({
  env: "LOCAL",
  verbose: 2,
  experimental: true, // Required for custom tools
});

await stagehand.init();

const agent = stagehand.agent({
  systemPrompt: "You are a helpful assistant.",
  tools: {
    getWeather, // Pass your custom tools
    anotherTool, // Any additional tool defined the same way
  },
});

const result = await agent.execute({
  instruction: "What's the weather in San Francisco?",
});

Computer Use Agent (CUA)

const page = stagehand.context.pages()[0];

// Navigate first so the system prompt captures the real page URL
await page.goto("https://www.google.com");

const agent = stagehand.agent({
  mode: "cua",
  model: {
    modelName: "anthropic/claude-sonnet-4-5",
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
  systemPrompt: `You are a helpful assistant.
    Current page: ${page.url()}
    Date: ${new Date().toLocaleDateString()}`,
  tools: {
    getWeather,
  },
});

const result = await agent.execute({
  instruction: "What's the weather in San Francisco?",
  maxSteps: 20,
});

Tool Integration in CUA Clients

Anthropic CUA

Location: packages/core/lib/v3/agent/AnthropicCUAClient.ts
Converting Tools to Anthropic Format:
// Custom tools are converted to Anthropic's tool definition format (name, description, input_schema)
if (this.tools && Object.keys(this.tools).length > 0) {
  const customTools = Object.entries(this.tools).map(([name, tool]) => {
    const schema = tool.inputSchema as StagehandZodSchema;
    const jsonSchema = toJsonSchema(schema);

    return {
      name,
      description: tool.description,
      input_schema: {
        type: "object",
        properties: jsonSchema.properties || {},
        required: jsonSchema.required || [],
      },
    };
  });

  requestParams.tools = [
    { type: "computer_20251124", name: "computer", /* ... */ },
    ...customTools,
  ];
}
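
To make the conversion concrete, the sketch below applies the same mapping to a hand-written JSON Schema for the getWeather tool. The names toAnthropicTool, JsonSchemaLike, and AnthropicToolEntry are hypothetical stand-ins and not Stagehand exports; the real client derives the JSON Schema from the Zod schema via toJsonSchema.

```typescript
// Hypothetical stand-in for the JSON Schema derived from a Zod object schema.
interface JsonSchemaLike {
  properties?: Record<string, unknown>;
  required?: string[];
}

// Shape of one custom tool entry, mirroring the excerpt above.
interface AnthropicToolEntry {
  name: string;
  description: string | undefined;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required: string[];
  };
}

function toAnthropicTool(
  name: string,
  description: string | undefined,
  jsonSchema: JsonSchemaLike,
): AnthropicToolEntry {
  return {
    name,
    description,
    input_schema: {
      type: "object",
      // Missing fields fall back to empty values, as in the excerpt.
      properties: jsonSchema.properties ?? {},
      required: jsonSchema.required ?? [],
    },
  };
}

const weatherEntry = toAnthropicTool(
  "getWeather",
  "Get the current weather in a location",
  {
    properties: {
      location: {
        type: "string",
        description: "The location to get weather for",
      },
    },
    required: ["location"],
  },
);

console.log(JSON.stringify(weatherEntry, null, 2));
```

The printed object is what would sit next to the built-in computer tool in requestParams.tools.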
Executing Custom Tools:
// When a custom tool is called
if (this.tools && item.name in this.tools) {
  const tool = this.tools[item.name];
  
  const result = await tool.execute(item.input, {
    toolCallId: item.id,
    messages: [],
  });
  
  toolResult = JSON.stringify(result);
}

// Return result to model
toolResults.push({
  type: "tool_result",
  tool_use_id: item.id,
  content: [{ type: "text", text: toolResult }],
});
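
The excerpt above passes the tool's return value through JSON.stringify, which returns undefined for an undefined result and throws on circular structures. The sketch below shows one defensive way to produce the text payload. toToolResultText is a hypothetical helper, not part of Stagehand, and the fallback to String() is an assumption about what a reasonable default looks like.

```typescript
// Hypothetical helper: always returns a string suitable for a text content block.
function toToolResultText(result: unknown): string {
  try {
    // At runtime JSON.stringify can return undefined (e.g. for `undefined`).
    const text = JSON.stringify(result) as string | undefined;
    return text !== undefined ? text : String(result);
  } catch {
    // Circular references and other non-serializable values end up here.
    return String(result);
  }
}

console.log(toToolResultText({ temperature: 70, conditions: "sunny" }));
console.log(toToolResultText(undefined));
```

Returning plain JSON-serializable objects from execute avoids the fallback path entirely.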

Google CUA

Location: packages/core/lib/v3/agent/GoogleCUAClient.ts
Converting Tools to Google Format:
import { convertToolSetToFunctionDeclarations } from "./utils/googleCustomToolHandler.js";

const functionDeclarations = 
  this.tools && Object.keys(this.tools).length > 0
    ? convertToolSetToFunctionDeclarations(this.tools)
    : [];

this.generateContentConfig = {
  tools: [
    {
      computerUse: { environment: this.environment },
      ...(functionDeclarations.length > 0 ? { functionDeclarations } : {}),
    },
  ],
};
Executing Custom Tools:
import { executeGoogleCustomTool } from "./utils/googleCustomToolHandler.js";

if (action.type === "custom_tool") {
  const toolName = action.name as string;
  const toolArgs = action.arguments as Record<string, unknown>;

  const executionResult = await executeGoogleCustomTool(
    toolName,
    toolArgs,
    this.tools,
    correspondingFunctionCall,
    logger,
  );

  functionResponses.push(executionResult.functionResponse);
}

OpenAI CUA

Location: packages/core/lib/v3/agent/OpenAICUAClient.ts
Converting Tools to OpenAI Format:
if (this.tools && Object.keys(this.tools).length > 0) {
  const customTools = Object.entries(this.tools).map(([name, tool]) => ({
    type: "function" as const,
    name,
    function: {
      name,
      description: tool.description,
      parameters: tool.inputSchema,
    },
  }));

  requestParams.tools = [
    { type: "computer_use_preview", /* ... */ },
    ...customTools,
  ];
}
Executing Custom Tools:
if (item.type === "function_call" && this.isFunctionCallItem(item)) {
  let toolResult = "Tool executed successfully";
  
  if (this.tools && item.name in this.tools) {
    const tool = this.tools[item.name];
    const args = JSON.parse(item.arguments);

    const result = await tool.execute(args, {
      toolCallId: item.call_id,
      messages: [],
    });
    
    toolResult = JSON.stringify(result);
  }

  nextInputItems.push({
    type: "function_call_output",
    call_id: item.call_id,
    output: toolResult,
  });
}
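
The excerpt above calls JSON.parse on item.arguments directly, so a malformed payload from the model would throw. The following sketch shows one way to parse arguments defensively and report the problem back as the tool output. parseToolArguments and buildFunctionCallOutput are hypothetical helpers and not part of Stagehand.

```typescript
// Hypothetical helper: parse model-supplied arguments without throwing.
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string };

function parseToolArguments(raw: string): ParsedArgs {
  try {
    const value: unknown = JSON.parse(raw);
    // Tool inputs are expected to be JSON objects, not arrays or primitives.
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return { ok: false, error: "Tool arguments must be a JSON object" };
    }
    return { ok: true, args: value as Record<string, unknown> };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid JSON in tool arguments: ${reason}` };
  }
}

// Builds an item in the same shape as the function_call_output above.
function buildFunctionCallOutput(callId: string, output: string) {
  return { type: "function_call_output" as const, call_id: callId, output };
}

const parsed = parseToolArguments('{"location":"San Francisco"}');
const broken = parseToolArguments('{"location":');

console.log(parsed.ok, broken.ok);
```

When parsing fails, sending the error string as the output lets the model retry with corrected arguments instead of aborting the run.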

Built-in Agent Tools

Stagehand provides built-in tools for standard agents.
Location: packages/core/lib/v3/agent/tools/

Core Tools

act - Perform actions (click, type):
import { actTool } from "./act.js";

const act = actTool(v3, executionModel, variables);
// Usage: { action: "click the Login button" }
extract - Extract structured data:
import { extractTool } from "./extract.js";

const extract = extractTool(v3, executionModel);
// Usage: {
//   instruction: "extract product name and price",
//   schema: { type: "object", properties: { /* ... */ } }
// }
ariaTree - Get accessibility tree:
import { ariaTreeTool } from "./ariaTree.js";

const ariaTree = ariaTreeTool(v3);
goto - Navigate to URL:
import { gotoTool } from "./goto.js";

const goto = gotoTool(v3);
// Usage: { url: "https://example.com" }
screenshot - Capture screenshot:
import { screenshotTool } from "./screenshot.js";

const screenshot = screenshotTool(v3);

Coordinate-Based Tools (Hybrid Mode)

click - Click at coordinates:
import { clickTool } from "./click.js";

const click = clickTool(v3, provider);
type - Type text:
import { typeTool } from "./type.js";

const type = typeTool(v3, provider, variables);
dragAndDrop - Drag and drop:
import { dragAndDropTool } from "./dragAndDrop.js";

const dragAndDrop = dragAndDropTool(v3, provider);

Form Tools

fillForm - Fill forms using DOM (DOM mode):
import { fillFormTool } from "./fillform.js";

const fillForm = fillFormTool(v3, executionModel, variables);
fillFormVision - Fill forms using vision (Hybrid mode):
import { fillFormVisionTool } from "./fillFormVision.js";

const fillFormVision = fillFormVisionTool(v3, provider, variables);

Tool Modes

Location: packages/core/lib/v3/agent/tools/index.ts
For non-CUA agents, Stagehand supports two tool modes, DOM and hybrid:

DOM Mode (Default)

const agent = stagehand.agent({
  mode: "dom", // or omit, as it's the default
  tools: {
    // Custom tools
  },
});
Available tools:
  • act - DOM-based click/type
  • fillForm - DOM-based form filling
  • extract - Structured data extraction
  • ariaTree - Accessibility tree
  • goto, navback, screenshot, scroll, wait, keys, think
Excluded tools:
  • click, type, dragAndDrop, clickAndHold, fillFormVision

Hybrid Mode

const agent = stagehand.agent({
  mode: "hybrid",
  tools: {
    // Custom tools
  },
});
Available tools:
  • click, type - Coordinate-based actions
  • fillFormVision - Vision-based form filling
  • dragAndDrop, clickAndHold - Advanced interactions
  • All other standard tools
Excluded tools:
  • fillForm (replaced by fillFormVision)

Filtering Tools

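export function createAgentTools(v3: V3, options?: V3AgentToolOptions) {
  const mode = options?.mode ?? "dom";
  const excludeTools = options?.excludeTools;
  // executionModel, provider, and variables are defined elsewhere in the
  // function (omitted from this excerpt)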

  const allTools: ToolSet = {
    act: actTool(v3, executionModel, variables),
    click: clickTool(v3, provider),
    // ... all tools
  };

  return filterTools(allTools, mode, excludeTools);
}

function filterTools(
  tools: ToolSet,
  mode: AgentToolMode,
  excludeTools?: string[],
): ToolSet {
  const filtered: ToolSet = { ...tools };

  if (mode === "hybrid") {
    delete filtered.fillForm;
  } else {
    // DOM mode
    delete filtered.click;
    delete filtered.type;
    delete filtered.dragAndDrop;
    delete filtered.clickAndHold;
    delete filtered.fillFormVision;
  }

  if (excludeTools) {
    for (const toolName of excludeTools) {
      delete filtered[toolName];
    }
  }

  return filtered;
}
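
To see the effect of the filter, the sketch below reruns the same logic over a plain object whose values are placeholder strings rather than real tools. ToolMap and filterToolNames are hypothetical names used only for this illustration, and the tool list is a subset of the built-in tools.

```typescript
// Stand-ins for the real types; values are placeholders, not actual tools.
type AgentToolMode = "dom" | "hybrid";
type ToolMap = Record<string, string>;

function filterToolNames(
  tools: ToolMap,
  mode: AgentToolMode,
  excludeTools: string[] = [],
): string[] {
  // Copy first so the caller's object is not mutated.
  const filtered: ToolMap = { ...tools };

  if (mode === "hybrid") {
    delete filtered["fillForm"];
  } else {
    // DOM mode drops the coordinate- and vision-based tools.
    const coordinateTools = [
      "click",
      "type",
      "dragAndDrop",
      "clickAndHold",
      "fillFormVision",
    ];
    for (const name of coordinateTools) {
      delete filtered[name];
    }
  }

  for (const name of excludeTools) {
    delete filtered[name];
  }

  return Object.keys(filtered);
}

const allTools: ToolMap = {};
const toolNames = [
  "act", "fillForm", "extract", "ariaTree", "goto", "screenshot",
  "click", "type", "dragAndDrop", "clickAndHold", "fillFormVision",
];
for (const name of toolNames) {
  allTools[name] = "placeholder";
}

console.log(filterToolNames(allTools, "dom").join(", "));
// prints "act, fillForm, extract, ariaTree, goto, screenshot"
console.log(filterToolNames(allTools, "hybrid", ["screenshot"]).join(", "));
// prints "act, extract, ariaTree, goto, click, type, dragAndDrop, clickAndHold, fillFormVision"
```

Mode filtering runs before excludeTools, so excludeTools can remove any tool that survives the mode filter.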

Variables in Tools

Tools like act, type, and fillForm support variables:
const agent = stagehand.agent({
  variables: {
    email: "user@example.com",
    password: "secret123",
  },
});

// Agent can use: "type %email% into the email field"
Implementation in actTool:
export const actTool = (
  v3: V3,
  executionModel?: string | AgentModelConfig,
  variables?: Variables,
) => {
  const hasVariables = variables && Object.keys(variables).length > 0;
  const actionDescription = hasVariables
    ? `Available variables: ${Object.keys(variables).join(", ")}`
    : 'e.g. "click the Login button" or "type John into the input"';

  return tool({
    description: "Perform an action on the page",
    inputSchema: z.object({
      action: z.string().describe(actionDescription),
    }),
    execute: async ({ action }) => {
      const options = executionModel
        ? { model: executionModel, variables }
        : { variables };
      const result = await v3.act(action, options);
      return {
        success: result.success ?? true,
        action: result.actionDescription ?? action,
      };
    },
  });
};
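
The %name% placeholder convention can be illustrated with a small substitution function. This is only a sketch of the idea: how Stagehand resolves variables internally is not shown in this document, and substituteVariables is a hypothetical name.

```typescript
type Variables = Record<string, string>;

// Replace each %name% token whose name is a known variable.
// Unknown tokens and stray percent signs are left untouched.
function substituteVariables(text: string, variables: Variables): string {
  return text.replace(
    /%([A-Za-z_][A-Za-z0-9_]*)%/g,
    (match: string, name: string) => {
      // Own-property check avoids picking up inherited keys such as "toString".
      if (!Object.prototype.hasOwnProperty.call(variables, name)) {
        return match;
      }
      const value = variables[name];
      return value ?? match;
    },
  );
}

const vars: Variables = { email: "user@example.com", password: "secret123" };

console.log(substituteVariables("type %email% into the email field", vars));
// prints "type user@example.com into the email field"
```

In this scheme the model sees variable names but not their values, which are filled in only when the action runs.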

Advanced Tool Examples

Database Query Tool

const queryDatabase = tool({
  description: "Query the product database",
  inputSchema: z.object({
    query: z.string().describe("SQL query to execute"),
  }),
  execute: async ({ query }) => {
    const db = await connectToDatabase();
    const results = await db.query(query);
    return results;
  },
});
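
Because the query string comes from the model, the tool above will run whatever SQL it is given. One possible mitigation is to reject anything other than a single SELECT statement before it reaches the database. The sketch below is a coarse heuristic with hypothetical helper names (isSingleSelect, assertReadOnly), not a SQL parser and not part of Stagehand.

```typescript
// Hypothetical guard: conservative check for a single SELECT statement.
function isSingleSelect(query: string): boolean {
  // Allow one trailing semicolon, then reject any that remain
  // (stacked statements). This also rejects semicolons inside string literals.
  const trimmed = query.trim().replace(/;\s*$/, "");
  if (trimmed.includes(";")) {
    return false;
  }
  // Require the statement to start with the SELECT keyword.
  return /^select\b/i.test(trimmed);
}

// Throws so the failure surfaces as a tool error instead of reaching the database.
function assertReadOnly(query: string): string {
  if (!isSingleSelect(query)) {
    throw new Error("Only single SELECT statements are allowed");
  }
  return query;
}

console.log(isSingleSelect("SELECT name, price FROM products;"));
console.log(isSingleSelect("SELECT 1; DROP TABLE products"));
```

Calling assertReadOnly(query) at the top of execute would apply the check. A database role with read-only permissions remains the stronger safeguard, since a keyword check cannot catch every write path.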

API Integration Tool

const sendSlackMessage = tool({
  description: "Send a message to Slack",
  inputSchema: z.object({
    channel: z.string(),
    message: z.string(),
  }),
  execute: async ({ channel, message }) => {
    const response = await fetch("https://slack.com/api/chat.postMessage", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.SLACK_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ channel, text: message }),
    });
    return response.json();
  },
});
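
Slack's Web API reports most failures in the response body (ok: false plus an error code) rather than through the HTTP status, so returning response.json() unchanged leaves the model to interpret raw API output. The sketch below condenses the body into a small result object. summarizeSlackResponse and its result type are hypothetical, and only the ok, error, and ts fields are assumed.

```typescript
// Minimal shape of a Slack Web API response body.
interface SlackResponse {
  ok: boolean;
  error?: string; // set when ok is false, e.g. "channel_not_found"
  ts?: string;    // message timestamp returned by chat.postMessage
}

type SlackToolResult =
  | { success: true; timestamp?: string }
  | { success: false; error: string };

// Hypothetical helper: turn the raw body into a compact tool result.
function summarizeSlackResponse(body: SlackResponse): SlackToolResult {
  if (!body.ok) {
    return { success: false, error: body.error ?? "unknown_error" };
  }
  return { success: true, timestamp: body.ts };
}

console.log(summarizeSlackResponse({ ok: false, error: "channel_not_found" }));
console.log(summarizeSlackResponse({ ok: true, ts: "1700000000.000100" }));
```

Inside execute, this could be applied as summarizeSlackResponse(await response.json()).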

Data Transformation Tool

const parseCSV = tool({
  description: "Parse CSV data into JSON",
  inputSchema: z.object({
    csv: z.string().describe("CSV data to parse"),
  }),
  execute: async ({ csv }) => {
    // Naive parser: assumes no quoted fields, embedded commas, or embedded newlines
    const rows = csv.trim().split(/\r?\n/);
    const headers = rows[0].split(",");
    const data = rows.slice(1).map((row) => {
      const values = row.split(",");
      // Type the accumulator so string-keyed assignment compiles in strict mode
      return headers.reduce<Record<string, string>>((obj, header, i) => {
        obj[header] = values[i] ?? "";
        return obj;
      }, {});
    });
    return data;
  },
});
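
The parser above splits on every comma, so a quoted field such as "Smith, Jane" would be broken into two values. The sketch below shows one way to handle double-quoted fields and doubled-quote escapes. splitCsvLine and parseCsvRecords are hypothetical helpers; newlines inside quoted fields are still not supported, because the input is split into lines first.

```typescript
// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;            // skip the second quote of the pair
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Build one record per data row, keyed by the header row.
function parseCsvRecords(csv: string): Record<string, string>[] {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  const [headerLine, ...rows] = lines;
  if (headerLine === undefined) {
    return [];
  }
  const headers = splitCsvLine(headerLine);
  return rows.map((row) => {
    const values = splitCsvLine(row);
    const record: Record<string, string> = {};
    headers.forEach((header, i) => {
      record[header] = values[i] ?? ""; // missing trailing values become ""
    });
    return record;
  });
}

console.log(parseCsvRecords('name,city\n"Doe, John","New York"\n'));
```

For production data, a dedicated CSV library is usually the safer choice; this sketch only covers the common quoting rules.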

Best Practices

  1. Clear descriptions: Help the model understand when to use each tool
  2. Type-safe schemas: Use Zod for input validation
  3. Error handling: Return meaningful error messages
  4. Logging: Log tool executions for debugging
  5. Async execution: Tools can perform async operations
  6. Return structured data: Return JSON-serializable results
  7. Enable experimental mode: Required for custom tools in Stagehand
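
Points 3, 4, and 6 can be combined in a small wrapper around a tool's execute function. The sketch below is one possible approach, not a Stagehand or AI SDK API: withErrorHandling and ToolOutcome are hypothetical names, and the logger defaults to console.log.

```typescript
// Structured, JSON-serializable result returned to the model.
type ToolOutcome<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Hypothetical wrapper: logs each call and converts thrown errors into a result.
function withErrorHandling<A, T>(
  name: string,
  execute: (args: A) => Promise<T>,
  log: (message: string) => void = console.log,
): (args: A) => Promise<ToolOutcome<T>> {
  return async (args: A): Promise<ToolOutcome<T>> => {
    const startedAt = Date.now();
    try {
      const data = await execute(args);
      log(`[tool:${name}] ok in ${Date.now() - startedAt}ms`);
      return { success: true, data };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      log(`[tool:${name}] failed: ${error}`);
      return { success: false, error };
    }
  };
}

// Usage with a stand-in for a tool's execute function.
const safeDivide = withErrorHandling(
  "divide",
  async ({ a, b }: { a: number; b: number }) => {
    if (b === 0) {
      throw new Error("Division by zero");
    }
    return a / b;
  },
);

safeDivide({ a: 1, b: 0 }).then((outcome) => console.log(outcome));
```

The wrapped function can be passed as the execute option of tool(), so the model receives an error message it can react to instead of an unhandled exception.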

Type Safety

import type { InferUITools } from "ai";

type MyTools = {
  getWeather: typeof getWeather;
  queryDB: typeof queryDatabase;
};

// Maps each tool name to its inferred { input, output } types
type MyToolTypes = InferUITools<MyTools>;
type WeatherInput = MyToolTypes["getWeather"]["input"]; // { location: string }

References

  • Custom Tools Example: packages/core/examples/agent-custom-tools.ts
  • Tool Implementations: packages/core/lib/v3/agent/tools/
  • Anthropic Tool Handler: packages/core/lib/v3/agent/AnthropicCUAClient.ts:471-500
  • Google Tool Handler: packages/core/lib/v3/agent/utils/googleCustomToolHandler.ts
  • OpenAI Tool Handler: packages/core/lib/v3/agent/OpenAICUAClient.ts:444-460
