Stagehand uses Large Language Models (LLMs) to power AI-driven browser automation. You can configure which model to use globally or per-operation.

Quick Start

Set a default model when initializing Stagehand:
import { Stagehand } from '@browserbasehq/stagehand';

const stagehand = new Stagehand({
  env: "LOCAL",
  model: "gpt-4o",
});

await stagehand.init();

Model Configuration Options

You can configure models in two ways:

1. String Model Name

Use a simple string for supported models:
model: "gpt-4o"
model: "claude-3-5-sonnet-latest"
model: "gemini-2.0-flash"

2. Model Configuration Object

For advanced configuration, use an object:
model: {
  modelName: "gpt-4o",
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://api.openai.com/v1",
  temperature: 0.7,
}

Supported Model Providers

OpenAI

Stagehand supports the following OpenAI models:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "gpt-4o",
    apiKey: process.env.OPENAI_API_KEY,
  },
});
Available OpenAI Models:
  • gpt-4.1 - Latest GPT-4.1 model
  • gpt-4.1-mini - Fast, cost-efficient GPT-4.1
  • gpt-4.1-nano - Ultra-fast, minimal cost
  • gpt-4o - GPT-4o ("omni")
  • gpt-4o-mini - Smaller, faster GPT-4o
  • gpt-4o-2024-08-06 - Specific version
  • gpt-4.5-preview - Latest preview
  • o1 - o1 reasoning model
  • o1-mini - Smaller, faster o1
  • o1-preview - o1 preview
  • o3 - o3 reasoning model
  • o3-mini - Smaller, faster o3
  • o4-mini - Compact o4-series reasoning model

Anthropic

Use Claude models from Anthropic:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "claude-3-7-sonnet-latest",
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
});
Available Anthropic Models:
  • claude-3-7-sonnet-latest - Latest Claude 3.7 Sonnet
  • claude-3-7-sonnet-20250219 - Claude 3.7 Sonnet (Feb 2025)
  • claude-3-5-sonnet-latest - Latest Claude 3.5 Sonnet
  • claude-3-5-sonnet-20241022 - Claude 3.5 Sonnet (Oct 2024)
  • claude-3-5-sonnet-20240620 - Claude 3.5 Sonnet (June 2024)
Claude 3.7 Sonnet supports extended thinking via the thinkingBudget parameter for complex reasoning tasks.

Google Gemini

Use Google’s Gemini models:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "gemini-2.5-flash-preview-04-17",
    apiKey: process.env.GOOGLE_API_KEY,
  },
});
Available Google Models:
  • gemini-2.5-flash-preview-04-17 - Latest Gemini 2.5 Flash Preview
  • gemini-2.5-pro-preview-03-25 - Latest Gemini 2.5 Pro Preview
  • gemini-2.0-flash - Gemini 2.0 Flash
  • gemini-2.0-flash-lite - Lightweight Gemini 2.0
  • gemini-1.5-pro - Gemini 1.5 Pro
  • gemini-1.5-flash - Gemini 1.5 Flash
  • gemini-1.5-flash-8b - Compact Gemini 1.5

Google Vertex AI

For Google Vertex AI with service account credentials:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "gemini-2.0-flash",
    project: "my-gcp-project",
    location: "us-central1",
    googleAuthOptions: {
      credentials: {
        type: "service_account",
        project_id: "my-project",
        private_key_id: "key-id",
        private_key: process.env.GOOGLE_PRIVATE_KEY,
        client_email: "[email protected]",
        client_id: "12345",
      },
    },
  },
});
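
If the service-account JSON lives on disk, the underlying google-auth-library also accepts a keyFile path. Assuming Stagehand forwards googleAuthOptions to google-auth-library unchanged, the inline credentials block can be replaced with a file reference (a sketch, not a verified configuration):

```typescript
import { Stagehand } from '@browserbasehq/stagehand';

// Sketch, assuming googleAuthOptions is passed through to
// google-auth-library: point keyFile at the service-account JSON
// instead of inlining the credentials.
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "gemini-2.0-flash",
    project: "my-gcp-project",
    location: "us-central1",
    googleAuthOptions: {
      keyFile: "/path/to/service-account.json",
    },
  },
});
```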

Cerebras

Use Cerebras for ultra-fast inference:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "cerebras-llama-3.3-70b",
    apiKey: process.env.CEREBRAS_API_KEY,
  },
});
Available Cerebras Models:
  • cerebras-llama-3.3-70b - Llama 3.3 70B
  • cerebras-llama-3.1-8b - Llama 3.1 8B

Groq

Use Groq for fast inference:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "groq-llama-3.3-70b-versatile",
    apiKey: process.env.GROQ_API_KEY,
  },
});
Available Groq Models:
  • groq-llama-3.3-70b-versatile - Llama 3.3 70B Versatile
  • groq-llama-3.3-70b-specdec - Llama 3.3 70B SpecDec

Advanced Configuration

Custom Base URL

Use OpenAI-compatible APIs:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: {
    modelName: "gpt-4o",
    baseURL: "https://my-custom-api.com/v1",
    apiKey: process.env.CUSTOM_API_KEY,
  },
});

Temperature Control

Adjust the sampling temperature (0.0 = most deterministic, 2.0 = most random):
model: {
  modelName: "gpt-4o",
  temperature: 0.2, // More deterministic for reliable automation
}

Organization ID (OpenAI)

Specify OpenAI organization:
model: {
  modelName: "gpt-4o",
  apiKey: process.env.OPENAI_API_KEY,
  organization: "org-123456",
}

Extended Thinking (Anthropic)

Enable extended thinking for complex reasoning:
model: {
  modelName: "claude-3-7-sonnet-latest",
  thinkingBudget: 10000, // Thinking budget in tokens
}

Per-Operation Model Override

Override the model for specific operations:
// Use a fast model for simple actions
await stagehand.act("click the login button", {
  model: "gpt-4o-mini",
});

// Use a powerful model for complex extractions
const data = await stagehand.extract(
  "extract all product details",
  schema,
  {
    model: "claude-3-7-sonnet-latest",
  }
);

// Use a different model for observations
const actions = await stagehand.observe(
  "find all interactive elements",
  {
    model: "gemini-2.0-flash",
  }
);

Custom LLM Client

Provide your own LLM client implementation:
import { LLMClient } from '@browserbasehq/stagehand';

class CustomLLMClient extends LLMClient {
  // Implement the abstract members of LLMClient here (for example, the
  // chat-completion handler). The exact interface depends on your
  // installed Stagehand version, so check the exported LLMClient type.
}

const stagehand = new Stagehand({
  env: "LOCAL",
  llmClient: new CustomLLMClient(),
});

Environment Variables

Set API keys via environment variables:
# .env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
CEREBRAS_API_KEY=...
GROQ_API_KEY=...
Stagehand automatically uses these environment variables when you don’t explicitly provide an apiKey:
const stagehand = new Stagehand({
  env: "LOCAL",
  model: "gpt-4o", // Uses OPENAI_API_KEY from env
});
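
When a key is missing, the failure often surfaces later as an opaque provider error. A small helper (hypothetical, not part of Stagehand) makes the environment check explicit at startup:

```typescript
// Hypothetical helper: read a required key from the environment and
// fail fast with a clear message when it is absent.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage: validate before constructing Stagehand, e.g.
// const apiKey = requireEnv("OPENAI_API_KEY");
```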

Model Selection Best Practices

Fast Actions

Use lightweight models for simple clicks and navigation:
  • gpt-4o-mini
  • gpt-4.1-nano
  • gemini-2.0-flash-lite

Complex Extraction

Use powerful models for data extraction:
  • claude-3-7-sonnet-latest
  • gpt-4.1
  • gemini-2.5-pro-preview-03-25

Cost Optimization

Balance performance and cost:
  • Use mini/nano models for repetitive tasks
  • Cache agent actions to reduce LLM calls
  • Use temperature: 0 for deterministic results

Reasoning Tasks

Use reasoning-focused models:
  • o1 / o3 series for complex logic
  • claude-3-7-sonnet-latest with thinkingBudget
Start with gpt-4o for general use, then optimize by using faster/cheaper models for specific operations.
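
The heuristics above can be sketched as a small routing table (a hypothetical helper, not part of the Stagehand API) that feeds the per-operation model override:

```typescript
// Hypothetical routing table mapping task categories (an illustration,
// not a Stagehand concept) to model names from the lists above.
type Task = "act" | "extract" | "observe" | "reason";

const MODEL_FOR_TASK: Record<Task, string> = {
  act: "gpt-4o-mini",                  // fast clicks and navigation
  extract: "claude-3-7-sonnet-latest", // complex data extraction
  observe: "gemini-2.0-flash",         // cheap page scanning
  reason: "o3",                        // multi-step logic
};

function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task];
}

// Usage with the per-operation override shown earlier:
// await stagehand.act("click the login button", { model: pickModel("act") });
```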
