Overview

The LLMClient is an abstract base class for Large Language Model integrations. Stagehand provides built-in clients for OpenAI, Anthropic, Google, and other providers.

Accessing the LLM Client

Access the configured LLM client through the Stagehand instance:
const client = stagehand.llmClient;

Properties

type (string, required)
  Provider type: "openai", "anthropic", "cerebras", "groq", etc.
modelName (AvailableModel, required)
  Model identifier (e.g., "gpt-4.1-mini", "claude-3-5-sonnet-latest")
hasVision (boolean, required)
  Whether the model supports vision/images
clientOptions (ClientOptions, required)
  Client configuration options

Methods

createChatCompletion()

Create a chat completion with the LLM.
// Without schema (returns LLMResponse)
const response = await client.createChatCompletion(options);

// With schema (returns typed data)
const response = await client.createChatCompletion<T>({
  ...options,
  options: {
    ...options.options,
    response_model: { name: "Schema", schema: mySchema },
  },
});
options (CreateChatCompletionOptions, required)

Returns

Without a schema:
  Promise<LLMResponse>
  Standard LLM response
With a schema:
  Promise<LLMParsedResponse<T>>
  Parsed response with typed data

Vercel AI SDK Methods

The LLMClient also provides direct access to Vercel AI SDK functions:
// Generate text
const { text } = await client.generateText({ ... });

// Generate structured object
const { object } = await client.generateObject({ ... });

// Stream text
const { textStream } = await client.streamText({ ... });

// Stream structured object
const { partialObjectStream } = await client.streamObject({ ... });

// Embeddings
const { embedding } = await client.embed({ ... });
const { embeddings } = await client.embedMany({ ... });

// Experimental features
const { image } = await client.generateImage({ ... });
const { audio } = await client.generateSpeech({ ... });
const { text } = await client.transcribe({ ... });
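The streaming methods return async iterables. As an illustration (this helper is not part of Stagehand's API), the chunks of a `textStream` can be collected into a single string:

```typescript
// Illustrative helper: accumulate an async-iterable stream of text chunks
// (such as the `textStream` returned by streamText) into one string.
async function collectText(stream: AsyncIterable<string>): Promise<string> {
  let out = "";
  for await (const chunk of stream) {
    out += chunk;
  }
  return out;
}

// Usage sketch, assuming a configured client:
// const { textStream } = await client.streamText({ prompt: "..." });
// const fullText = await collectText(textStream);
```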

Creating Custom LLM Clients

You can provide your own LLM client implementation:
import {
  Stagehand,
  LLMClient,
  type CreateChatCompletionOptions,
} from "@browserbasehq/stagehand";

class CustomLLMClient extends LLMClient {
  constructor() {
    super("custom-model");
    this.type = "custom";
    this.hasVision = true;
  }

  async createChatCompletion<T>(
    options: CreateChatCompletionOptions
  ): Promise<T> {
    // Call your provider here and return its response
    // (an LLMResponse, or parsed data when a response_model is set).
    throw new Error("Not implemented");
  }
}

const stagehand = new Stagehand({
  env: "LOCAL",
  llmClient: new CustomLLMClient(),
});

Example Usage

Basic Chat Completion

const response = await stagehand.llmClient.createChatCompletion({
  logger: (line) => console.log(line),
  options: {
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is 2+2?" },
    ],
    temperature: 0.7,
  },
});

console.log(response.choices[0].message.content);

Structured Output

import { z } from "zod";

const schema = z.object({
  name: z.string(),
  age: z.number(),
  email: z.string().email(),
});

const response = await stagehand.llmClient.createChatCompletion<
  typeof schema
>({
  logger: (line) => console.log(line),
  options: {
    messages: [
      { role: "user", content: "Extract: John Doe, 30, [email protected]" },
    ],
    response_model: {
      name: "UserInfo",
      schema,
    },
  },
});

const { data } = response;
console.log(data); // { name: "John Doe", age: 30, email: "[email protected]" }

With Image

import fs from "fs";

const imageBuffer = fs.readFileSync("./screenshot.png");

const response = await stagehand.llmClient.createChatCompletion({
  logger: (line) => console.log(line),
  options: {
    messages: [
      { role: "user", content: "What do you see in this image?" },
    ],
    image: {
      buffer: imageBuffer,
      description: "Screenshot of a webpage",
    },
  },
});

Using Vercel AI SDK

import { generateText } from "ai";

const model = stagehand.llmClient.getLanguageModel?.();

if (model) {
  const result = await generateText({
    model,
    prompt: "Write a haiku about coding",
  });
  
  console.log(result.text);
}

Type Definitions

ChatMessage

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: ChatMessageContent;
}

type ChatMessageContent =
  | string
  | (ChatMessageImageContent | ChatMessageTextContent)[];

interface ChatMessageTextContent {
  type: "text";
  text: string;
}

interface ChatMessageImageContent {
  type: "image_url";
  image_url: { url: string };
}
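Given the types above, a multimodal message pairs text parts with `image_url` parts. A small helper (hypothetical, not part of Stagehand; assumes a Node.js `Buffer` and the common base64 data-URL convention for `image_url`) could assemble one:

```typescript
// Types reproduced from the definitions above.
interface ChatMessageTextContent {
  type: "text";
  text: string;
}

interface ChatMessageImageContent {
  type: "image_url";
  image_url: { url: string };
}

type ChatMessageContent =
  | string
  | (ChatMessageImageContent | ChatMessageTextContent)[];

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: ChatMessageContent;
}

// Hypothetical helper: build a user message that pairs a text prompt with a
// PNG image buffer, encoded as a base64 data URL.
function userMessageWithImage(prompt: string, png: Buffer): ChatMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      {
        type: "image_url",
        image_url: { url: `data:image/png;base64,${png.toString("base64")}` },
      },
    ],
  };
}
```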

LLMUsage

interface LLMUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  reasoning_tokens?: number;
  cached_input_tokens?: number;
}
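The usage fields can drive simple cost accounting. As a sketch (the helper and the per-million-token rates are illustrative, not real pricing), reproducing the interface above:

```typescript
interface LLMUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  reasoning_tokens?: number;
  cached_input_tokens?: number;
}

// Hypothetical helper: estimate the cost of a request in USD from an LLMUsage
// record, given separate input and output prices per million tokens.
function estimateCostUSD(
  usage: LLMUsage,
  inputPerMillion: number,
  outputPerMillion: number
): number {
  return (
    (usage.prompt_tokens / 1_000_000) * inputPerMillion +
    (usage.completion_tokens / 1_000_000) * outputPerMillion
  );
}
```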
