Deterministic Agents

Overview

Deterministic agents produce consistent, repeatable results. While LLMs are inherently probabilistic, Stagehand provides tools to make agents more predictable and reliable.

Use Caching for Determinism

Caching is the most effective tool for deterministic behavior: once an action is cached, later runs replay it instead of asking the LLM again.

How Caching Works

import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./stagehand-cache", // Enable caching
});

await stagehand.init();

// First execution: LLM determines actions
const result1 = await stagehand.act("click the login button");

// Second execution: Replays cached actions (deterministic)
const result2 = await stagehand.act("click the login button");

// result1 and result2 will use the same selector

Cache Keys

Caching is based on:

Instruction text
Page URL
Variable keys (if using variables)

When cache hits:

Actions replay with the same selectors
0 token usage
Consistent behavior (deterministic)

When cache misses:

New LLM inference
May produce different selectors
Non-deterministic until cached
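To make the key components above concrete, here is a minimal sketch of how such a key could be derived. It is not Stagehand's actual implementation: `CacheKeyInput` and `buildCacheKey` are hypothetical names, and the real key format may differ. The point it illustrates is that only variable names, not values, feed into the key.

```typescript
// Hypothetical inputs for a cache key; the field names are illustrative only.
interface CacheKeyInput {
  instruction: string;
  url: string;
  variables?: Record<string, string>;
}

// Build a key from the instruction, the page URL, and the sorted variable
// *names*. Variable values are deliberately left out of the key.
function buildCacheKey(input: CacheKeyInput): string {
  const variableKeys = Object.keys(input.variables ?? {}).sort();
  return JSON.stringify({
    instruction: input.instruction.trim(),
    url: input.url,
    variableKeys,
  });
}

const laptopKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com",
  variables: { product: "laptop" },
});
const mouseKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com",
  variables: { product: "mouse" },
});
const otherPageKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com/sale",
  variables: { product: "laptop" },
});

console.log(laptopKey === mouseKey); // true: same instruction, URL, and variable names
console.log(laptopKey === otherPageKey); // false: the URL differs
```

Under these assumptions, a changed instruction or URL produces a new key (a cache miss), while a changed variable value does not.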

For production workflows, pre-populate the cache in development and ship the cache directory with your deployment.

Self-Healing Determinism

Stagehand’s cache includes self-healing:

const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./cache",
  selfHeal: true, // Default: enabled
});

How it works:

Cache contains action: click button[data-id='submit']
DOM changes: button now has data-id='submit-form'
Stagehand detects the failure, re-runs inference, and finds the new selector
Cache updates automatically with new selector
Future runs use updated cache

Result: Determinism that adapts to changes.
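The steps above can be sketched as a small lookup-with-fallback routine. This is an illustration only, not Stagehand's internal code: `resolveSelector`, `SelectorCache`, and the `selectorExists` / `inferSelector` callbacks are hypothetical stand-ins for the cache, the DOM check, and the LLM call.

```typescript
// Hypothetical stand-ins: a selector cache keyed by instruction, a check for
// whether a selector still matches the page, and an LLM-backed inference call.
type SelectorCache = Map<string, string>;

interface HealResult {
  selector: string;
  healed: boolean; // true when a stale cached selector was replaced
}

function resolveSelector(
  instruction: string,
  cache: SelectorCache,
  selectorExists: (selector: string) => boolean,
  inferSelector: (instruction: string) => string,
): HealResult {
  const cached = cache.get(instruction);
  if (cached !== undefined && selectorExists(cached)) {
    return { selector: cached, healed: false }; // cache hit, replay as-is
  }
  // Cache miss or stale selector: fall back to inference and update the cache.
  const fresh = inferSelector(instruction);
  cache.set(instruction, fresh);
  return { selector: fresh, healed: cached !== undefined };
}

// The page now exposes data-id='submit-form' instead of data-id='submit'.
const pageSelectors = new Set(["button[data-id='submit-form']"]);
const selectorCache: SelectorCache = new Map([
  ["click the submit button", "button[data-id='submit']"],
]);
const exists = (selector: string) => pageSelectors.has(selector);
const infer = () => "button[data-id='submit-form']";

const firstRun = resolveSelector("click the submit button", selectorCache, exists, infer);
const secondRun = resolveSelector("click the submit button", selectorCache, exists, infer);

console.log(firstRun.healed); // true: the stale selector was replaced
console.log(secondRun.healed); // false: the updated entry replays directly
```

The first run pays for one inference; every run after it is a plain cache hit again.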

Structured Extraction

Use Zod schemas to make the shape of extracted data deterministic:

import { z } from "zod";

const productSchema = z.object({
  title: z.string(),
  price: z.number(),
  inStock: z.boolean(),
  rating: z.number().optional(),
});

const result = await stagehand.extract(
  "extract product details",
  { schema: productSchema }
);

// result.extraction is guaranteed to match the schema
// or an error is thrown

Benefits:

Type-safe outputs
Validation ensures consistency
Fails fast if structure doesn’t match

Agent Caching

Agents can cache entire multi-step workflows:

const agent = stagehand.agent();

// First execution: Agent explores and learns
const result1 = await agent.execute({
  instruction: "search for 'laptop' and add first result to cart",
  maxSteps: 10,
});

// Second execution: Replays exact sequence (deterministic)
const result2 = await agent.execute({
  instruction: "search for 'laptop' and add first result to cart",
  maxSteps: 10,
});

// Both executions take the same actions in the same order

Agent Cache Format

Agent cache stores:

Each step’s type (act, extract, goto, scroll, etc.)
Selectors and actions taken
Variables used
Final result

From AgentCache.ts:352-362:

const entry: CachedAgentEntry = {
  version: 1,
  instruction: context.instruction,
  startUrl: context.startUrl,
  options: context.options,
  configSignature: context.configSignature,
  steps: cloneForCache(steps),
  result: this.pruneAgentResult(result),
  timestamp: new Date().toISOString(),
};
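To illustrate what replaying such an entry means, the sketch below walks a list of recorded steps in order. The `CachedStep` and `CachedEntry` shapes here are simplified, hypothetical versions of the real types, and `replayEntry` is not a Stagehand function.

```typescript
// Hypothetical, simplified shapes; the real cached entry has more fields.
interface CachedStep {
  type: "goto" | "act" | "extract";
  detail: string; // URL, selector, or instruction, depending on the step type
}

interface CachedEntry {
  instruction: string;
  steps: CachedStep[];
}

// Replay steps strictly in recorded order and return a log of what ran.
function replayEntry(
  entry: CachedEntry,
  run: (step: CachedStep) => void,
): string[] {
  const log: string[] = [];
  for (const step of entry.steps) {
    run(step);
    log.push(`${step.type}:${step.detail}`);
  }
  return log;
}

const cartEntry: CachedEntry = {
  instruction: "search for 'laptop' and add first result to cart",
  steps: [
    { type: "goto", detail: "https://example.com" },
    { type: "act", detail: "input[name='q']" },
    { type: "act", detail: "button.add-to-cart" },
  ],
};

const firstReplay = replayEntry(cartEntry, () => {});
const secondReplay = replayEntry(cartEntry, () => {});
console.log(firstReplay.join(" -> "));
console.log(firstReplay.join() === secondReplay.join()); // true: same steps, same order
```

Because the step list is fixed data, two replays cannot diverge the way two fresh LLM runs can.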

Deterministic Actions

Stagehand’s ActHandler supports deterministic action replay:

Action Structure

const actions: Action[] = [
  {
    type: "click",
    selector: "button[data-testid='submit']",
    description: "Click the submit button",
  },
  {
    type: "fill",
    selector: "input[name='email']",
    method: "fill",
    arguments: ["user@example.com"], // placeholder address
    description: "Fill email field",
  },
];

// Replay these actions deterministically
for (const action of actions) {
  await stagehand.act(action);
}

takeDeterministicAction

From ActCache.ts:196-226, Stagehand uses takeDeterministicAction to replay cached actions:

const result = await handler.takeDeterministicAction(
  action,
  page,
  this.domSettleTimeoutMs,
  effectiveClient,
  undefined,
  context.variables
);

This ensures actions replay exactly as cached.

System Prompts for Consistency

Use system prompts to steer the model toward consistent behavior:

const stagehand = new Stagehand({
  env: "LOCAL",
  systemPrompt: `Rules:
  - Always verify actions succeeded before proceeding
  - Use data-testid attributes when available
  - If multiple elements match, choose the first visible one
  - Never click disabled buttons`,
});

const agent = stagehand.agent({
  systemPrompt: `You are a shopping assistant.
  - Always select the lowest-priced option
  - Verify items are in stock before adding to cart
  - Extract prices as numbers without currency symbols`,
});

Variables for Parameterization

Use variables to make workflows deterministic with dynamic inputs:

const searchProduct = async (productName: string) => {
  const agent = stagehand.agent();
  
  return await agent.execute({
    instruction: "search for '{{product}}' and return the first result's price",
    maxSteps: 10,
    variables: { product: productName },
  });
};

// Cache key includes variable names, not values
// So all product searches use the same cached workflow
await searchProduct("laptop"); // Caches workflow
await searchProduct("mouse"); // Reuses cache with different value
await searchProduct("keyboard"); // Reuses cache with different value

Key insight: Cache is keyed by variable names, not values, enabling deterministic workflows with dynamic data.
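One way to picture how a single cached workflow serves many inputs is late substitution: the template is stored once and values are filled in at replay time. The helper below is a hypothetical sketch (`fillTemplate` is not part of Stagehand), and it assumes the `{{name}}` placeholder syntax used in the example above.

```typescript
// Replace each {{name}} placeholder with its value. Unknown names throw, so a
// missing variable fails loudly instead of being typed into the page verbatim.
function fillTemplate(
  template: string,
  variables: Record<string, string>,
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = variables[key];
    if (value === undefined) {
      throw new Error(`Missing variable: ${key}`);
    }
    return value;
  });
}

const cachedInstruction =
  "search for '{{product}}' and return the first result's price";

console.log(fillTemplate(cachedInstruction, { product: "mouse" }));
// search for 'mouse' and return the first result's price
```

The cached template never changes between calls; only the values passed at replay time do.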

Limiting Non-Determinism

Set maxSteps

Prevent unbounded exploration:

const agent = stagehand.agent();

await agent.execute({
  instruction: "find and click the submit button",
  maxSteps: 3, // Limits exploration
});
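The effect of a step budget can be shown with a plain bounded loop. This is a generic sketch, not how the agent is implemented: `runWithStepBudget` and `BoundedRun` are hypothetical names, and the `step` callback stands in for one agent action.

```typescript
interface BoundedRun {
  completed: boolean;
  stepsTaken: number;
}

// Run `step` until it reports completion or the step budget is exhausted.
function runWithStepBudget(
  maxSteps: number,
  step: (index: number) => boolean,
): BoundedRun {
  for (let i = 0; i < maxSteps; i++) {
    if (step(i)) {
      return { completed: true, stepsTaken: i + 1 };
    }
  }
  return { completed: false, stepsTaken: maxSteps };
}

// A task that needs five steps finishes under a budget of 10 but not under 3.
const needsFive = (index: number) => index === 4;

console.log(runWithStepBudget(10, needsFive)); // { completed: true, stepsTaken: 5 }
console.log(runWithStepBudget(3, needsFive)); // { completed: false, stepsTaken: 3 }
```

A tight budget turns open-ended exploration into a predictable failure, which is easier to detect and retry than a run that wanders.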

Use Specific Instructions

// ❌ Non-deterministic: agent may explore different paths
await agent.execute({
  instruction: "buy something",
  maxSteps: 50,
});

// ✅ More deterministic: clear path
await agent.execute({
  instruction: "click the 'Buy Now' button for the first product",
  maxSteps: 5,
});

Timeouts and Error Handling

Set explicit timeouts so that failures occur at a predictable point and can be handled the same way every time:

try {
  await stagehand.act("click submit button", {
    timeout: 10_000, // Consistent timeout
  });
} catch (error) {
  if (error instanceof ActTimeoutError) {
    console.log("Timed out after 10 seconds");
    // Handle timeout consistently
  }
  throw error;
}

Timeout Error Types

From sdkErrors.ts:334-359:

export class TimeoutError extends StagehandError {
  constructor(operation: string, timeoutMs: number) {
    super(`${operation} timed out after ${timeoutMs}ms`);
  }
}

export class ActTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("act()", timeoutMs);
    this.name = "ActTimeoutError";
  }
}

export class ExtractTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("extract()", timeoutMs);
    this.name = "ExtractTimeoutError";
  }
}

export class ObserveTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("observe()", timeoutMs);
    this.name = "ObserveTimeoutError";
  }
}

Testing Determinism

Replay Test

import { test, expect } from "@playwright/test";
import { Stagehand } from "@browserbasehq/stagehand";

test("workflow is deterministic", async () => {
  const stagehand = new Stagehand({
    env: "LOCAL",
    cacheDir: "./test-cache",
  });
  await stagehand.init();
  
  const page = stagehand.context.pages()[0];
  await page.goto("https://example.com");
  
  // Run workflow twice
  const result1 = await stagehand.extract("get product title");
  const result2 = await stagehand.extract("get product title");
  
  // Results should be identical
  expect(result1.extraction).toEqual(result2.extraction);
  expect(result2.metadata?.cacheHit).toBe(true);
  
  await stagehand.close();
});

Verify Cache Usage

const result = await stagehand.act("click button");

if (result.metadata?.cacheHit) {
  console.log("✓ Using cached action (deterministic)");
  console.log("Cache timestamp:", result.metadata.cacheTimestamp);
} else {
  console.log("⚠ New LLM inference (non-deterministic)");
}
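When a workflow has many steps, it can help to count hits and misses in one place. The helper below is a hypothetical sketch: `summarizeCacheUsage` and `RunResult` are not Stagehand APIs, and the result shape simply mirrors the `metadata?.cacheHit` field used above.

```typescript
// Assumed shape, mirroring the `metadata?.cacheHit` field used above.
interface RunResult {
  metadata?: { cacheHit?: boolean };
}

// Count results that were served from cache; anything else counts as a miss.
function summarizeCacheUsage(results: RunResult[]): {
  hits: number;
  misses: number;
} {
  const hits = results.filter((r) => r.metadata?.cacheHit === true).length;
  return { hits, misses: results.length - hits };
}

const cacheSummary = summarizeCacheUsage([
  { metadata: { cacheHit: true } },
  { metadata: { cacheHit: false } },
  {}, // no metadata: treated as a miss
]);

console.log(cacheSummary); // { hits: 1, misses: 2 }
```

A test suite could assert `misses === 0` on the second run to catch workflows that silently fall back to inference.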

Pre-Warming Cache

For production determinism, pre-warm the cache:

// development.ts
const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./production-cache",
});
await stagehand.init();

// Run all production workflows once
await runLoginWorkflow(stagehand);
await runSearchWorkflow(stagehand);
await runCheckoutWorkflow(stagehand);

await stagehand.close();

// Deploy ./production-cache to production
// Workflows that hit the cache replay without new LLM inference

Configuration Signature

The agent cache includes a configuration signature to ensure consistency. From AgentCache.ts:107-140:

buildConfigSignature(agentOptions?: AgentConfig): string {
  const toolKeys = agentOptions?.tools
    ? Object.keys(agentOptions.tools).sort()
    : undefined;
  const integrationSignatures = agentOptions?.integrations
    ? agentOptions.integrations.map((integration) =>
        typeof integration === "string" ? integration : "client",
      )
    : undefined;
  const serializedModel = this.serializeAgentModelForCache(
    agentOptions?.model,
  );
  const serializedExecutionModel = this.serializeAgentModelForCache(
    agentOptions?.executionModel,
  );

  const isCuaMode =
    agentOptions?.mode !== undefined
      ? agentOptions.mode === "cua"
      : agentOptions?.cua === true;

  return JSON.stringify({
    v3Model: this.getBaseModelName(),
    systemPrompt: this.getSystemPrompt() ?? "",
    agent: {
      cua: isCuaMode,
      model: serializedModel ?? null,
      executionModel: isCuaMode ? null : serializedExecutionModel,
      systemPrompt: agentOptions?.systemPrompt ?? null,
      toolKeys,
      integrations: integrationSignatures,
    },
  });
}

Changing the model, tools, or system prompts changes the signature, so entries cached under the old configuration no longer match and are not replayed.
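A trimmed-down sketch shows the two properties that matter: tool order does not affect the signature, while a model change does. `SketchAgentConfig` and `sketchSignature` are hypothetical and cover far fewer fields than the real `buildConfigSignature`.

```typescript
// Hypothetical, trimmed-down config; the real signature covers more fields.
interface SketchAgentConfig {
  model: string;
  systemPrompt?: string;
  tools?: Record<string, unknown>;
}

function sketchSignature(config: SketchAgentConfig): string {
  return JSON.stringify({
    model: config.model,
    systemPrompt: config.systemPrompt ?? null,
    // Sort tool names so declaration order does not affect the signature.
    toolKeys: Object.keys(config.tools ?? {}).sort(),
  });
}

const baseSignature = sketchSignature({
  model: "model-a",
  tools: { search: {}, cart: {} },
});
const reorderedSignature = sketchSignature({
  model: "model-a",
  tools: { cart: {}, search: {} },
});
const newModelSignature = sketchSignature({
  model: "model-b",
  tools: { search: {}, cart: {} },
});

console.log(baseSignature === reorderedSignature); // true: tool order is normalized
console.log(baseSignature === newModelSignature); // false: the model changed
```

Sorting before serializing is what keeps harmless refactors, such as reordering a tools object, from discarding a warm cache.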

Best Practices

Always enable caching: set cacheDir for all workflows.
Use structured schemas: define Zod schemas for extractions.
Write specific instructions: reduce ambiguity and exploration.
Limit maxSteps: prevent unbounded agent exploration.
Use variables for dynamic data: cache workflows, parameterize values.
Set consistent timeouts: keep error handling predictable.
Test with replays: verify cache hits and consistency.
Pre-warm the production cache: deploy with cached workflows.
