
Overview

Deterministic agents produce consistent, repeatable results. While LLMs are inherently probabilistic, Stagehand provides tools to make agents more predictable and reliable.

Use Caching for Determinism

Caching is the most powerful tool for deterministic behavior.

How Caching Works

const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./stagehand-cache", // Enable caching
});

await stagehand.init();

// First execution: LLM determines actions
const result1 = await stagehand.act("click the login button");

// Second execution: Replays cached actions (deterministic)
const result2 = await stagehand.act("click the login button");

// result1 and result2 will use the same selector

Cache Keys

Caching is based on:
  • Instruction text
  • Page URL
  • Variable keys (if using variables)
On a cache hit:
  • Actions replay with the same selectors
  • No tokens are consumed
  • Behavior is consistent (deterministic)
On a cache miss:
  • Stagehand runs a new LLM inference
  • The inference may produce different selectors
  • Behavior is non-deterministic until the result is cached
For production workflows, pre-populate the cache in development and deploy with the cache directory.
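
To make the three key ingredients concrete, here is a minimal sketch of how such a key could be derived. The `buildCacheKey` helper and its JSON layout are hypothetical and written only for illustration; Stagehand's actual key format may differ. The point it shows is that variable names, not values, take part in the key.

```typescript
// Hypothetical helper for illustration; not Stagehand's real key format.
interface CacheKeyInput {
  instruction: string;
  url: string;
  variables?: Record<string, string>;
}

function buildCacheKey(input: CacheKeyInput): string {
  // Only variable names enter the key, sorted so insertion order is irrelevant.
  const variableKeys = Object.keys(input.variables ?? {}).sort();
  return JSON.stringify({
    instruction: input.instruction,
    url: input.url,
    variableKeys,
  });
}

const laptopKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com/",
  variables: { product: "laptop" },
});
const mouseKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com/",
  variables: { product: "mouse" },
});
const otherPageKey = buildCacheKey({
  instruction: "search for '{{product}}'",
  url: "https://example.com/deals",
  variables: { product: "laptop" },
});

console.log(laptopKey === mouseKey); // true: same instruction, URL, and variable names
console.log(laptopKey === otherPageKey); // false: the URL differs
```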

Self-Healing Determinism

Stagehand’s cache includes self-healing:
const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./cache",
  selfHeal: true, // Default: enabled
});
How it works:
  1. The cache contains the action: click button[data-id='submit']
  2. The DOM changes: the button now has data-id='submit-form'
  3. Stagehand detects the failure, re-runs inference, and finds the new selector
  4. The cache updates automatically with the new selector
  5. Future runs use the updated cache
Result: Determinism that adapts to changes.
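
The five steps above can be sketched as a small in-memory model. Everything here is an assumption made for illustration: `replayWithSelfHeal`, the `selectorExists` callback (standing in for a DOM lookup), and the `inferSelector` callback (standing in for LLM inference) are hypothetical names, not Stagehand internals.

```typescript
// Hypothetical in-memory model of self-healing replay; not Stagehand internals.
type Selector = string;

function replayWithSelfHeal(
  cache: Map<string, Selector>,
  instruction: string,
  selectorExists: (selector: Selector) => boolean, // stands in for a DOM lookup
  inferSelector: () => Selector, // stands in for LLM inference
): { selector: Selector; usedInference: boolean } {
  const cached = cache.get(instruction);
  if (cached !== undefined && selectorExists(cached)) {
    // Cache hit with a selector that still resolves: no inference needed.
    return { selector: cached, usedInference: false };
  }
  // Missing or stale entry: infer again and write the result back.
  const fresh = inferSelector();
  cache.set(instruction, fresh);
  return { selector: fresh, usedInference: true };
}

// The cache still holds the old selector, but the page now uses a new one.
const selfHealCache = new Map<string, Selector>([
  ["click submit", "button[data-id='submit']"],
]);
const currentDom = new Set<Selector>(["button[data-id='submit-form']"]);
const existsInDom = (selector: Selector): boolean => currentDom.has(selector);

const firstRun = replayWithSelfHeal(
  selfHealCache,
  "click submit",
  existsInDom,
  () => "button[data-id='submit-form']",
);
const secondRun = replayWithSelfHeal(selfHealCache, "click submit", existsInDom, () => {
  throw new Error("inference should not run on a healed cache");
});

console.log(firstRun.usedInference, secondRun.usedInference); // true false
```

The second run passes an inference callback that throws, which demonstrates that the healed entry is replayed without another inference call.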

Structured Extraction

Use Zod schemas for deterministic data extraction:
import { z } from "zod";

const productSchema = z.object({
  title: z.string(),
  price: z.number(),
  inStock: z.boolean(),
  rating: z.number().optional(),
});

const result = await stagehand.extract(
  "extract product details",
  { schema: productSchema }
);

// result.extraction is guaranteed to match the schema
// or an error is thrown
Benefits:
  • Type-safe outputs
  • Validation ensures consistency
  • Fails fast if structure doesn’t match

Agent Caching

Agents can cache entire multi-step workflows:
const agent = stagehand.agent();

// First execution: Agent explores and learns
const result1 = await agent.execute({
  instruction: "search for 'laptop' and add first result to cart",
  maxSteps: 10,
});

// Second execution: Replays exact sequence (deterministic)
const result2 = await agent.execute({
  instruction: "search for 'laptop' and add first result to cart",
  maxSteps: 10,
});

// Both executions take the same actions in the same order

Agent Cache Format

The agent cache stores:
  • Each step’s type (act, extract, goto, scroll, etc.)
  • Selectors and actions taken
  • Variables used
  • Final result
From AgentCache.ts:352-362:
const entry: CachedAgentEntry = {
  version: 1,
  instruction: context.instruction,
  startUrl: context.startUrl,
  options: context.options,
  configSignature: context.configSignature,
  steps: cloneForCache(steps),
  result: this.pruneAgentResult(result),
  timestamp: new Date().toISOString(),
};
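
To illustrate why storing ordered steps makes a second run repeatable, the sketch below replays a recorded step list against a set of handlers. The `CachedStep` shapes, `StepHandlers`, and `replaySteps` are hypothetical simplifications; the real step types behind `CachedAgentEntry` are not shown in the excerpt above and may differ.

```typescript
// Hypothetical step shapes for illustration; Stagehand's real step types may differ.
type CachedStep =
  | { type: "goto"; url: string }
  | { type: "act"; selector: string; method: string; args: string[] }
  | { type: "scroll"; deltaY: number };

interface StepHandlers {
  goto(url: string): void;
  act(selector: string, method: string, args: string[]): void;
  scroll(deltaY: number): void;
}

// Replays steps strictly in recorded order, without consulting an LLM.
function replaySteps(steps: readonly CachedStep[], handlers: StepHandlers): void {
  for (const step of steps) {
    switch (step.type) {
      case "goto":
        handlers.goto(step.url);
        break;
      case "act":
        handlers.act(step.selector, step.method, step.args);
        break;
      case "scroll":
        handlers.scroll(step.deltaY);
        break;
    }
  }
}

// A recording handler stands in for a real browser page.
const replayLog: string[] = [];
const recordingHandlers: StepHandlers = {
  goto: (url) => {
    replayLog.push(`goto ${url}`);
  },
  act: (selector, method, args) => {
    replayLog.push(`${method} ${selector} ${JSON.stringify(args)}`);
  },
  scroll: (deltaY) => {
    replayLog.push(`scroll ${deltaY}`);
  },
};

const recordedSteps: CachedStep[] = [
  { type: "goto", url: "https://example.com" },
  { type: "act", selector: "input[name='q']", method: "fill", args: ["laptop"] },
  { type: "scroll", deltaY: 400 },
];

replaySteps(recordedSteps, recordingHandlers);
console.log(replayLog.length); // 3
```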

Deterministic Actions

Stagehand’s ActHandler supports deterministic action replay:

Action Structure

const actions: Action[] = [
  {
    type: "click",
    selector: "button[data-testid='submit']",
    description: "Click the submit button",
  },
  {
    type: "fill",
    selector: "input[name='email']",
    method: "fill",
    arguments: ["user@example.com"],
    description: "Fill email field",
  },
];

// Replay these actions deterministically
for (const action of actions) {
  await stagehand.act(action);
}

takeDeterministicAction

From ActCache.ts:196-226, Stagehand uses takeDeterministicAction to replay cached actions:
const result = await handler.takeDeterministicAction(
  action,
  page,
  this.domSettleTimeoutMs,
  effectiveClient,
  undefined,
  context.variables
);
This ensures actions replay exactly as cached.

System Prompts for Consistency

Use system prompts to enforce consistent behavior:
const stagehand = new Stagehand({
  env: "LOCAL",
  systemPrompt: `Rules:
  - Always verify actions succeeded before proceeding
  - Use data-testid attributes when available
  - If multiple elements match, choose the first visible one
  - Never click disabled buttons`,
});

const agent = stagehand.agent({
  systemPrompt: `You are a shopping assistant.
  - Always select the lowest-priced option
  - Verify items are in stock before adding to cart
  - Extract prices as numbers without currency symbols`,
});

Variables for Parameterization

Use variables to make workflows deterministic with dynamic inputs:
const searchProduct = async (productName: string) => {
  const agent = stagehand.agent();
  
  return await agent.execute({
    instruction: "search for '{{product}}' and return the first result's price",
    maxSteps: 10,
    variables: { product: productName },
  });
};

// Cache key includes variable names, not values
// So all product searches use the same cached workflow
await searchProduct("laptop"); // Caches workflow
await searchProduct("mouse"); // Reuses cache with different value
await searchProduct("keyboard"); // Reuses cache with different value
Key insight: Cache is keyed by variable names, not values, enabling deterministic workflows with dynamic data.
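
One way to picture this is a cached instruction template whose placeholders are filled in at replay time. The `fillTemplate` helper below is a hypothetical sketch of that substitution step for `{{name}}` placeholders; it is not Stagehand's implementation, and it fails fast on a missing variable rather than sending a half-filled instruction.

```typescript
// Hypothetical substitution helper for {{name}} placeholders; not Stagehand's code.
function fillTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    // Use an own-property check so inherited keys like "toString" do not match.
    const value = Object.prototype.hasOwnProperty.call(variables, name)
      ? variables[name]
      : undefined;
    if (value === undefined) {
      throw new Error(`Missing variable: ${name}`);
    }
    return value;
  });
}

const cachedTemplate = "search for '{{product}}' and return the first result's price";

console.log(fillTemplate(cachedTemplate, { product: "mouse" }));
// search for 'mouse' and return the first result's price
```

The template string stays identical across calls, so it can serve as a stable lookup value while only the substituted text changes.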

Limiting Non-Determinism

Set maxSteps

Prevent unbounded exploration:
const agent = stagehand.agent();

await agent.execute({
  instruction: "find and click the submit button",
  maxSteps: 3, // Limits exploration
});
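
Conceptually, a step limit turns an open-ended loop into a bounded one. The sketch below models that idea with a hypothetical `runWithStepBudget` function, where `takeStep` stands in for one LLM-driven agent step; it illustrates the bounding behavior only and is not how Stagehand implements `maxSteps`.

```typescript
// Hypothetical agent loop; `takeStep` stands in for one LLM-driven step.
function runWithStepBudget(
  maxSteps: number,
  takeStep: (stepIndex: number) => boolean, // returns true once the goal is reached
): { completed: boolean; stepsUsed: number } {
  for (let i = 0; i < maxSteps; i++) {
    if (takeStep(i)) {
      return { completed: true, stepsUsed: i + 1 };
    }
  }
  // Budget exhausted before the goal was reached.
  return { completed: false, stepsUsed: maxSteps };
}

// A task that would need 5 steps is cut off after 3.
const boundedRun = runWithStepBudget(3, (i) => i === 4);
console.log(boundedRun); // { completed: false, stepsUsed: 3 }

// A task that needs 2 steps finishes within the same budget.
const finishedRun = runWithStepBudget(3, (i) => i === 1);
console.log(finishedRun); // { completed: true, stepsUsed: 2 }
```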

Use Specific Instructions

// ❌ Non-deterministic: agent may explore different paths
await agent.execute({
  instruction: "buy something",
  maxSteps: 50,
});

// ✅ More deterministic: clear path
await agent.execute({
  instruction: "click the 'Buy Now' button for the first product",
  maxSteps: 5,
});

Timeouts and Error Handling

Use a fixed timeout and handle timeout errors explicitly so that failures behave the same way on every run:
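// Assumption: ActTimeoutError is exported by the Stagehand package.
import { ActTimeoutError } from "@browserbasehq/stagehand";

try {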
  await stagehand.act("click submit button", {
    timeout: 10_000, // Consistent timeout
  });
} catch (error) {
  if (error instanceof ActTimeoutError) {
    console.log("Timed out after 10 seconds");
    // Handle timeout consistently
  }
  throw error;
}

Timeout Error Types

From sdkErrors.ts:334-359:
export class TimeoutError extends StagehandError {
  constructor(operation: string, timeoutMs: number) {
    super(`${operation} timed out after ${timeoutMs}ms`);
  }
}

export class ActTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("act()", timeoutMs);
    this.name = "ActTimeoutError";
  }
}

export class ExtractTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("extract()", timeoutMs);
    this.name = "ExtractTimeoutError";
  }
}

export class ObserveTimeoutError extends TimeoutError {
  constructor(timeoutMs: number) {
    super("observe()", timeoutMs);
    this.name = "ObserveTimeoutError";
  }
}

Testing Determinism

Replay Test

import { test, expect } from "@playwright/test";
import { Stagehand } from "@browserbasehq/stagehand";

test("workflow is deterministic", async () => {
  const stagehand = new Stagehand({
    env: "LOCAL",
    cacheDir: "./test-cache",
  });
  await stagehand.init();
  
  const page = stagehand.context.pages()[0];
  await page.goto("https://example.com");
  
  // Run workflow twice
  const result1 = await stagehand.extract("get product title");
  const result2 = await stagehand.extract("get product title");
  
  // Results should be identical
  expect(result1.extraction).toEqual(result2.extraction);
  expect(result2.metadata?.cacheHit).toBe(true);
  
  await stagehand.close();
});

Verify Cache Usage

const result = await stagehand.act("click button");

if (result.metadata?.cacheHit) {
  console.log("✓ Using cached action (deterministic)");
  console.log("Cache timestamp:", result.metadata.cacheTimestamp);
} else {
  console.log("⚠ New LLM inference (non-deterministic)");
}

Pre-Warming Cache

For production determinism, pre-warm the cache:
// development.ts
const stagehand = new Stagehand({
  env: "LOCAL",
  cacheDir: "./production-cache",
});
await stagehand.init();

// Run all production workflows once
await runLoginWorkflow(stagehand);
await runSearchWorkflow(stagehand);
await runCheckoutWorkflow(stagehand);

await stagehand.close();

// Deploy ./production-cache to production
// All workflows will now be deterministic

Configuration Signature

The agent cache includes a configuration signature to ensure consistency.
From AgentCache.ts:107-140:
buildConfigSignature(agentOptions?: AgentConfig): string {
  const toolKeys = agentOptions?.tools
    ? Object.keys(agentOptions.tools).sort()
    : undefined;
  const integrationSignatures = agentOptions?.integrations
    ? agentOptions.integrations.map((integration) =>
        typeof integration === "string" ? integration : "client",
      )
    : undefined;
  const serializedModel = this.serializeAgentModelForCache(
    agentOptions?.model,
  );
  const serializedExecutionModel = this.serializeAgentModelForCache(
    agentOptions?.executionModel,
  );

  const isCuaMode =
    agentOptions?.mode !== undefined
      ? agentOptions.mode === "cua"
      : agentOptions?.cua === true;

  return JSON.stringify({
    v3Model: this.getBaseModelName(),
    systemPrompt: this.getSystemPrompt() ?? "",
    agent: {
      cua: isCuaMode,
      model: serializedModel ?? null,
      executionModel: isCuaMode ? null : serializedExecutionModel,
      systemPrompt: agentOptions?.systemPrompt ?? null,
      toolKeys,
      integrations: integrationSignatures,
    },
  });
}
Changing the model, tools, or system prompt changes the signature and therefore invalidates the cache, so cached steps are only replayed under the configuration that produced them.
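
A reduced version of the same idea shows the two properties that matter: tool order does not affect the signature because the keys are sorted, while a different model produces a different signature. `AgentConfigSketch` and `buildSignatureSketch` are hypothetical simplifications of the excerpt above, not the real `buildConfigSignature`.

```typescript
// Simplified, hypothetical signature builder modeled loosely on the excerpt above.
interface AgentConfigSketch {
  model?: string;
  systemPrompt?: string;
  tools?: Record<string, unknown>;
}

function buildSignatureSketch(config: AgentConfigSketch): string {
  return JSON.stringify({
    model: config.model ?? null,
    systemPrompt: config.systemPrompt ?? null,
    // Sorting makes the signature independent of tool declaration order.
    toolKeys: config.tools ? Object.keys(config.tools).sort() : null,
  });
}

const baselineSignature = buildSignatureSketch({
  model: "model-a",
  tools: { search: {}, checkout: {} },
});
const reorderedSignature = buildSignatureSketch({
  model: "model-a",
  tools: { checkout: {}, search: {} },
});
const newModelSignature = buildSignatureSketch({
  model: "model-b",
  tools: { search: {}, checkout: {} },
});

console.log(baselineSignature === reorderedSignature); // true: same config, different tool order
console.log(baselineSignature === newModelSignature); // false: the model changed
```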

Best Practices

  1. Always enable caching: set cacheDir for all workflows.
  2. Use structured schemas: define Zod schemas for extractions.
  3. Write specific instructions: reduce ambiguity and exploration.
  4. Limit maxSteps: prevent unbounded agent exploration.
  5. Use variables for dynamic data: cache workflows and parameterize values.
  6. Set consistent timeouts: keep error handling predictable.
  7. Test with replays: verify cache hits and consistency.
  8. Pre-warm the production cache: deploy with cached workflows.
