Metrics Overview
Access comprehensive metrics for all Stagehand operations:

import { Stagehand } from '@browserbasehq/stagehand';
const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
// Perform operations
await stagehand.act("click the login button");
const data = await stagehand.extract("get product details", schema);
// Get metrics
const metrics = await stagehand.metrics;
console.log(metrics);
StagehandMetrics Interface
The metrics object provides detailed usage statistics:
interface StagehandMetrics {
// Act operation metrics
actPromptTokens: number; // Input tokens for act()
actCompletionTokens: number; // Output tokens for act()
actReasoningTokens: number; // Reasoning tokens for act()
actCachedInputTokens: number; // Cached input tokens for act()
actInferenceTimeMs: number; // Total inference time for act()
// Extract operation metrics
extractPromptTokens: number;
extractCompletionTokens: number;
extractReasoningTokens: number;
extractCachedInputTokens: number;
extractInferenceTimeMs: number;
// Observe operation metrics
observePromptTokens: number;
observeCompletionTokens: number;
observeReasoningTokens: number;
observeCachedInputTokens: number;
observeInferenceTimeMs: number;
// Agent operation metrics
agentPromptTokens: number;
agentCompletionTokens: number;
agentReasoningTokens: number;
agentCachedInputTokens: number;
agentInferenceTimeMs: number;
// Totals across all operations
totalPromptTokens: number;
totalCompletionTokens: number;
totalReasoningTokens: number;
totalCachedInputTokens: number;
totalInferenceTimeMs: number;
}
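Since the total* fields are documented as sums across the four operation types, a small sanity check can verify a metrics object programmatically. The helper below (`totalsConsistent`, a hypothetical utility, not part of Stagehand) assumes exactly that relationship:

```typescript
// Hypothetical helper: verify that each total* field equals the sum of
// its per-operation counterparts (act, extract, observe, agent).
type Metrics = Record<string, number>;

function totalsConsistent(m: Metrics): boolean {
  const ops = ["act", "extract", "observe", "agent"];
  const fields = [
    "PromptTokens",
    "CompletionTokens",
    "ReasoningTokens",
    "CachedInputTokens",
    "InferenceTimeMs",
  ];
  return fields.every((field) => {
    // Missing per-operation fields count as zero.
    const sum = ops.reduce((acc, op) => acc + (m[`${op}${field}`] ?? 0), 0);
    return sum === m[`total${field}`];
  });
}
```

A check like this is useful in tests or when wiring metrics into a dashboard, where a silent mismatch would skew reported totals.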
Token Usage Tracking
Monitor token consumption for cost optimization:

const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
// Perform multiple operations
await stagehand.act("fill in the form");
await stagehand.act("click submit");
const result = await stagehand.extract("extract form data", schema);
// Check token usage
const metrics = await stagehand.metrics;
console.log(`Total input tokens: ${metrics.totalPromptTokens}`);
console.log(`Total output tokens: ${metrics.totalCompletionTokens}`);
console.log(`Cached tokens: ${metrics.totalCachedInputTokens}`);
console.log(`Total inference time: ${metrics.totalInferenceTimeMs}ms`);
Per-Operation Metrics
Track metrics for specific operation types:

const metrics = await stagehand.metrics;
// Act operation statistics
console.log('Act Operations:');
console.log(` Input tokens: ${metrics.actPromptTokens}`);
console.log(` Output tokens: ${metrics.actCompletionTokens}`);
console.log(` Inference time: ${metrics.actInferenceTimeMs}ms`);
// Extract operation statistics
console.log('Extract Operations:');
console.log(` Input tokens: ${metrics.extractPromptTokens}`);
console.log(` Output tokens: ${metrics.extractCompletionTokens}`);
console.log(` Inference time: ${metrics.extractInferenceTimeMs}ms`);
// Agent operation statistics
console.log('Agent Operations:');
console.log(` Input tokens: ${metrics.agentPromptTokens}`);
console.log(` Output tokens: ${metrics.agentCompletionTokens}`);
console.log(` Reasoning tokens: ${metrics.agentReasoningTokens}`);
console.log(` Inference time: ${metrics.agentInferenceTimeMs}ms`);
Cost Calculation
Calculate costs based on token usage:

const metrics = await stagehand.metrics;
// GPT-4o pricing (example rates)
const INPUT_COST_PER_1M = 5.00; // $5 per 1M input tokens
const OUTPUT_COST_PER_1M = 15.00; // $15 per 1M output tokens
const CACHED_COST_PER_1M = 2.50; // $2.50 per 1M cached tokens
const inputCost = (metrics.totalPromptTokens / 1_000_000) * INPUT_COST_PER_1M;
const outputCost = (metrics.totalCompletionTokens / 1_000_000) * OUTPUT_COST_PER_1M;
const cachedCost = (metrics.totalCachedInputTokens / 1_000_000) * CACHED_COST_PER_1M;
const totalCost = inputCost + outputCost + cachedCost;
console.log(`Total cost: $${totalCost.toFixed(4)}`);
console.log(` Input tokens cost: $${inputCost.toFixed(4)}`);
console.log(` Output tokens cost: $${outputCost.toFixed(4)}`);
console.log(` Cached tokens cost: $${cachedCost.toFixed(4)}`);
History Tracking
Access the complete history of operations:

const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
// Perform operations
await stagehand.act("click login");
await stagehand.extract("get user data", schema);
// Get operation history
const history = await stagehand.history;
for (const entry of history) {
console.log(`Operation: ${entry.method}`);
console.log(`Timestamp: ${entry.timestamp}`);
console.log(`Duration: ${entry.endTime - entry.timestamp}ms`);
if (entry.tokenUsage) {
console.log(`Tokens used: ${entry.tokenUsage.inputTokens + entry.tokenUsage.outputTokens}`);
console.log(`Cost: $${entry.tokenUsage.cost}`);
}
}
HistoryEntry Interface
interface HistoryEntry {
method: string; // Operation name (act, extract, observe, etc.)
parameters: Record<string, unknown>; // Operation parameters
result: Record<string, unknown>; // Operation result
timestamp: number; // Start timestamp (ms since epoch)
endTime?: number; // End timestamp (ms since epoch)
tokenUsage?: { // Token usage for this operation
inputTokens?: number;
outputTokens?: number;
timeMs?: number;
cost?: number;
};
}
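Because endTime and tokenUsage are optional on HistoryEntry, aggregation code should guard against missing fields rather than assume every entry is complete. A minimal sketch (`summarizeHistory` is a hypothetical helper, not part of Stagehand):

```typescript
// Hypothetical helper: aggregate duration, tokens, and cost across
// history entries, skipping entries that are still running (no endTime)
// or that carry no token usage.
interface HistoryEntryLike {
  method: string;
  timestamp: number;
  endTime?: number;
  tokenUsage?: { inputTokens?: number; outputTokens?: number; cost?: number };
}

function summarizeHistory(history: HistoryEntryLike[]) {
  let totalDurationMs = 0;
  let totalTokens = 0;
  let totalCost = 0;
  for (const entry of history) {
    if (entry.endTime !== undefined) {
      totalDurationMs += entry.endTime - entry.timestamp;
    }
    if (entry.tokenUsage) {
      totalTokens +=
        (entry.tokenUsage.inputTokens ?? 0) + (entry.tokenUsage.outputTokens ?? 0);
      totalCost += entry.tokenUsage.cost ?? 0;
    }
  }
  return { totalDurationMs, totalTokens, totalCost };
}
```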
Replay Metrics
Retrieve session replay data with detailed action-level metrics:

const stagehand = new Stagehand({
env: "BROWSERBASE",
apiKey: process.env.BROWSERBASE_API_KEY,
projectId: process.env.BROWSERBASE_PROJECT_ID,
});
await stagehand.init();
// Perform operations
await stagehand.act("navigate and login");
// Get replay metrics (Browserbase only)
const sessionId = stagehand.browserbaseSessionID;
// Access via Browserbase API or internal methods
// Replay data includes page URLs, actions, timestamps, and token usage
Custom Monitoring Integration
Integrate Stagehand metrics with your monitoring stack:

Datadog Example
import { StatsD } from 'hot-shots';
const dogstatsd = new StatsD();
const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
// Perform operations
await stagehand.act("click button");
// Send metrics to Datadog
const metrics = await stagehand.metrics;
dogstatsd.gauge('stagehand.tokens.input', metrics.totalPromptTokens);
dogstatsd.gauge('stagehand.tokens.output', metrics.totalCompletionTokens);
dogstatsd.gauge('stagehand.tokens.cached', metrics.totalCachedInputTokens);
dogstatsd.timing('stagehand.inference.time', metrics.totalInferenceTimeMs);
Prometheus Example
import client from 'prom-client';
const tokenCounter = new client.Counter({
name: 'stagehand_tokens_total',
help: 'Total tokens used by Stagehand',
labelNames: ['type', 'operation'],
});
const inferenceHistogram = new client.Histogram({
name: 'stagehand_inference_duration_ms',
help: 'LLM inference duration in milliseconds',
labelNames: ['operation'],
});
const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
await stagehand.act("perform action");
const metrics = await stagehand.metrics;
// Track metrics in Prometheus
tokenCounter.inc({ type: 'input', operation: 'act' }, metrics.actPromptTokens);
tokenCounter.inc({ type: 'output', operation: 'act' }, metrics.actCompletionTokens);
inferenceHistogram.observe({ operation: 'act' }, metrics.actInferenceTimeMs);
CloudWatch Example
import { CloudWatch } from '@aws-sdk/client-cloudwatch';
const cloudwatch = new CloudWatch({ region: 'us-east-1' });
const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
await stagehand.act("perform action");
const metrics = await stagehand.metrics;
// Send metrics to CloudWatch
await cloudwatch.putMetricData({
Namespace: 'Stagehand',
MetricData: [
{
MetricName: 'InputTokens',
Value: metrics.totalPromptTokens,
Unit: 'Count',
},
{
MetricName: 'OutputTokens',
Value: metrics.totalCompletionTokens,
Unit: 'Count',
},
{
MetricName: 'InferenceTime',
Value: metrics.totalInferenceTimeMs,
Unit: 'Milliseconds',
},
],
});
Monitoring Best Practices
Track Cost Trends
Monitor token usage over time to identify cost optimization opportunities.
const metrics = await stagehand.metrics;
const cost = calculateCost(metrics);
logToMonitoring({ cost, timestamp: Date.now() });
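The calculateCost helper used above is not defined by Stagehand; one possible sketch, reusing the illustrative GPT-4o rates from the Cost Calculation section (substitute your model's actual pricing):

```typescript
// Sketch of a calculateCost helper. Rates are examples only.
interface TokenTotals {
  totalPromptTokens: number;
  totalCompletionTokens: number;
  totalCachedInputTokens: number;
}

function calculateCost(
  m: TokenTotals,
  rates = { inputPer1M: 5.0, outputPer1M: 15.0, cachedPer1M: 2.5 },
): number {
  return (
    (m.totalPromptTokens / 1_000_000) * rates.inputPer1M +
    (m.totalCompletionTokens / 1_000_000) * rates.outputPer1M +
    (m.totalCachedInputTokens / 1_000_000) * rates.cachedPer1M
  );
}
```

Passing the rates as a parameter keeps the helper model-agnostic, so the same function can price runs against different providers.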
Set Cost Alerts
Configure alerts when token usage exceeds thresholds.
if (metrics.totalPromptTokens > 1_000_000) {
// Trigger your alerting integration here (PagerDuty, Slack, etc.)
console.warn('High token usage detected');
}
Performance Monitoring
Track inference time to identify slow operations.
if (metrics.totalInferenceTimeMs > 10000) {
console.warn('Slow inference detected');
}
Cache Efficiency
Monitor cached token usage to measure cache effectiveness.
const cacheHitRate = metrics.totalCachedInputTokens /
(metrics.totalPromptTokens || 1);
console.log(`Cache hit rate: ${(cacheHitRate * 100).toFixed(2)}%`);
Agent Replay Tracking
Track agent execution for replay and debugging:

const stagehand = new Stagehand({
env: "LOCAL",
model: "gpt-4o",
});
await stagehand.init();
const agent = stagehand.agent();
await agent.execute("complete the checkout process");
// Check if agent replay is active
if (stagehand.isAgentReplayActive()) {
console.log('Agent replay is being recorded');
}
// Access history for replay
const history = await stagehand.history;
for (const step of history) {
console.log(`Step: ${step.method}`);
console.log(`Duration: ${step.endTime - step.timestamp}ms`);
}
Debugging with Metrics
Identify expensive operations
Find which operations consume the most tokens:
const metrics = await stagehand.metrics;
const operations = [
{ name: 'act', tokens: metrics.actPromptTokens + metrics.actCompletionTokens },
{ name: 'extract', tokens: metrics.extractPromptTokens + metrics.extractCompletionTokens },
{ name: 'observe', tokens: metrics.observePromptTokens + metrics.observeCompletionTokens },
{ name: 'agent', tokens: metrics.agentPromptTokens + metrics.agentCompletionTokens },
];
operations.sort((a, b) => b.tokens - a.tokens);
console.log('Most expensive operations:', operations);
Measure cache effectiveness
Calculate cache hit rates:
const metrics = await stagehand.metrics;
const totalInput = metrics.totalPromptTokens + metrics.totalCachedInputTokens;
const cacheHitRate = metrics.totalCachedInputTokens / totalInput;
console.log(`Cache hit rate: ${(cacheHitRate * 100).toFixed(2)}%`);
console.log(`Savings: ${metrics.totalCachedInputTokens} tokens`);
Optimize slow operations
Find operations with high inference time:
const history = await stagehand.history;
const slowOps = history.filter(entry =>
(entry.endTime - entry.timestamp) > 5000
);
console.log('Slow operations:', slowOps);
Best Practice: Regularly review metrics in production to identify optimization opportunities and control costs.