Overview

Observatory uses a hierarchical model to organize agent telemetry:
  • Session: A logical grouping of related runs (e.g., a multi-turn conversation)
  • Run: A single end-to-end agent invocation from user prompt to final response
  • Step: An individual LLM call within a run
  • Tool Call: A function/tool execution within a run
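
One way to picture the hierarchy is as nested records. The shapes below are purely illustrative, not the SDK's actual types or wire format:

```typescript
// Illustrative shapes only; field names are for intuition,
// not the Observatory wire format.
interface Step { prompt: string; response: string; model?: string }
interface ToolCall { name: string; args: unknown; result?: unknown }
interface Run {
  runId: string;
  prompt: string;
  response?: string;
  steps: Step[];
  toolCalls: ToolCall[];
}
interface Session { sessionId: string; runs: Run[] }

// A one-run session containing one LLM step and one tool call.
const session: Session = {
  sessionId: "session_123",
  runs: [{
    runId: "run_abc",
    prompt: "What's the weather in SF?",
    response: "It's 72°F and sunny.",
    steps: [{ prompt: "...", response: "...", model: "gpt-4o" }],
    toolCalls: [{ name: "get_weather", args: { city: "SF" } }],
  }],
};
```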

Runs

A run represents a single agent invocation: from receiving a user prompt to generating the final response. Each run captures:
  • User input (prompt)
  • Agent response
  • All LLM steps taken
  • All tool calls executed
  • Metadata and timing information
  • Success or error status

Creating a Run

Builder Pattern

Create and instrument a run as it executes:
import { run } from "@contextcompany/custom";

const r = run();
r.prompt("What's the weather in San Francisco?");

// ... execute your agent logic

r.response("It's 72°F and sunny in San Francisco.");
await r.end();

Factory Pattern

Send a complete run when all data is available:
import { sendRun } from "@contextcompany/custom";

await sendRun({
  prompt: { user_prompt: "What's the weather?" },
  response: "72°F in San Francisco",
  startTime: new Date("2025-01-01T00:00:00Z"),
  endTime: new Date("2025-01-01T00:00:01Z"),
});
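
A common pattern with the factory style is to capture timestamps around the agent call and send everything once it completes. In this sketch, `runAgent` is a stand-in for your agent logic and `sendRun` is stubbed locally so the timing pattern is self-contained:

```typescript
type RunPayload = {
  prompt: { user_prompt: string };
  response: string;
  startTime: Date;
  endTime: Date;
};

// Local stub standing in for the SDK's sendRun, so this sketch runs alone.
const sent: RunPayload[] = [];
async function sendRun(payload: RunPayload): Promise<void> {
  sent.push(payload);
}

// Stand-in for your real agent logic.
async function runAgent(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

// Capture start/end times around the work, then send one complete run.
async function tracedCall(prompt: string): Promise<string> {
  const startTime = new Date();
  const response = await runAgent(prompt);
  const endTime = new Date();
  await sendRun({ prompt: { user_prompt: prompt }, response, startTime, endTime });
  return response;
}
```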

Run Options

runId (string)
Custom run identifier. If not provided, a UUID is automatically generated.
const r = run({ runId: "run_abc123" });

sessionId (string)
Associates this run with a session. Runs sharing the same sessionId are grouped together in the dashboard.
const r = run({ sessionId: "session_456" });

conversational (boolean)
Marks this run as part of a multi-turn conversation. When true, the dashboard displays sequential runs in the same session as a continuous thread.
const r = run({ conversational: true });

startTime (Date)
Overrides the run start time. Defaults to new Date() at creation.
const r = run({ startTime: new Date("2025-01-01T00:00:00Z") });

timeout (number)
Auto-flush timeout in milliseconds. If the run isn't ended within this duration, it's automatically sent with error status.
const r = run({ timeout: 600000 }); // 10 minutes
Default: 1200000 (20 minutes). Set to 0 to disable auto-flush.
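
Conceptually, the timeout acts as a watchdog: if the run isn't ended before the deadline, it is flushed anyway with error status. A minimal sketch of that mechanism (not the SDK's internals):

```typescript
// Watchdog sketch: flush the run if end() doesn't arrive in time.
class FlushGuard {
  private timer?: ReturnType<typeof setTimeout>;
  flushed: "ended" | "timed_out" | null = null;

  constructor(timeoutMs: number, onTimeout: () => void) {
    // A timeout of 0 disables the watchdog entirely.
    if (timeoutMs > 0) {
      this.timer = setTimeout(() => {
        if (!this.flushed) {
          this.flushed = "timed_out";
          onTimeout(); // would send the run with error status
        }
      }, timeoutMs);
    }
  }

  end() {
    if (!this.flushed) {
      this.flushed = "ended";
      clearTimeout(this.timer);
    }
  }
}
```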

Run Lifecycle

1. Create the run

const r = run({ sessionId: "session_123" });

2. Set the user prompt

r.prompt("Analyze Q3 revenue trends");
Or include a system prompt:
r.prompt({
  user_prompt: "Analyze Q3 revenue trends",
  system_prompt: "You are a financial analyst."
});

3. Execute agent logic and record events

// Record LLM steps
const s = r.step();
s.prompt(JSON.stringify(messages));
s.response(content);
s.model("gpt-4o");
s.end();

// Record tool calls
const tc = r.toolCall("query_database");
tc.args({ query: "..." });
tc.result({ rows: [...] });
tc.end();

4. Set the final response

r.response("Q3 revenue increased 15% YoY...");

5. End the run

await r.end();
This sends the run and all attached steps/tool calls in a single batch request.

Error Handling

When a run errors, use .error() instead of .end():
try {
  const result = await agent.execute(userPrompt);
  r.response(result);
  await r.end();
} catch (e) {
  await r.error(String(e));
}
Calling .error() automatically ends any un-ended child steps and tool calls with error status.

Sessions

A session groups multiple related runs together. This is useful for:
  • Multi-turn conversations
  • Related agent tasks in a workflow
  • User interactions within a time window

Creating Sessions

Sessions are created implicitly by using the same sessionId across multiple runs:
import { run } from "@contextcompany/custom";

const sessionId = crypto.randomUUID();

// First message in conversation
const r1 = run({ sessionId, conversational: true });
r1.prompt("What's the weather in SF?");
r1.response("It's 72°F and sunny.");
await r1.end();

// Follow-up question
const r2 = run({ sessionId, conversational: true });
r2.prompt("What about tomorrow?");
r2.response("Tomorrow will be 68°F with clouds.");
await r2.end();
These two runs will appear grouped together in the Observatory dashboard.

Session ID Strategies

User-based Sessions

Group all interactions from a single user:
const sessionId = `user_${userId}`;
const r = run({ sessionId, conversational: true });

Conversation-based Sessions

Create a new session for each conversation:
// When starting a new conversation
const newSessionId = crypto.randomUUID();
saveToDatabase({ conversationId, sessionId: newSessionId });

// Later, for each message in the conversation
const { sessionId } = loadFromDatabase({ conversationId });
const r = run({ sessionId, conversational: true });

Task-based Sessions

Group related agent tasks:
const sessionId = `workflow_${workflowId}`;

// Step 1: Research
const r1 = run({ sessionId });
r1.prompt("Research topic X");
await r1.end();

// Step 2: Analyze
const r2 = run({ sessionId });
r2.prompt("Analyze findings");
await r2.end();
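
Each strategy boils down to a different way of deriving the session ID. As plain helpers (the function names are illustrative):

```typescript
import { randomUUID } from "node:crypto";

// Illustrative helpers for the three session ID strategies above.
const userSession = (userId: string) => `user_${userId}`;       // user-based
const conversationSession = () => randomUUID();                 // conversation-based
const workflowSession = (workflowId: string) => `workflow_${workflowId}`; // task-based
```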

Conversational vs Non-conversational

The conversational flag controls how runs are displayed:
When conversational: true, runs in the same session are displayed as a continuous conversation thread:
const r = run({ sessionId: "session_123", conversational: true });
Use this for:
  • Chat applications
  • Multi-turn dialogues
  • Sequential Q&A
When conversational is not set or false, runs are displayed as independent tasks:
const r = run({ sessionId: "session_123" });
Use this for:
  • Batch processing
  • Independent agent tasks
  • Parallel operations

Best Practices

Group Related Runs into Sessions

Give related runs a shared sessionId. This makes it easier to trace them in the dashboard:
// Good: Related runs share a session
const sessionId = crypto.randomUUID();
const r1 = run({ sessionId });
const r2 = run({ sessionId });

// Bad: No relationship tracked
const r1 = run();
const r2 = run();

Use Conversational Flag Appropriately

// Good: Chat application
const r = run({ sessionId, conversational: true });

// Good: Batch processing
const r = run({ sessionId, conversational: false });

Always Call .end() or .error()

Forgetting to end a run means the data is never sent to Observatory.
// Good: Run is ended
const r = run();
r.prompt("test");
await r.end();

// Bad: Run is never sent
const r = run();
r.prompt("test");
// Missing r.end()!
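
One way to guarantee every run is closed is to wrap agent logic in a helper that always calls .end() on success or .error() on failure. This is a sketch: `withRun` is a hypothetical helper, and the builder is stubbed here so the pattern runs standalone:

```typescript
// Minimal stub mimicking the builder's end/error surface.
interface RunLike {
  prompt(p: string): void;
  response(r: string): void;
  end(): Promise<void>;
  error(msg: string): Promise<void>;
}

function makeStubRun(log: string[]): RunLike {
  return {
    prompt: (p) => { log.push(`prompt:${p}`); },
    response: (r) => { log.push(`response:${r}`); },
    end: async () => { log.push("end"); },
    error: async (msg) => { log.push(`error:${msg}`); },
  };
}

// Hypothetical helper: the run is always ended or errored, never leaked.
async function withRun<T>(r: RunLike, fn: () => Promise<T>): Promise<T> {
  try {
    const out = await fn();
    await r.end();
    return out;
  } catch (e) {
    await r.error(String(e));
    throw e;
  }
}
```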

Use Auto-flush for Long-running Runs

For agents that might take a long time, configure the timeout:
// Custom timeout per run
const r = run({ timeout: 3600000 }); // 1 hour

// Or globally
configure({ runTimeout: 3600000 });

Batch Events with the Builder Pattern

The builder pattern automatically batches all steps and tool calls with the run when you call .end(). This is more efficient than sending each event separately.
// Good: Batched in single request
const r = run();
const s1 = r.step(); s1.prompt("...").response("...").end();
const s2 = r.step(); s2.prompt("...").response("...").end();
await r.end(); // Sends run + 2 steps in one request

// Less efficient: Separate requests
await sendRun({ prompt: {...}, ... });
await sendStep({ runId: r.runId, ... });
await sendStep({ runId: r.runId, ... });
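
The difference can be made concrete with a toy request counter (not the SDK; it just tallies how many network calls each pattern implies):

```typescript
// Toy model: count the requests each pattern would make.
let requests = 0;
const send = (_payload: unknown) => { requests += 1; };

// Builder pattern: run + all steps go out in one batched request.
function sendBatched(run: object, steps: object[]) {
  send({ run, steps });
}

// Factory pattern: one request for the run, plus one per step.
function sendSeparately(run: object, steps: object[]) {
  send(run);
  for (const s of steps) send(s);
}
```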

Examples

Multi-turn Chat Application

import { run } from "@contextcompany/custom";

class ChatAgent {
  private sessionId: string;
  
  constructor(userId: string) {
    this.sessionId = `chat_${userId}_${Date.now()}`;
  }
  
  async processMessage(userMessage: string): Promise<string> {
    const r = run({ 
      sessionId: this.sessionId, 
      conversational: true 
    });
    
    r.prompt(userMessage);
    
    try {
      const response = await this.generateResponse(userMessage);
      r.response(response);
      await r.end();
      return response;
    } catch (e) {
      await r.error(String(e));
      throw e;
    }
  }
  
  private async generateResponse(message: string): Promise<string> {
    // Your agent logic here
    return "Generated response";
  }
}

Batch Processing Workflow

import { run } from "@contextcompany/custom";

async function processBatch(items: string[]) {
  const sessionId = `batch_${Date.now()}`;
  
  for (const item of items) {
    const r = run({ sessionId });
    r.prompt(`Process item: ${item}`);
    r.metadata({ batch_id: sessionId, item });
    
    try {
      const result = await processItem(item);
      r.response(result);
      await r.end();
    } catch (e) {
      await r.error(String(e));
    }
  }
}

Complex Agent with Multiple Steps

import { run } from "@contextcompany/custom";

async function complexAgent(query: string) {
  const r = run({ 
    sessionId: crypto.randomUUID(),
    timeout: 600000 // 10 min for complex task
  });
  
  r.prompt(query);
  r.metadata({ agent: "research", version: "2.0" });
  
  try {
    // Step 1: Planning
    const s1 = r.step();
    s1.prompt(`Plan: ${query}`);
    const plan = await llm.call(`Plan: ${query}`);
    s1.response(plan);
    s1.model("gpt-4o");
    s1.end();
    
    // Step 2: Execute tools
    const tc1 = r.toolCall("search");
    tc1.args({ query: plan });
    const searchResults = await search(plan);
    tc1.result(searchResults);
    tc1.end();
    
    // Step 3: Synthesize
    const s2 = r.step();
    s2.prompt(`Synthesize: ${searchResults}`);
    const final = await llm.call(`Synthesize: ${searchResults}`);
    s2.response(final);
    s2.model("gpt-4o");
    s2.end();
    
    r.response(final);
    await r.end();
    
    return final;
  } catch (e) {
    await r.error(String(e));
    throw e;
  }
}