Observatory’s custom instrumentation SDK gives you full control over what you instrument and how. Use it for custom agents, in-house frameworks, or any stack not covered by the automatic integrations.

Overview

The @contextcompany/custom package offers two instrumentation patterns:
  1. Builder Pattern: Instrument as you go during live agent execution
  2. Factory Pattern: Send pre-built data from logs or post-hoc analysis
Both patterns support tracking:
  • Agent runs and responses
  • LLM steps (prompt, response, tokens, cost)
  • Tool calls and results
  • Sessions for conversational agents
  • Custom metadata
  • Error handling

Installation

1. Install the package:

npm install @contextcompany/custom

2. Set your API key. Add your Observatory API key to your environment variables:

.env
TCC_API_KEY=your_api_key_here
Or configure programmatically:
import { configure } from '@contextcompany/custom';

configure({ apiKey: 'your_api_key' });

Builder Pattern

Use the builder pattern when wrapping live agent execution. Steps and tool calls are batched and sent when you call run.end().

Basic Usage

import { run } from '@contextcompany/custom';

// Create a run
const r = run({
  sessionId: 'session-abc',
  conversational: true,
});

// Set the user prompt
r.prompt('What is the weather in San Francisco?');

// Record an LLM step (messages is the chat history sent to the model)
const messages = [{ role: 'user', content: 'What is the weather in San Francisco?' }];
const step = r.step();
step.prompt(JSON.stringify(messages));
step.response('It is currently 72°F and sunny in San Francisco.');
step.model('gpt-4o');
step.tokens({ uncached: 120, cached: 30, completion: 45 });
step.end();

// Set the final response and send
r.response('It is 72°F in San Francisco.');
await r.end();

With Tool Calls

import { run } from '@contextcompany/custom';

const r = run({ sessionId: 'session-123' });
r.prompt('What is the weather in Tokyo?');

// LLM decides to call a tool; messages is your chat-history array
const step1 = r.step();
step1.prompt(JSON.stringify(messages));
step1.response('I need to call get_weather');
step1.model('gpt-4');
step1.end();

// Record the tool call
const tc = r.toolCall('get_weather');
tc.args({ city: 'Tokyo', unit: 'celsius' });
tc.result({ temperature: 22, conditions: 'cloudy' });
tc.end();

// LLM processes the tool result; messagesWithToolResult is the history plus the tool output
const step2 = r.step();
step2.prompt(JSON.stringify(messagesWithToolResult));
step2.response('It is 22°C and cloudy in Tokyo.');
step2.model('gpt-4');
step2.end();

// Finalize
r.response('It is 22°C and cloudy in Tokyo.');
await r.end();

Error Handling

import { run } from '@contextcompany/custom';

const r = run();
r.prompt('Translate this to French');

try {
  const step = r.step();
  step.prompt('Translate...');
  
  const result = await callLLM();
  step.response(result);
  step.end();
  
  r.response(result);
  await r.end();
} catch (error) {
  // Auto-ends any un-ended children with error status
  await r.error(String(error));
}

With Metadata

const r = run({
  sessionId: 'session-456',
  conversational: true,
});

// Add custom metadata
r.metadata({
  userId: 'user_789',
  environment: 'production',
  version: '2.1.0',
  model_requested: 'gpt-4-turbo',
});

r.prompt('Hello!');

const step = r.step();
step.prompt('Hello!');
step.response('Hi there!');
step.model('gpt-4-turbo-2024-04-09');
step.end();

r.response('Hi there!');
await r.end();

Factory Pattern

Use the factory pattern when all data is already available—perfect for post-hoc logging, batch imports, or replaying from logs.

Send Complete Run

import { sendRun } from '@contextcompany/custom';

await sendRun({
  prompt: {
    user_prompt: 'What is the capital of France?',
    system_prompt: 'You are a helpful assistant.',
  },
  response: 'The capital of France is Paris.',
  startTime: new Date('2025-01-01T00:00:00Z'),
  endTime: new Date('2025-01-01T00:00:02Z'),
  metadata: {
    userId: 'user_123',
    source: 'web-app',
  },
  steps: [
    {
      prompt: JSON.stringify([{ role: 'user', content: 'What is the capital of France?' }]),
      response: 'The capital of France is Paris.',
      model: 'gpt-4o',
      tokens: { uncached: 100, completion: 20 },
      startTime: new Date('2025-01-01T00:00:00Z'),
      endTime: new Date('2025-01-01T00:00:01Z'),
    },
  ],
});

Send with Tool Calls

import { sendRun } from '@contextcompany/custom';

await sendRun({
  prompt: 'What is the weather in London?',
  response: 'It is 15°C and rainy in London.',
  startTime: new Date(),
  endTime: new Date(),
  steps: [
    {
      prompt: JSON.stringify(messages),
      response: 'Let me check the weather',
      model: 'gpt-4',
      tokens: { uncached: 80, completion: 15 },
      startTime: new Date(),
      endTime: new Date(),
    },
  ],
  toolCalls: [
    {
      name: 'get_weather',
      args: { city: 'London', unit: 'celsius' },
      result: { temperature: 15, conditions: 'rainy' },
      startTime: new Date(),
      endTime: new Date(),
    },
  ],
});

Send Individual Components

You can also send steps and tool calls independently:
import { sendStep, sendToolCall } from '@contextcompany/custom';

// Send a step
await sendStep({
  runId: 'run_abc123',
  prompt: JSON.stringify(messages),
  response: 'Here is my answer',
  model: { requested: 'gpt-4o', used: 'gpt-4o-2024-08-06' },
  tokens: { uncached: 120, cached: 30, completion: 45 },
  cost: 0.0042,
  startTime: new Date(),
  endTime: new Date(),
});

// Send a tool call
await sendToolCall({
  runId: 'run_abc123',
  name: 'search',
  args: { query: 'SF weather' },
  result: { temperature: 72 },
  startTime: new Date(),
  endTime: new Date(),
});

API Reference

run(options?)

Create a new run builder. Options:
  • runId (string): Custom run ID. Auto-generated if not provided.
  • sessionId (string): Session ID for grouping related runs.
  • conversational (boolean): Mark this run as part of a conversation.
  • startTime (Date, default: new Date()): Custom start time for the run.
  • timeout (number, default: 1200000): Timeout in milliseconds (default: 20 minutes).
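Pulled together, the options form a plain object. A sketch of its shape (the RunOptions name here is illustrative, not an SDK export):

```typescript
// Shape of the options accepted by run(), per the list above.
// RunOptions is an illustrative name, not an SDK export.
interface RunOptions {
  runId?: string;
  sessionId?: string;
  conversational?: boolean;
  startTime?: Date;
  timeout?: number; // milliseconds; default 1_200_000 (20 minutes)
}

const options: RunOptions = {
  sessionId: 'session-abc',
  conversational: true,
  timeout: 3_600_000, // raise the default for hour-long agents
};

console.log(options.timeout); // 3600000
```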
Run Methods:
  • .prompt(text): Set the user prompt (required). Pass a string or { user_prompt, system_prompt }.
  • .response(text): Set the agent response.
  • .metadata({ key: 'val' }): Attach custom metadata.
  • .status(code, message?): Set status (0 = success, 2 = error).
  • .endTime(date): Set a custom end time.
  • .step(idOrOptions?): Create an attached Step.
  • .toolCall(nameOrOptions?): Create an attached ToolCall.
  • .end(): Finalize and send (returns Promise<void>).
  • .error(message?): End with error status (returns Promise<void>).

Step Builder

Created via run.step().
  • .prompt(text): Set the LLM prompt (required).
  • .response(text): Set the LLM response (required).
  • .model('gpt-4o'): Set model name (shorthand).
  • .model({ requested, used }): Set model with requested/used distinction.
  • .tokens({ uncached, cached, completion }): Set token counts.
  • .cost(amount): Set cost in USD.
  • .finishReason(reason): Set the finish reason.
  • .toolDefinitions(defs): Set tool definitions (string or array).
  • .status(code, message?): Set status code.
  • .endTime(date): Set a custom end time.
  • .end(): Mark the step as complete.
  • .error(message?): Mark the step as errored.
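If you set .cost() yourself, a typical derivation from the three token buckets looks like this (the per-million-token prices below are placeholders, not Observatory's or any provider's actual rates):

```typescript
// Hypothetical per-million-token prices in USD; substitute your provider's rates.
const PRICE = { uncached: 2.5, cached: 1.25, completion: 10.0 };

// Compute a step's cost from its token counts.
function stepCost(tokens: { uncached: number; cached: number; completion: number }): number {
  return (
    (tokens.uncached * PRICE.uncached +
      tokens.cached * PRICE.cached +
      tokens.completion * PRICE.completion) /
    1_000_000
  );
}

console.log(stepCost({ uncached: 120, cached: 30, completion: 45 }));
```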

ToolCall Builder

Created via run.toolCall().
  • .name(toolName): Set the tool name (required).
  • .args(value): Set args (string or object, auto-serialized).
  • .result(value): Set result (string or object, auto-serialized).
  • .status(code, message?): Set status code.
  • .endTime(date): Set a custom end time.
  • .end(): Mark the tool call as complete.
  • .error(message?): Mark the tool call as errored.

Factory Functions

sendRun(input)

Send a complete run with optional nested steps and tool calls. Input fields:
  • prompt (string | { user_prompt: string, system_prompt?: string }, required): The user prompt or prompt object.
  • response (string): The agent’s response.
  • startTime (Date, required): Run start time.
  • endTime (Date, required): Run end time.
  • metadata (Record<string, any>): Custom metadata.
  • steps (Array<StepInput>): Array of step objects.
  • toolCalls (Array<ToolCallInput>): Array of tool call objects.

sendStep(input)

Send a single step (requires runId).

sendToolCall(input)

Send a single tool call (requires runId).

Configuration

import { configure } from '@contextcompany/custom';

configure({
  apiKey: 'your_api_key',  // Overrides TCC_API_KEY env var
  debug: true,              // Overrides TCC_DEBUG env var
});

Feedback

import { submitFeedback } from '@contextcompany/custom';

await submitFeedback({
  runId: 'run_abc123',
  score: 'thumbs_up', // or 'thumbs_down'
});

Environment Variables

  • TCC_API_KEY (string, required): Your Observatory API key. Required unless set via configure().
  • TCC_URL (string): Custom ingestion endpoint. Only needed for self-hosted instances.
  • TCC_DEBUG (string): Set to 1 or true to enable debug logging.
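For example, a self-hosted deployment with debug logging might export the following (the TCC_URL value is illustrative, not a real endpoint):

```shell
# Illustrative values; substitute your own key and endpoint.
export TCC_API_KEY=your_api_key_here
export TCC_URL=https://observatory.internal.example.com   # self-hosted only
export TCC_DEBUG=1
```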

Best Practices

When to Use Builder Pattern

✅ Use builder pattern when:
  • Wrapping live agent execution
  • You don’t know the full run data upfront
  • Steps and tools happen sequentially
  • You want automatic error handling

When to Use Factory Pattern

✅ Use factory pattern when:
  • Replaying from logs
  • Importing historical data
  • All data is already available
  • You need precise control over timestamps

Structuring Metadata

// Good: structured, searchable
r.metadata({
  userId: 'user_123',
  environment: 'production',
  modelFamily: 'gpt-4',
  feature: 'chat',
});

// Avoid: unstructured, hard to query
r.metadata({
  info: 'user_123 in production using gpt-4 for chat',
});

Handling Long-Running Agents

For agents that run longer than 20 minutes, adjust the timeout:
const r = run({
  timeout: 3600000, // 1 hour
});

Examples

LangChain Integration

import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';
import { run } from '@contextcompany/custom';

async function runAgent(userPrompt: string) {
  const r = run();
  r.prompt(userPrompt);

  const model = new ChatOpenAI({ model: 'gpt-4' });
  const step = r.step();

  try {
    step.prompt(userPrompt);
    const response = await model.invoke([new HumanMessage(userPrompt)]);
    const content = String(response.content); // content can be a string or structured parts
    step.response(content);
    step.model('gpt-4');
    step.end();

    r.response(content);
    await r.end();

    return content;
  } catch (error) {
    await r.error(String(error));
    throw error;
  }
}

Custom Multi-Step Agent

import { run } from '@contextcompany/custom';

// llm.generate, extractQuery, and search are stand-ins for your own implementations
async function multiStepAgent(query: string) {
  const r = run({ sessionId: crypto.randomUUID() });
  r.prompt(query);

  // Step 1: Planning
  const planStep = r.step();
  planStep.prompt(`Plan how to answer: ${query}`);
  const plan = await llm.generate('Create a plan...');
  planStep.response(plan);
  planStep.model('gpt-4');
  planStep.end();

  // Step 2: Research (with tool)
  const researchStep = r.step();
  researchStep.prompt(`Research: ${plan}`);
  
  const searchTool = r.toolCall('web_search');
  searchTool.args({ query: extractQuery(plan) });
  const searchResults = await search(extractQuery(plan));
  searchTool.result(searchResults);
  searchTool.end();

  const research = await llm.generate(`Synthesize: ${searchResults}`);
  researchStep.response(research);
  researchStep.model('gpt-4');
  researchStep.end();

  // Step 3: Final answer
  const answerStep = r.step();
  answerStep.prompt(`Answer based on: ${research}`);
  const answer = await llm.generate(`Final answer: ${research}`);
  answerStep.response(answer);
  answerStep.model('gpt-4');
  answerStep.end();

  r.response(answer);
  await r.end();

  return answer;
}

Troubleshooting

If runs are not appearing in Observatory:
  1. Verify TCC_API_KEY is set
  2. Enable debug mode:
    configure({ debug: true });
    
  3. Check console for errors
  4. Ensure you called await r.end() (builder pattern)
Common issues:
  • Missing required fields (prompt, response for steps)
  • Invalid dates (must be Date objects, not strings)
  • Non-serializable metadata (no functions or circular refs)
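Circular references in metadata can be caught before sending with a quick pre-flight check (a generic sketch, not part of the SDK). Note that JSON.stringify silently drops functions rather than throwing, so check for those separately:

```typescript
// Generic pre-flight check: JSON.stringify throws on circular references
// (and BigInt values), so a failed call flags unserializable metadata.
function isSerializable(value: unknown): boolean {
  try {
    JSON.stringify(value);
    return true;
  } catch {
    return false;
  }
}

const circular: Record<string, unknown> = {};
circular.self = circular;

console.log(isSerializable({ userId: 'user_123' })); // true
console.log(isSerializable(circular));               // false
```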
If your agent runs longer than 20 minutes:
const r = run({ timeout: 3600000 }); // 1 hour
Ensure steps are properly ended:
const step = r.step();
// ... set data ...
step.end(); // Don't forget this!

Next Steps

  • Configuration: Learn about configuration options
  • Sessions: Learn about tracking conversational sessions
  • Feedback: Set up feedback collection
  • API Reference: Complete API documentation
