Observatory’s custom instrumentation SDK gives you full control over what you instrument and how. Use it for custom agents, in-house frameworks, or any stack not covered by the automatic integrations.

Overview

The @contextcompany/custom package offers two instrumentation patterns:
  1. Builder Pattern: Instrument as you go during live agent execution
  2. Factory Pattern: Send pre-built data from logs or post-hoc analysis
Both patterns support tracking:
  • Agent runs and responses
  • LLM steps (prompt, response, tokens, cost)
  • Tool calls and results
  • Sessions for conversational agents
  • Custom metadata
  • Error handling

Installation

1. Install the package:

npm install @contextcompany/custom

2. Set your API key. Add your Observatory API key to your environment variables:

.env
TCC_API_KEY=your_api_key_here
Or configure programmatically:
import { configure } from '@contextcompany/custom';

configure({ apiKey: 'your_api_key' });

Builder Pattern

Use the builder pattern when wrapping live agent execution. Steps and tool calls are batched and sent when you call run.end().

Basic Usage

import { run } from '@contextcompany/custom';

// Create a run
const r = run({
  sessionId: 'session-abc',
  conversational: true,
});

// Set the user prompt
r.prompt('What is the weather in San Francisco?');

// Record an LLM step (messages is the chat history sent to the model)
const messages = [{ role: 'user', content: 'What is the weather in San Francisco?' }];
const step = r.step();
step.prompt(JSON.stringify(messages));
step.response('It is currently 72°F and sunny in San Francisco.');
step.model('gpt-4o');
step.tokens({ uncached: 120, cached: 30, completion: 45 });
step.end();

// Set the final response and send
r.response('It is 72°F in San Francisco.');
await r.end();

With Tool Calls

import { run } from '@contextcompany/custom';

const r = run({ sessionId: 'session-123' });
r.prompt('What is the weather in Tokyo?');

// LLM decides to call a tool; messages is your chat-history array
const step1 = r.step();
step1.prompt(JSON.stringify(messages));
step1.response('I need to call get_weather');
step1.model('gpt-4');
step1.end();

// Record the tool call
const tc = r.toolCall('get_weather');
tc.args({ city: 'Tokyo', unit: 'celsius' });
tc.result({ temperature: 22, conditions: 'cloudy' });
tc.end();

// LLM processes the tool result; messagesWithToolResult is the history plus the tool output
const step2 = r.step();
step2.prompt(JSON.stringify(messagesWithToolResult));
step2.response('It is 22°C and cloudy in Tokyo.');
step2.model('gpt-4');
step2.end();

// Finalize
r.response('It is 22°C and cloudy in Tokyo.');
await r.end();

Error Handling

import { run } from '@contextcompany/custom';

const r = run();
r.prompt('Translate this to French');

try {
  const step = r.step();
  step.prompt('Translate...');
  
  const result = await callLLM();
  step.response(result);
  step.end();
  
  r.response(result);
  await r.end();
} catch (error) {
  // Auto-ends any un-ended children with error status
  await r.error(String(error));
}

With Metadata

const r = run({
  sessionId: 'session-456',
  conversational: true,
});

// Add custom metadata
r.metadata({
  userId: 'user_789',
  environment: 'production',
  version: '2.1.0',
  model_requested: 'gpt-4-turbo',
});

r.prompt('Hello!');

const step = r.step();
step.prompt('Hello!');
step.response('Hi there!');
step.model('gpt-4-turbo-2024-04-09');
step.end();

r.response('Hi there!');
await r.end();

Factory Pattern

Use the factory pattern when all data is already available—perfect for post-hoc logging, batch imports, or replaying from logs.

Send Complete Run

import { sendRun } from '@contextcompany/custom';

await sendRun({
  prompt: {
    user_prompt: 'What is the capital of France?',
    system_prompt: 'You are a helpful assistant.',
  },
  response: 'The capital of France is Paris.',
  startTime: new Date('2025-01-01T00:00:00Z'),
  endTime: new Date('2025-01-01T00:00:02Z'),
  metadata: {
    userId: 'user_123',
    source: 'web-app',
  },
  steps: [
    {
      prompt: JSON.stringify([{ role: 'user', content: 'What is the capital of France?' }]),
      response: 'The capital of France is Paris.',
      model: 'gpt-4o',
      tokens: { uncached: 100, completion: 20 },
      startTime: new Date('2025-01-01T00:00:00Z'),
      endTime: new Date('2025-01-01T00:00:01Z'),
    },
  ],
});

Send with Tool Calls

import { sendRun } from '@contextcompany/custom';

await sendRun({
  prompt: 'What is the weather in London?',
  response: 'It is 15°C and rainy in London.',
  startTime: new Date(),
  endTime: new Date(),
  steps: [
    {
      prompt: JSON.stringify(messages),
      response: 'Let me check the weather',
      model: 'gpt-4',
      tokens: { uncached: 80, completion: 15 },
      startTime: new Date(),
      endTime: new Date(),
    },
  ],
  toolCalls: [
    {
      name: 'get_weather',
      args: { city: 'London', unit: 'celsius' },
      result: { temperature: 15, conditions: 'rainy' },
      startTime: new Date(),
      endTime: new Date(),
    },
  ],
});

Send Individual Components

You can also send steps and tool calls independently:
import { sendStep, sendToolCall } from '@contextcompany/custom';

// Send a step
await sendStep({
  runId: 'run_abc123',
  prompt: JSON.stringify(messages),
  response: 'Here is my answer',
  model: { requested: 'gpt-4o', used: 'gpt-4o-2024-08-06' },
  tokens: { uncached: 120, cached: 30, completion: 45 },
  cost: 0.0042,
  startTime: new Date(),
  endTime: new Date(),
});

// Send a tool call
await sendToolCall({
  runId: 'run_abc123',
  name: 'search',
  args: { query: 'SF weather' },
  result: { temperature: 72 },
  startTime: new Date(),
  endTime: new Date(),
});

API Reference

run(options?)

Create a new run builder. Options:
  • runId (string): Custom run ID. Auto-generated if not provided.
  • sessionId (string): Session ID for grouping related runs.
  • conversational (boolean): Mark this run as part of a conversation.
  • startTime (Date, default: new Date()): Custom start time for the run.
  • timeout (number, default: 1200000): Timeout in milliseconds (default: 20 minutes).
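Pulled together, the options form a plain object. A sketch of its shape (the RunOptions name here is illustrative, not an SDK export):

```typescript
// Shape of the options accepted by run(), per the list above.
// RunOptions is an illustrative name, not an SDK export.
interface RunOptions {
  runId?: string;
  sessionId?: string;
  conversational?: boolean;
  startTime?: Date;
  timeout?: number; // milliseconds; default 1_200_000 (20 minutes)
}

const options: RunOptions = {
  sessionId: 'session-abc',
  conversational: true,
  timeout: 3_600_000, // raise the default for hour-long agents
};

console.log(options.timeout); // 3600000
```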
Run Methods:
  • .prompt(text): Set the user prompt (required). Pass a string or { user_prompt, system_prompt }.
  • .response(text): Set the agent response.
  • .metadata({ key: 'val' }): Attach custom metadata.
  • .status(code, message?): Set status (0 = success, 2 = error).
  • .endTime(date): Set a custom end time.
  • .step(idOrOptions?): Create an attached Step.
  • .toolCall(nameOrOptions?): Create an attached ToolCall.
  • .end(): Finalize and send (returns Promise<void>).
  • .error(message?): End with error status (returns Promise<void>).

Step Builder

Created via run.step().
  • .prompt(text): Set the LLM prompt (required).
  • .response(text): Set the LLM response (required).
  • .model('gpt-4o'): Set model name (shorthand).
  • .model({ requested, used }): Set model with requested/used distinction.
  • .tokens({ uncached, cached, completion }): Set token counts.
  • .cost(amount): Set cost in USD.
  • .finishReason(reason): Set the finish reason.
  • .toolDefinitions(defs): Set tool definitions (string or array).
  • .status(code, message?): Set status code.
  • .endTime(date): Set a custom end time.
  • .end(): Mark the step as complete.
  • .error(message?): Mark the step as errored.
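If you set .cost() yourself, a typical derivation from the three token buckets looks like this (the per-million-token prices below are placeholders, not Observatory's or any provider's actual rates):

```typescript
// Hypothetical per-million-token prices in USD; substitute your provider's rates.
const PRICE = { uncached: 2.5, cached: 1.25, completion: 10.0 };

// Compute a step's cost from its token counts.
function stepCost(tokens: { uncached: number; cached: number; completion: number }): number {
  return (
    (tokens.uncached * PRICE.uncached +
      tokens.cached * PRICE.cached +
      tokens.completion * PRICE.completion) /
    1_000_000
  );
}

console.log(stepCost({ uncached: 120, cached: 30, completion: 45 }));
```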

ToolCall Builder

Created via run.toolCall().
  • .name(toolName): Set the tool name (required).
  • .args(value): Set args (string or object, auto-serialized).
  • .result(value): Set result (string or object, auto-serialized).
  • .status(code, message?): Set status code.
  • .endTime(date): Set a custom end time.
  • .end(): Mark the tool call as complete.
  • .error(message?): Mark the tool call as errored.

Factory Functions

sendRun(input)

Send a complete run with optional nested steps and tool calls. Input fields:
  • prompt (string | { user_prompt: string, system_prompt?: string }, required): The user prompt or prompt object.
  • response (string): The agent’s response.
  • startTime (Date, required): Run start time.
  • endTime (Date, required): Run end time.
  • metadata (Record<string, any>): Custom metadata.
  • steps (Array<StepInput>): Array of step objects.
  • toolCalls (Array<ToolCallInput>): Array of tool call objects.

sendStep(input)

Send a single step (requires runId).

sendToolCall(input)

Send a single tool call (requires runId).

Configuration

import { configure } from '@contextcompany/custom';

configure({
  apiKey: 'your_api_key',  // Overrides TCC_API_KEY env var
  debug: true,              // Overrides TCC_DEBUG env var
});

Feedback

import { submitFeedback } from '@contextcompany/custom';

await submitFeedback({
  runId: 'run_abc123',
  score: 'thumbs_up', // or 'thumbs_down'
});

Environment Variables

  • TCC_API_KEY (string, required): Your Observatory API key. Required unless set via configure().
  • TCC_URL (string): Custom ingestion endpoint. Only needed for self-hosted instances.
  • TCC_DEBUG (string): Set to 1 or true to enable debug logging.
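For example, a self-hosted deployment with debug logging might export the following (the TCC_URL value is illustrative, not a real endpoint):

```shell
# Illustrative values; substitute your own key and endpoint.
export TCC_API_KEY=your_api_key_here
export TCC_URL=https://observatory.internal.example.com   # self-hosted only
export TCC_DEBUG=1
```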

Best Practices

When to Use Builder Pattern

✅ Use builder pattern when:
  • Wrapping live agent execution
  • You don’t know the full run data upfront
  • Steps and tools happen sequentially
  • You want automatic error handling

When to Use Factory Pattern

✅ Use factory pattern when:
  • Replaying from logs
  • Importing historical data
  • All data is already available
  • You need precise control over timestamps

Structuring Metadata

// Good: structured, searchable
r.metadata({
  userId: 'user_123',
  environment: 'production',
  modelFamily: 'gpt-4',
  feature: 'chat',
});

// Avoid: unstructured, hard to query
r.metadata({
  info: 'user_123 in production using gpt-4 for chat',
});

Handling Long-Running Agents

For agents that run longer than 20 minutes, adjust the timeout:
const r = run({
  timeout: 3600000, // 1 hour
});

Examples

LangChain Integration

import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';
import { run } from '@contextcompany/custom';

async function runAgent(userPrompt: string) {
  const r = run();
  r.prompt(userPrompt);

  const model = new ChatOpenAI({ model: 'gpt-4' });
  const step = r.step();

  try {
    step.prompt(userPrompt);
    const response = await model.invoke([new HumanMessage(userPrompt)]);
    const content = String(response.content); // content can be a string or structured parts
    step.response(content);
    step.model('gpt-4');
    step.end();

    r.response(content);
    await r.end();

    return content;
  } catch (error) {
    await r.error(String(error));
    throw error;
  }
}

Custom Multi-Step Agent

import { run } from '@contextcompany/custom';

// llm.generate, extractQuery, and search are stand-ins for your own implementations
async function multiStepAgent(query: string) {
  const r = run({ sessionId: crypto.randomUUID() });
  r.prompt(query);

  // Step 1: Planning
  const planStep = r.step();
  planStep.prompt(`Plan how to answer: ${query}`);
  const plan = await llm.generate('Create a plan...');
  planStep.response(plan);
  planStep.model('gpt-4');
  planStep.end();

  // Step 2: Research (with tool)
  const researchStep = r.step();
  researchStep.prompt(`Research: ${plan}`);
  
  const searchTool = r.toolCall('web_search');
  searchTool.args({ query: extractQuery(plan) });
  const searchResults = await search(extractQuery(plan));
  searchTool.result(searchResults);
  searchTool.end();

  const research = await llm.generate(`Synthesize: ${searchResults}`);
  researchStep.response(research);
  researchStep.model('gpt-4');
  researchStep.end();

  // Step 3: Final answer
  const answerStep = r.step();
  answerStep.prompt(`Answer based on: ${research}`);
  const answer = await llm.generate(`Final answer: ${research}`);
  answerStep.response(answer);
  answerStep.model('gpt-4');
  answerStep.end();

  r.response(answer);
  await r.end();

  return answer;
}

Troubleshooting

If runs are not appearing in Observatory:
  1. Verify TCC_API_KEY is set
  2. Enable debug mode:
    configure({ debug: true });
    
  3. Check console for errors
  4. Ensure you called await r.end() (builder pattern)
Common issues:
  • Missing required fields (prompt, response for steps)
  • Invalid dates (must be Date objects, not strings)
  • Non-serializable metadata (no functions or circular refs)
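Circular references in metadata can be caught before sending with a quick pre-flight check (a generic sketch, not part of the SDK). Note that JSON.stringify silently drops functions rather than throwing, so check for those separately:

```typescript
// Generic pre-flight check: JSON.stringify throws on circular references
// (and BigInt values), so a failed call flags unserializable metadata.
function isSerializable(value: unknown): boolean {
  try {
    JSON.stringify(value);
    return true;
  } catch {
    return false;
  }
}

const circular: Record<string, unknown> = {};
circular.self = circular;

console.log(isSerializable({ userId: 'user_123' })); // true
console.log(isSerializable(circular));               // false
```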
If your agent runs longer than 20 minutes:
const r = run({ timeout: 3600000 }); // 1 hour
Ensure steps are properly ended:
const step = r.step();
// ... set data ...
step.end(); // Don't forget this!

Next Steps

  • Configuration: Learn about configuration options
  • Sessions: Learn about tracking conversational sessions
  • Feedback: Set up feedback collection
  • API Reference: Complete API documentation
