The contextcompany Python package provides manual instrumentation for AI agent observability.
Installation
pip install contextcompany
Package Overview
run(): Track end-to-end agent executions
step(): Instrument individual LLM calls
tool_call(): Track tool and function invocations
submit_feedback(): Collect user feedback on runs
Core Functions
run()
Create a new run to track an end-to-end agent execution.
from contextcompany import run
r = run(
    run_id: Optional[str] = None,
    session_id: Optional[str] = None,
    conversational: Optional[bool] = None,
    api_key: Optional[str] = None,
    tcc_url: Optional[str] = None
)
run_id: Custom run identifier. Auto-generates a UUID if omitted.
session_id: Group multiple runs into a logical session (e.g., a conversation).
conversational: Mark this run as part of a multi-turn conversation.
api_key: TCC API key. Defaults to the TCC_API_KEY environment variable.
tcc_url: Custom ingestion endpoint URL.
Run Methods
Set the user prompt that initiated the run.
r.prompt(
    user_prompt: str,
    system_prompt: Optional[str] = None
) -> Run
r.prompt('What is the weather in SF?')
# or with system prompt
r.prompt(
    user_prompt='Summarize this article',
    system_prompt='You are a helpful assistant'
)
Set the agent’s final response to the user. r.response(text: str) -> Run
Attach key-value metadata to the run. Values must be strings.
r.metadata(
    data: Optional[Dict[str, str]] = None,
    **kwargs: str
) -> Run
r.metadata({'agent': 'weather-bot', 'version': '1.0'})
# or use kwargs
r.metadata(agent='weather-bot', version='1.0')
Set the outcome status code and optional message.
r.status(
    code: int,
    message: Optional[str] = None
) -> Run
0 = success (default)
2 = error
Create a new Step attached to this run. r.step(step_id: Optional[str] = None) -> Step
Create a new ToolCall attached to this run.
r.tool_call(
    tool_name: Optional[str] = None,
    tool_call_id: Optional[str] = None
) -> ToolCall
Submit user feedback for this run.
r.feedback(
    score: Optional[Literal['thumbs_up', 'thumbs_down']] = None,
    text: Optional[str] = None
) -> bool
Finalize and send the run. r.end()
Requires prompt() to have been called. Raises RuntimeError if the run has already ended, or ValueError if no prompt was set.
End the run with error status (code 2). r.error(status_message: str = '') -> None
Read-only property that returns the run’s unique identifier. print(r.run_id)  # 'run_abc123'
step()
Create a standalone step to track an individual LLM call.
from contextcompany import step
s = step(
    run_id: str,
    step_id: Optional[str] = None,
    api_key: Optional[str] = None,
    tcc_url: Optional[str] = None
)
run_id: The parent run identifier.
step_id: Custom step identifier. Auto-generates a UUID if omitted.
Step Methods
Set the prompt sent to the LLM. s.prompt(text: str) -> Step
Set the LLM’s response text. s.response(text: str) -> Step
Set which model was used.
s.model(
    requested: Optional[str] = None,
    used: Optional[str] = None
) -> Step
s.model(requested='gpt-4o', used='gpt-4o-2024-08-06')
# or just one
s.model(requested='gpt-4o')
Set the model’s finish/stop reason. s.finish_reason(reason: str) -> Step
Example values: 'stop', 'length', 'tool_calls'
Record token usage for this step.
s.tokens(
    prompt_uncached: Optional[int] = None,
    prompt_cached: Optional[int] = None,
    completion: Optional[int] = None
) -> Step
s.tokens(
    prompt_uncached=120,
    prompt_cached=30,
    completion=45
)
Set the actual cost of this step in USD. s.cost(real_total: float) -> Step
Set the tool definitions available during this step. s.tool_definitions(definitions: str) -> Step
Pass a JSON string of tool schemas.
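As a sketch, an OpenAI-style function schema (the schema shape is an assumption here; pass whatever format your provider expects) can be serialized with json.dumps before being attached via s.tool_definitions():

```python
import json

# Hypothetical tool schema; the OpenAI function-calling shape is an assumption.
tools = [
    {
        'type': 'function',
        'function': {
            'name': 'get_weather',
            'description': 'Look up current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {'city': {'type': 'string'}},
                'required': ['city'],
            },
        },
    }
]
tool_definitions_json = json.dumps(tools)
# s.tool_definitions(tool_definitions_json)  # attach the JSON string to the step
```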
Set the outcome status code and optional message.
s.status(
    code: int,
    message: Optional[str] = None
) -> Step
Create a new ToolCall attached to this step’s run.
s.tool_call(
    tool_name: Optional[str] = None,
    tool_call_id: Optional[str] = None
) -> ToolCall
Finalize and send the step. s.end()
Requires both prompt() and response() to have been called.
End the step with error status (code 2). s.error(status_message: str = '') -> None
tool_call()
Create a standalone tool call to track a tool/function invocation.
from contextcompany import tool_call
tc = tool_call(
    run_id: str,
    tool_call_id: Optional[str] = None,
    tool_name: Optional[str] = None,
    api_key: Optional[str] = None,
    tcc_url: Optional[str] = None
)
run_id: The parent run identifier.
tool_call_id: Custom tool call identifier. Auto-generates a UUID if omitted.
tool_name: The name of the tool being invoked.
ToolCall Methods
Set the tool name. tc.name(tool_name: str) -> ToolCall
Set the arguments passed to the tool. Dicts are auto-serialized to JSON. tc.args(value: Union[str, Dict[str, Any]]) -> ToolCall
tc.args({'city': 'San Francisco', 'units': 'metric'})
# or as JSON string
tc.args('{"city": "San Francisco"}')
Set the return value from the tool. Dicts are auto-serialized to JSON. tc.result(value: Union[str, Dict[str, Any]]) -> ToolCall
Set the outcome status code and optional message.
tc.status(
    code: int,
    message: Optional[str] = None
) -> ToolCall
Finalize and send the tool call. tc.end()
Requires name() to have been called.
End the tool call with error status (code 2). tc.error(status_message: str = '') -> None
submit_feedback()
Submit user feedback for a run.
from contextcompany import submit_feedback
success = submit_feedback(
    run_id: str,
    score: Optional[Literal['thumbs_up', 'thumbs_down']] = None,
    text: Optional[str] = None,
    api_key: Optional[str] = None,
    tcc_url: Optional[str] = None
) -> bool
run_id: The run identifier to attach feedback to.
score: Binary feedback score, 'thumbs_up' | 'thumbs_down'.
text: Optional feedback text (max 2000 characters).
At least one of score or text must be provided.
from contextcompany import submit_feedback
# Score only
submit_feedback(run_id='run_123', score='thumbs_up')
# Text only
submit_feedback(run_id='run_123', text='Great response!')
# Both
submit_feedback(
    run_id='run_123',
    score='thumbs_up',
    text='Exactly what I needed'
)
Configuration
Environment Variables
TCC_API_KEY: Your Observatory API key. Keys starting with dev_ route to the development environment.
Override the default ingestion endpoint URL.
Override the default feedback endpoint URL.
Set to 'true' to enable debug logging.
Helper Functions
from contextcompany import get_api_key, get_url
# Get API key from environment
api_key = get_api_key(api_key: Optional[str] = None) -> str
# Get endpoint URL with dev/prod selection
url = get_url(
    prod_url: str,
    dev_url: str,
    tcc_url: Optional[str] = None,
    api_key: Optional[str] = None
) -> str
Complete Examples
Basic Run
from contextcompany import run
r = run(session_id='session_123')
r.prompt('What is the weather in San Francisco?')
r.metadata(agent='weather-bot', version='1.0')
# ... agent logic
r.response('It is 72°F and sunny.')
r.end()
Run with Steps
from contextcompany import run
import json
r = run(session_id='session_456', conversational=True)
r.prompt('Analyze this data')
# Step 1: First LLM call
s1 = r.step()
s1.prompt(json.dumps(messages))
s1.response(assistant_response)
s1.model(requested='gpt-4o', used='gpt-4o-2024-08-06')
s1.tokens(prompt_uncached=120, prompt_cached=30, completion=45)
s1.cost(0.0042)
s1.end()
# Step 2: Follow-up call
s2 = r.step()
s2.prompt(json.dumps(followup_messages))
s2.response(followup_response)
s2.model(requested='gpt-4o', used='gpt-4o-2024-08-06')
s2.tokens(prompt_uncached=80, completion=30)
s2.cost(0.0028)
s2.end()
r.response('Analysis complete.')
r.end()
Run with Tool Calls
from contextcompany import run
r = run()
r.prompt('Get the weather for San Francisco')
# Tool call
tc = r.tool_call('get_weather')
tc.args({'city': 'San Francisco', 'units': 'imperial'})
# Execute tool
weather_data = {'temp': 72, 'condition': 'sunny'}
tc.result(weather_data)
tc.end()
r.response('It is 72°F and sunny in San Francisco.')
r.end()
Error Handling
from contextcompany import run
r = run()
try:
    r.prompt('Process this data')
    # ... agent logic that might fail
    result = process_data()
    r.response(result)
    r.end()
except Exception as e:
    r.error(f'Processing failed: {e}')
Feedback Collection
from contextcompany import run, submit_feedback
# Execute run
r = run()
r.prompt('Help me with this task')
r.response('Here is how to do it...')
r.end()
# Later, collect feedback
success = submit_feedback(
    run_id=r.run_id,
    score='thumbs_up',
    text='Very helpful!'
)
if success:
    print('Feedback submitted')
Standalone Step (Advanced)
from contextcompany import step
import json
# Create a step independently
s = step(run_id='run_existing_123')
s.prompt(json.dumps(messages))
s.response(response_text)
s.model(requested='claude-3-5-sonnet')
s.tokens(prompt_uncached=150, completion=60)
s.finish_reason('stop')
s.end()
LiteLLM Integration
Observatory provides a callback for LiteLLM that automatically exports each LLM call as an OpenTelemetry span.
Installation
pip install contextcompany litellm
Usage
from contextcompany.litellm import TCCCallback
from contextcompany import run
import litellm
# Configure LiteLLM with TCC callback
litellm.callbacks = [TCCCallback()]
# Create a run
r = run()
r.prompt('What is the capital of France?')
# Make LiteLLM call with run_id in metadata
response = litellm.completion(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'What is the capital of France?'}],
    metadata={'tcc.runId': r.run_id}  # Link to run
)
r.response(response.choices[0].message.content)
r.end()
TCCCallback
from contextcompany.litellm import TCCCallback
callback = TCCCallback(
    api_key: Optional[str] = None,
    endpoint: Optional[str] = None,
    service_name: str = 'litellm'
)
api_key: TCC API key. Defaults to the TCC_API_KEY environment variable.
endpoint: Custom OTLP endpoint URL. Auto-detects based on API key.
service_name: OpenTelemetry service name (default: 'litellm').
Linking LLM Calls to Runs
Pass the run ID in the metadata parameter:
# Option 1: tcc.runId
response = litellm.completion(
    model='gpt-4',
    messages=[...],
    metadata={'tcc.runId': r.run_id}
)
# Option 2: tcc.run_id (snake_case)
response = litellm.completion(
    model='gpt-4',
    messages=[...],
    metadata={'tcc.run_id': r.run_id}
)
Best Practices
Session Tracking
Group related runs into sessions for multi-turn conversations:
from contextcompany import run
import uuid
session_id = str(uuid.uuid4())
# Turn 1
r1 = run(session_id=session_id, conversational=True)
r1.prompt('Hello')
r1.response('Hi! How can I help?')
r1.end()
# Turn 2 (same session)
r2 = run(session_id=session_id, conversational=True)
r2.prompt('What is the weather?')
r2.response('It is sunny.')
r2.end()
Metadata
Use metadata to add searchable context:
r.metadata(
    user_id='user_123',
    agent_version='2.1.0',
    environment='production',
    feature_flag='new-model'
)
Token Tracking
Always report token usage for cost tracking:
s.tokens(
    prompt_uncached=response.usage.prompt_tokens,
    completion=response.usage.completion_tokens
)
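When a provider reports prompt caching, the cached share can be split out before calling s.tokens(). This sketch assumes an OpenAI-style usage object with prompt_tokens_details.cached_tokens; field names vary by provider:

```python
from types import SimpleNamespace

# Stand-in for a provider response's usage object; the
# prompt_tokens_details.cached_tokens shape is an OpenAI-style assumption.
usage = SimpleNamespace(
    prompt_tokens=150,
    completion_tokens=45,
    prompt_tokens_details=SimpleNamespace(cached_tokens=30),
)
cached = getattr(usage.prompt_tokens_details, 'cached_tokens', 0) or 0
prompt_uncached = usage.prompt_tokens - cached  # 120 uncached prompt tokens
# s.tokens(prompt_uncached=prompt_uncached, prompt_cached=cached,
#          completion=usage.completion_tokens)
```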
Error Reporting
Use .error() instead of .end() when failures occur:
try:
    # agent logic
    r.end()
except ValidationError as e:
    r.error(f'Validation failed: {e}')
except Exception as e:
    r.error(f'Unexpected error: {e}')
Next Steps
TypeScript SDKs: TypeScript SDK reference documentation
Quickstart: Get started with Observatory in 5 minutes
Python API: Complete Python API reference
Configuration: Configuration guide