
Overview

The Observatory Python SDK provides automatic instrumentation for LiteLLM via the TCCCallback class. This callback exports each LLM call as an OpenTelemetry span, allowing you to track LiteLLM completions without manual instrumentation.

Installation

Install the SDK with LiteLLM support:
pip install "contextcompany[litellm]"
This installs:
  • litellm
  • opentelemetry-sdk
  • opentelemetry-exporter-otlp-proto-http

Basic Setup

Register the callback with LiteLLM:
from contextcompany.litellm import TCCCallback
import litellm

# Set up the callback
litellm.callbacks = [TCCCallback()]

# Make LLM calls as usual
response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

Linking to Runs

To associate LiteLLM calls with Observatory runs, pass the run ID in metadata:
from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

# Create a run
r = run()

# Link LiteLLM calls to this run
response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather?"}],
    metadata={"tcc.runId": r.run_id}
)

r.prompt(user_prompt="What's the weather?")
r.response(response.choices[0].message.content)
r.end()

Class Reference

TCCCallback

from contextcompany.litellm import TCCCallback

TCCCallback(
    api_key: Optional[str] = None,
    endpoint: Optional[str] = None,
    service_name: str = "litellm"
)

Parameters

api_key
str
Observatory API key. If not provided, uses the TCC_API_KEY environment variable.
endpoint
str
Custom OpenTelemetry endpoint URL. If not provided, uses the default Observatory OTEL endpoint.
service_name
str
default:"litellm"
Service name for OpenTelemetry traces.

What Gets Tracked

The callback automatically captures:
  • Model Information: Requested model and actual model used
  • Messages: Input messages and output content
  • Token Usage: Input tokens, output tokens, and cached tokens
  • Tool Calls: Function/tool invocations and their arguments
  • Finish Reasons: Why the LLM stopped generating (stop, length, tool_calls, etc.)
  • Run Association: Links to Observatory runs via metadata.tcc.runId
  • Errors: Failed LLM calls with error messages
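
The fields listed above map onto standard OpenAI-style completion responses. As an illustration only (the callback's internal extraction logic is not shown here), this sketch pulls those fields out of a response-shaped dict, using a mocked response:

```python
def extract_tracked_fields(resp: dict) -> dict:
    """Pull the fields the callback records from an OpenAI-style response dict."""
    choice = resp["choices"][0]
    usage = resp.get("usage", {})
    return {
        "model_used": resp.get("model"),           # actual model used
        "output": choice["message"].get("content"),
        "finish_reason": choice.get("finish_reason"),
        "input_tokens": usage.get("prompt_tokens"),
        "output_tokens": usage.get("completion_tokens"),
        "tool_calls": choice["message"].get("tool_calls"),
    }

# Mocked response for illustration; real calls return a litellm ModelResponse
mock = {
    "model": "gpt-4-0613",
    "choices": [{"message": {"content": "Hi!"}, "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 2},
}
```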

Metadata Keys

Use these metadata keys to control tracking:
tcc.runId
str
Associate this LLM call with an Observatory run. Use r.run_id from your run object.
tcc.run_id
str
Alternative key name for run association (snake_case variant).
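
The two keys are interchangeable. A small helper (hypothetical, not part of the SDK) can make the choice explicit; the dict it builds is exactly what you pass to litellm.completion:

```python
def run_metadata(run_id: str, snake_case: bool = False) -> dict:
    """Build the metadata dict that links a LiteLLM call to an Observatory run.

    Illustrative helper only -- the SDK just expects the raw dict.
    """
    key = "tcc.run_id" if snake_case else "tcc.runId"
    return {key: run_id}

# Usage: litellm.completion(..., metadata=run_metadata(r.run_id))
```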

Usage Examples

Basic Chat Completion

from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

response = litellm.completion(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
)

print(response.choices[0].message.content)

With Run Tracking

from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

r = run()

response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    metadata={"tcc.runId": r.run_id}
)

r.prompt(user_prompt="Tell me a joke")
r.response(response.choices[0].message.content)
r.end()

Function Calling

from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm
import json

litellm.callbacks = [TCCCallback()]

r = run()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=tools,
    metadata={"tcc.runId": r.run_id}
)

tool_call = response.choices[0].message.tool_calls[0]
print(f"Calling {tool_call.function.name} with {tool_call.function.arguments}")

r.prompt(user_prompt="What's the weather in NYC?")
r.response(f"Need to call {tool_call.function.name}")
r.end()

Streaming Responses

from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

r = run()

response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True,
    metadata={"tcc.runId": r.run_id}
)

full_response = ""
for chunk in response:
    content = chunk.choices[0].delta.content or ""
    full_response += content
    print(content, end="", flush=True)

print()  # New line

r.prompt(user_prompt="Write a short story")
r.response(full_response)
r.end()

Multiple Models

from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

r = run()

# First call with GPT-4
response1 = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain AI"}],
    metadata={"tcc.runId": r.run_id}
)

# Second call with Claude
response2 = litellm.completion(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Summarize the above"}],
    metadata={"tcc.runId": r.run_id}
)

r.prompt(user_prompt="Explain AI and summarize")
r.response(response2.choices[0].message.content)
r.end()

Custom Endpoint

from contextcompany.litellm import TCCCallback
import litellm

# Use a custom Observatory instance
callback = TCCCallback(
    api_key="your_api_key",
    endpoint="https://custom.example.com/otel-steps",
    service_name="my-agent"
)

litellm.callbacks = [callback]

response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

Best Practices

  1. Set up the callback once: initialize TCCCallback at application startup and reuse it for all LiteLLM calls.
  2. Always pass run ID: Include metadata={"tcc.runId": r.run_id} to link LLM calls to runs.
  3. Handle streaming: For streaming responses, collect the full response before calling r.response().
  4. Track function calls separately: Use tool_call() to track tool invocations alongside LiteLLM’s automatic step tracking.
  5. Use environment variables: Set TCC_API_KEY instead of hardcoding credentials.

Environment Variables

TCC_API_KEY
string
required
Your Observatory API key
TCC_URL
string
Custom Observatory endpoint URL (for the OTEL exporter)
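
A typical shell setup might look like this (both values are placeholders):

```shell
# Required: Observatory API key (used when api_key isn't passed to TCCCallback)
export TCC_API_KEY="your_api_key"

# Optional: point the OTEL exporter at a custom Observatory endpoint
export TCC_URL="https://custom.example.com/otel-steps"
```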

Comparison: Manual vs Auto-instrumentation

Manual Instrumentation

from contextcompany import run
import openai

r = run()
s = r.step()

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

s.prompt("Hello!")
s.response(response.choices[0].message.content)
s.model(requested="gpt-4", used=response.model)
s.tokens(
    prompt_uncached=response.usage.prompt_tokens,
    completion=response.usage.completion_tokens
)
s.end()

r.prompt(user_prompt="Hello!")
r.response(response.choices[0].message.content)
r.end()

Auto-instrumentation with LiteLLM

from contextcompany import run
from contextcompany.litellm import TCCCallback
import litellm

litellm.callbacks = [TCCCallback()]

r = run()

# Steps are automatically created and tracked
response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    metadata={"tcc.runId": r.run_id}
)

r.prompt(user_prompt="Hello!")
r.response(response.choices[0].message.content)
r.end()
Auto-instrumentation reduces boilerplate and ensures consistent tracking across all LiteLLM calls.

Limitations

  • LiteLLM only: This callback only works with LiteLLM. For other LLM libraries, use manual instrumentation with step().
  • Run association required: You must pass metadata.tcc.runId to link LLM calls to runs.
  • OTEL overhead: OpenTelemetry adds some performance overhead compared to direct HTTP calls.

Troubleshooting

Spans not appearing in Observatory

  1. Verify your API key is set: echo $TCC_API_KEY
  2. Check that metadata includes tcc.runId: metadata={"tcc.runId": r.run_id}
  3. Ensure the callback is registered: litellm.callbacks = [TCCCallback()]
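
Steps 1 and 3 can also be checked programmatically at startup. This is a sketch, assuming only that the registered callback instance appears in litellm.callbacks; the function name is hypothetical:

```python
import os

def diagnose_tcc_setup(callbacks=None):
    """Return a list of human-readable problems with the Observatory setup."""
    problems = []
    if not os.environ.get("TCC_API_KEY"):
        problems.append("TCC_API_KEY is not set")
    if not callbacks:
        problems.append("no LiteLLM callbacks registered")
    return problems

# Usage: diagnose_tcc_setup(litellm.callbacks) before making completions
```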

Missing token counts

Some models don’t return token usage. LiteLLM will estimate tokens when possible.

Import errors

Make sure you installed with the litellm extra:
pip install "contextcompany[litellm]"
