Overview
The step() function creates a new step to track an individual LLM call within a run. Steps capture detailed information about model interactions, including prompts, responses, token usage, costs, and tool definitions.
Function Signature
from contextcompany import step
step(
run_id: str,
step_id: Optional[str] = None,
api_key: Optional[str] = None,
tcc_url: Optional[str] = None,
) -> Step
Parameters
run_id (str, required)
The run ID that this step belongs to. Get this from r.run_id.
step_id (str, optional)
Unique identifier for this step. If not provided, a UUID is generated automatically.
api_key (str, optional)
Observatory API key. If not provided, uses the TCC_API_KEY environment variable.
tcc_url (str, optional)
Custom Observatory endpoint URL. If not provided, uses the TCC_URL environment variable or defaults to production.
Returns
Returns a Step object with the following methods:
prompt()
Set the prompt sent to the LLM:
s.prompt(text: str) -> Step
The prompt text or serialized messages array sent to the LLM
response()
Set the response from the LLM:
s.response(text: str) -> Step
The response text from the LLM
model()
Set the model information:
s.model(
requested: Optional[str] = None,
used: Optional[str] = None
) -> Step
requested: The model that was requested (e.g., "gpt-4")
used: The model that was actually used (e.g., "gpt-4-0613")
finish_reason()
Set the reason why the LLM stopped generating:
s.finish_reason(reason: str) -> Step
Finish reason: "stop", "length", "tool_calls", "content_filter", etc.
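Providers do not all use the same names for these reasons. Below is a minimal normalizer sketch, assuming Anthropic-style stop_reason values ("end_turn", "max_tokens", "tool_use", "stop_sequence"); verify the names against your provider's documentation:

```python
# Hypothetical mapping from Anthropic-style stop_reason values to the
# finish_reason vocabulary above; check your provider's actual names.
ANTHROPIC_TO_FINISH = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}

def normalize_finish_reason(stop_reason: str) -> str:
    # Fall back to the raw value for reasons we do not recognize.
    return ANTHROPIC_TO_FINISH.get(stop_reason, stop_reason)
```

The normalized value can then be passed straight to s.finish_reason().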
tokens()
Set token usage information:
s.tokens(
prompt_uncached: Optional[int] = None,
prompt_cached: Optional[int] = None,
completion: Optional[int] = None
) -> Step
prompt_uncached: Number of uncached prompt tokens
prompt_cached: Number of cached prompt tokens
completion: Number of completion tokens generated
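If your provider reports usage in an OpenAI-style payload, where prompt_tokens includes cached tokens and the cached count appears under prompt_tokens_details.cached_tokens, the split can be derived like this (a sketch; the field names are an assumption about your provider's response shape):

```python
def split_usage(usage: dict) -> dict:
    """Map an OpenAI-style usage payload onto s.tokens() keyword arguments.

    Assumes usage["prompt_tokens"] counts cached and uncached tokens together;
    verify this against your provider's documentation.
    """
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    return {
        "prompt_uncached": usage["prompt_tokens"] - cached,
        "prompt_cached": cached,
        "completion": usage["completion_tokens"],
    }
```

You would then call s.tokens(**split_usage(response["usage"])).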
cost()
Set the actual cost of this LLM call:
s.cost(real_total: float) -> Step
Total cost in dollars (e.g., 0.002 for $0.002)
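Prefer the cost your provider actually reports. When none is available, one option is to estimate from token counts; the prices below are hypothetical placeholders, not real rates:

```python
# Hypothetical per-1K-token prices for illustration only; use your
# provider's current price sheet in real code.
PRICES_PER_1K = {"gpt-4": {"prompt": 0.03, "completion": 0.06}}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the dollar cost of a call, suitable for passing to s.cost()."""
    p = PRICES_PER_1K[model]
    cost = prompt_tokens / 1000 * p["prompt"] + completion_tokens / 1000 * p["completion"]
    return round(cost, 6)
```

For example, s.cost(estimate_cost("gpt-4", 25, 10)) records roughly a tenth of a cent.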
tool_definitions()
Set the tool/function definitions provided to the LLM:
s.tool_definitions(definitions: str) -> Step
JSON string or text representation of tool definitions
status()
Set the status code and optional message:
s.status(
code: int,
message: Optional[str] = None
) -> Step
code: Status code: 0 = success, 1 = partial success, 2 = error
message: Optional human-readable status message
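As an illustration of the 0/1/2 convention, here is a hypothetical helper (not part of the SDK) that maps batch outcomes onto a status code and message:

```python
def status_for(failed: int, total: int) -> tuple:
    """Map batch outcomes onto the 0/1/2 status convention described above."""
    if failed == 0:
        return 0, "success"
    if failed < total:
        return 1, f"partial success: {failed}/{total} calls failed"
    return 2, "error: all calls failed"
```

You would then call s.status(*status_for(failed=1, total=3)).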
tool_call()
Create a child tool call:
s.tool_call(
tool_name: Optional[str] = None,
tool_call_id: Optional[str] = None
) -> ToolCall
See tool_call() for full documentation.
end()
Finalize and send the step data:
You must call both s.prompt() and s.response() before calling s.end(), or a ValueError will be raised.
error()
Mark the step as failed and send immediately:
s.error(status_message: str = "") -> None
Error message describing what went wrong
Usage Examples
Creating Steps from a Run
from contextcompany import run
r = run()
# Create a step from the run object (recommended)
s = r.step()
s.prompt("Summarize this text...")
s.response("Here is a summary...")
s.end()
r.prompt(user_prompt="User query")
r.response("Final response")
r.end()
Creating Steps Independently
from contextcompany import run, step
r = run()
# Create a step using the run_id
s = step(run_id=r.run_id)
s.prompt("Analyze this data...")
s.response("Based on the analysis...")
s.end()
r.prompt(user_prompt="User query")
r.response("Final response")
r.end()
Complete Step with All Fields
from contextcompany import run
import json
r = run()
s = r.step()
# Set prompt and response
s.prompt(json.dumps([
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is 2+2?"}
]))
s.response("The answer is 4.")
# Set model information
s.model(requested="gpt-4", used="gpt-4-0613")
# Set token usage
s.tokens(
prompt_uncached=25,
prompt_cached=0,
completion=10
)
# Set cost
s.cost(0.001)
# Set finish reason
s.finish_reason("stop")
s.end()
r.prompt(user_prompt="What is 2+2?")
r.response("The answer is 4.")
r.end()
Step with Tool Definitions
from contextcompany import run
import json
r = run()
s = r.step()
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}
]
s.prompt("What's the weather in SF?")
s.tool_definitions(json.dumps(tools))
s.response("I'll check the weather for you.")
s.finish_reason("tool_calls")
s.end()
r.prompt(user_prompt="What's the weather?")
r.response("It's sunny!")
r.end()
Multiple Steps in Sequence
from contextcompany import run
r = run()
# First step: planning
s1 = r.step()
s1.prompt("Plan how to solve this problem...")
s1.response("I will first analyze the data, then...")
s1.model(requested="gpt-4", used="gpt-4")
s1.tokens(prompt_uncached=50, completion=30)
s1.end()
# Second step: execution
s2 = r.step()
s2.prompt("Execute the plan...")
s2.response("Analysis complete. Results show...")
s2.model(requested="gpt-4", used="gpt-4")
s2.tokens(prompt_uncached=100, completion=75)
s2.end()
# Finalize run
r.prompt(user_prompt="Solve this problem")
r.response("Problem solved successfully.")
r.end()
Error Handling
from contextcompany import run
r = run()
s = r.step()
s.prompt("Generate a summary...")
try:
response = call_llm()
s.response(response)
s.end()
except Exception as e:
s.error(f"LLM call failed: {str(e)}")
# Continue with run even if step failed
r.prompt(user_prompt="User query")
r.response("Unable to complete due to error.")
r.status(2, "Error in LLM call")
r.end()
Tracking Cached vs Uncached Tokens
from contextcompany import run
r = run()
s = r.step()
s.prompt("Reuse previous context...")
s.response("Using cached context...")
# Track cache usage
s.tokens(
prompt_uncached=10, # Only 10 new tokens
prompt_cached=500, # 500 tokens from cache
completion=50
)
# Significant cost savings from caching
s.cost(0.0005) # Much lower cost than without cache
s.end()
r.prompt(user_prompt="User query")
r.response("Response")
r.end()
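To quantify what caching saved you, compare the discounted cached tokens against full price. A sketch assuming cached tokens are billed at half the normal rate, which is a placeholder assumption; check your provider's actual cache pricing:

```python
def cache_savings(prompt_cached: int, price_per_1k: float,
                  cached_discount: float = 0.5) -> float:
    """Dollars saved by serving `prompt_cached` tokens from cache.

    Assumes cached tokens cost `cached_discount` times the full rate;
    this discount is a placeholder, not a real provider rate.
    """
    full_price = prompt_cached / 1000 * price_per_1k
    return round(full_price * (1 - cached_discount), 6)
```

With the numbers above, 500 cached tokens at a hypothetical $0.03 per 1K saves cache_savings(500, 0.03) dollars.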
Best Practices
- Always set prompt and response: Both are required before calling s.end().
- Track token usage: Use s.tokens() to monitor and optimize token consumption.
- Record actual costs: Use s.cost() to track real spending, not just estimated costs.
- Use tool_definitions: When using function calling, record the tools provided to the LLM.
- Handle errors gracefully: Use s.error() to capture and report failures without breaking your agent.
- Create steps from runs: Use r.step() instead of step(run_id=...) for cleaner code.
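To guarantee that every step is finalized even when the LLM call raises, you can wrap the lifecycle in a context manager. tracked_step below is a hypothetical helper, not part of the SDK; it calls s.end() on success and s.error() on failure:

```python
from contextlib import contextmanager

@contextmanager
def tracked_step(run):
    """Yield a step and guarantee it is finalized exactly once.

    Hypothetical convenience wrapper, not part of the SDK: calls s.end()
    when the body completes, s.error() (then re-raises) when it fails.
    """
    s = run.step()
    try:
        yield s
        s.end()
    except Exception as e:
        s.error(f"step failed: {e}")
        raise
```

Usage then becomes `with tracked_step(r) as s: s.prompt(...); s.response(call_llm())`, with no way to forget the final s.end().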
See Also