State
The State type is a specialized dict subclass that holds all runtime information for a rollout, including inputs, outputs, trajectory, and metadata.
Overview
State provides a unified interface for accessing both:
- Input fields: From the dataset row (prompt, answer, task, info, example_id)
- Runtime fields: Created during rollout execution (trajectory, reward, completion, etc.)
Reads and writes of input fields are forwarded to the nested state["input"] dict, so state["prompt"] and state["input"]["prompt"] refer to the same value.
Type Definition
class State(dict):
    INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]

    # Input fields (from dataset)
    input: RolloutInput
    client: Client
    model: str
    sampling_args: SamplingArgs | None

    # Created during rollout
    is_completed: bool
    is_truncated: bool
    stop_condition: str | None
    tool_defs: list[Tool]
    trajectory: list[TrajectoryStep]
    completion: Messages | None
    reward: float | None
    advantage: float | None
    metrics: dict[str, float] | None
    timing: RolloutTiming | None
    error: Error | None
    usage: TokenUsage | None
    usage_tracker: object
Input Fields
These fields come from the dataset and are stored in state["input"]:
- prompt: The input prompt as a list of messages.
- answer: Ground truth answer or reference data for scoring.
- task: Task identifier (e.g., “math”, “coding”, “gsm8k”).
- info: Additional metadata from the dataset (arbitrary dict).
- example_id: Unique integer ID for this example.
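To illustrate where these fields come from, the sketch below builds a nested input dict from one dataset row. `make_input` is a hypothetical helper, not part of verifiers; it assumes that only the five input fields are copied and that fields missing from the row are simply left out.

```python
from typing import Any

# Mirrors State.INPUT_FIELDS from the type definition.
INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]


def make_input(row: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: keep only the input fields present in a dataset row."""
    return {key: row[key] for key in INPUT_FIELDS if key in row}


# Made-up dataset row for illustration.
row = {
    "prompt": [{"role": "user", "content": "What is 6 * 7?"}],
    "answer": "42",
    "task": "math",
    "example_id": 0,
    "split": "train",  # not an input field, so it is dropped
}
rollout_input = make_input(row)
```

The resulting dict is what would sit under state["input"] in this sketch; anything else in the row would have to travel inside info.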
Runtime Fields
- client: The API client instance for model calls.
- model: Model identifier (e.g., “gpt-4”, “claude-3-5-sonnet-20241022”).
- sampling_args: Sampling parameters (temperature, top_p, etc.).
- is_completed: Whether the rollout completed successfully.
- is_truncated: Whether the rollout was truncated (max turns, length limit, etc.).
- stop_condition: Name of the stop condition that ended the rollout.
- tool_defs: Tool definitions available during this rollout.
- trajectory: Complete turn-by-turn trajectory (prompts, completions, rewards).
- completion: Final completion (last assistant message or concatenated messages).
- reward: Total reward for this rollout.
- advantage: Advantage value (for group scoring).
- metrics: Named metric scores (e.g., {"correctness": 1.0, "length": 0.8}).
- timing: Timing information (start_time, generation_ms, scoring_ms, total_ms).
- error: Error object if rollout failed.
- usage: Token usage (input_tokens, output_tokens).
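As a rough illustration of how these runtime fields combine, the helper below turns a state into a one-line status. `summarize_rollout` is a hypothetical name, the order in which the flags are checked is an assumption, and the stop condition name in the example is made up.

```python
from typing import Any, Mapping


def summarize_rollout(state: Mapping[str, Any]) -> str:
    """One-line status built from runtime fields; works on any dict-like state."""
    if state.get("error") is not None:
        status = f"failed ({type(state['error']).__name__})"
    elif state.get("is_truncated"):
        status = f"truncated ({state.get('stop_condition')})"
    elif state.get("is_completed"):
        status = f"completed ({state.get('stop_condition')})"
    else:
        status = "in progress"
    reward = state.get("reward")
    reward_text = "unscored" if reward is None else f"{reward:.2f}"
    return f"{status}, reward={reward_text}"


# Plain-dict stand-in for a finished rollout; the values are made up.
example = {
    "is_completed": True,
    "is_truncated": True,
    "stop_condition": "max_turns",
    "reward": 0.5,
    "error": None,
}
summary = summarize_rollout(example)
```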
Special Behavior
Accessing input fields automatically looks in state["input"]:
state = State({
    "input": {
        "prompt": [{"role": "user", "content": "Hello"}],
        "answer": "42",
        "task": "qa",
        "example_id": 0
    }
})

# These are equivalent:
state["prompt"]           # Returns the prompt
state["input"]["prompt"]  # Same result

# Setting also forwards:
state["answer"] = "43"
assert state["input"]["answer"] == "43"
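The forwarding described here can be reproduced with a small dict subclass. The class below is a toy stand-in written for illustration, not the verifiers implementation; `ForwardingState` and the rule of falling back to the top level when the nested dict lacks the key are assumptions.

```python
from typing import Any


class ForwardingState(dict):
    """Toy stand-in for State: forwards input fields to self["input"]."""

    INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]

    def __getitem__(self, key: str) -> Any:
        # Look in the nested input dict for input fields, if it holds the key.
        if key in self.INPUT_FIELDS and "input" in self:
            nested = super().__getitem__("input")
            if key in nested:
                return nested[key]
        return super().__getitem__(key)

    def __setitem__(self, key: str, value: Any) -> None:
        # Write input fields into the nested dict so both views stay in sync.
        if key in self.INPUT_FIELDS and "input" in self:
            super().__getitem__("input")[key] = value
        else:
            super().__setitem__(key, value)


state = ForwardingState({"input": {"prompt": "Hello", "answer": "42"}})
state["answer"] = "43"  # forwarded into state["input"]
state["reward"] = 1.0   # not an input field, stored at the top level
```

In this sketch the input fields never exist as top-level keys, so there is a single copy of each value and the two access paths cannot drift apart.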
get() Method
def get(self, key: str, default: Any = None) -> Any
Safe access with default fallback:
reward = state.get("reward", 0.0) # Returns 0.0 if not set
task = state.get("task") # Returns None if not set
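One detail worth knowing when reasoning about get() on a dict subclass: in CPython the built-in dict.get does not call an overridden __getitem__, so forwarded keys are only visible through get() if the subclass overrides it as well. The toy classes below demonstrate this; they are illustrative and not taken from verifiers.

```python
from typing import Any


class Forwarding(dict):
    """Toy dict subclass that forwards "task" to self["input"]."""

    def __getitem__(self, key: str) -> Any:
        if key == "task":
            return super().__getitem__("input")["task"]
        return super().__getitem__(key)


class ForwardingWithGet(Forwarding):
    def get(self, key: str, default: Any = None) -> Any:
        # Route through __getitem__ so forwarded keys are found.
        try:
            return self[key]
        except KeyError:
            return default


data = {"input": {"task": "qa"}}
plain = Forwarding(data).get("task")         # inherited dict.get: no forwarding
fixed = ForwardingWithGet(data).get("task")  # overridden get: forwarded
missing = ForwardingWithGet(data).get("reward", 0.0)
```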
Example Usage
Basic Access
import verifiers as vf

# In a reward function
def reward_fn(state: vf.State) -> float:
    # Access input fields
    answer = state["answer"]
    task = state["task"]

    # Access runtime fields
    completion = state["completion"]
    trajectory = state["trajectory"]

    # Check completion
    if not state["is_completed"]:
        return 0.0

    # Compute reward
    return 1.0 if answer in str(completion) else 0.0
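The substring check in reward_fn matches the answer anywhere in the stringified completion, including earlier turns. A stricter variant could compare only the last assistant message. The sketch below assumes dict-style messages with "role" and "content" keys, as in the prompt example; `last_assistant_text` and `exact_match_reward` are hypothetical names, and a plain dict stands in for the State.

```python
from typing import Any, Mapping


def last_assistant_text(completion: Any) -> str:
    """Return the content of the last assistant message, or str(completion)."""
    if isinstance(completion, list):
        for message in reversed(completion):
            if message.get("role") == "assistant":
                return str(message.get("content") or "")
        return ""
    return "" if completion is None else str(completion)


def exact_match_reward(state: Mapping[str, Any]) -> float:
    """Hypothetical variant of reward_fn: compare the final reply to the answer."""
    if not state.get("is_completed"):
        return 0.0
    reply = last_assistant_text(state.get("completion"))
    return 1.0 if reply.strip() == str(state["answer"]).strip() else 0.0


# Plain-dict stand-in for a State; the values are made up for illustration.
demo = {
    "answer": "42",
    "is_completed": True,
    "completion": [
        {"role": "assistant", "content": "Let me think about 42 + 1."},
        {"role": "user", "content": "Final answer?"},
        {"role": "assistant", "content": "43"},
    ],
}
score = exact_match_reward(demo)
```

Here the substring check would award 1.0 because "42" appears in an earlier turn, while the exact-match variant scores the final reply "43" as wrong.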
Trajectory Inspection
def analyze_trajectory(state: vf.State) -> dict:
    """Extract statistics from trajectory."""
    trajectory = state["trajectory"]
    return {
        "num_turns": len(trajectory),
        # Count completion tokens, skipping steps without token data
        "total_tokens": sum(
            len(step["tokens"]["completion_ids"])
            for step in trajectory
            if step.get("tokens")
        ),
        # Count steps whose response contains at least one tool call
        "tool_calls": sum(
            1 for step in trajectory
            if step["response"]["message"].get("tool_calls")
        ),
    }
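To see what these statistics look like, the sketch below applies the same logic to a made-up trajectory. The step dicts contain only the keys that analyze_trajectory reads; real TrajectoryStep objects may carry more fields or a different message type, and `trajectory_stats` is a hypothetical name.

```python
from typing import Any


def trajectory_stats(trajectory: list[dict[str, Any]]) -> dict[str, int]:
    """Same statistics as analyze_trajectory, taking the trajectory directly."""
    return {
        "num_turns": len(trajectory),
        "total_tokens": sum(
            len(step["tokens"]["completion_ids"])
            for step in trajectory
            if step.get("tokens")
        ),
        "tool_calls": sum(
            1 for step in trajectory
            if step["response"]["message"].get("tool_calls")
        ),
    }


# Made-up two-step trajectory for illustration.
sample = [
    {
        "tokens": {"completion_ids": [11, 12, 13]},
        "response": {"message": {"content": "", "tool_calls": [{"name": "search"}]}},
    },
    {
        "tokens": None,  # steps without token data are skipped
        "response": {"message": {"content": "Done"}},
    },
]
stats = trajectory_stats(sample)
```

Note that "tool_calls" counts steps that contain at least one tool call, not the total number of calls.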
Custom State Keys
Environments can add custom keys:
class CustomEnv(vf.MultiTurnEnv):
    async def setup_state(self, state: vf.State) -> vf.State:
        state = await super().setup_state(state)
        # Add custom fields
        state["custom_data"] = {"foo": "bar"}
        state["attempt_count"] = 0
        return state

    async def env_response(
        self,
        messages: vf.Messages,
        state: vf.State,
        **kwargs
    ) -> vf.Messages:
        # Access custom fields
        state["attempt_count"] += 1
        if state["attempt_count"] > 3:
            return [{"role": "user", "content": "Too many attempts!"}]
        return [{"role": "user", "content": "Try again"}]
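Because input field names are forwarded to state["input"], a custom key that reuses one of those names would presumably be written into the input dict, and one that reuses a runtime field name would overwrite it. A simple guard is to check custom keys against the documented names. `RESERVED_KEYS` and `is_safe_custom_key` below are hypothetical and built only from the type definition on this page.

```python
# Names taken from the State type definition; not an official list.
RESERVED_KEYS = frozenset({
    "input", "prompt", "answer", "task", "info", "example_id",
    "client", "model", "sampling_args", "is_completed", "is_truncated",
    "stop_condition", "tool_defs", "trajectory", "completion", "reward",
    "advantage", "metrics", "timing", "error", "usage", "usage_tracker",
})


def is_safe_custom_key(key: str) -> bool:
    """Return True if the key does not collide with a documented State field."""
    return key not in RESERVED_KEYS


checks = {
    key: is_safe_custom_key(key)
    for key in ["attempt_count", "custom_data", "info", "reward"]
}
```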
Safe Error Access
def handle_errors(state: vf.State):
    error = state.get("error")
    if error:
        print(f"Error type: {type(error).__name__}")
        print(f"Error message: {str(error)}")

        # Check error type
        if isinstance(error, vf.SandboxError):
            print("Sandbox operation failed")
        elif isinstance(error, vf.InfraError):
            print("Infrastructure error")
    else:
        print("No errors")
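When the error needs to be logged or stored rather than printed, the same information can be returned as a small dict. The sketch below uses a built-in exception as a stand-in for the library's error classes; `describe_error` is a hypothetical helper.

```python
from typing import Any, Mapping, Optional


def describe_error(state: Mapping[str, Any]) -> Optional[dict[str, str]]:
    """Return a JSON-friendly summary of state["error"], or None if unset."""
    error = state.get("error")
    if error is None:
        return None
    return {"type": type(error).__name__, "message": str(error)}


# TimeoutError stands in for a library error; the message is made up.
failed = describe_error({"error": TimeoutError("sandbox did not respond")})
clean = describe_error({"error": None})
```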
Serialization
State can be converted to RolloutOutput for serialization:
# During environment.generate()
output: vf.RolloutOutput = serialize_state(state)
# RolloutOutput is JSON-serializable:
import json
json.dumps(output) # Works
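A raw state holds objects such as the API client that json.dumps cannot encode, which is why a conversion step is needed. The sketch below shows one naive way to obtain a JSON-safe subset by dropping top-level values that fail to encode. It is not what serialize_state does; `jsonable_subset` is a hypothetical helper and a plain dict stands in for the State.

```python
import json
from typing import Any


def jsonable_subset(state: dict[str, Any]) -> dict[str, Any]:
    """Keep only the top-level entries that json.dumps accepts as-is."""
    kept = {}
    for key, value in state.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # e.g. a client object or an exception instance
        kept[key] = value
    return kept


raw = {
    "input": {"prompt": [{"role": "user", "content": "Hello"}], "answer": "42"},
    "client": object(),  # stand-in for a non-serializable API client
    "reward": 1.0,
}
encoded = json.dumps(jsonable_subset(raw))
```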
Type Annotations
from verifiers.types import State, Messages

def my_reward(state: State) -> float:
    # Type checker knows State has these fields
    prompt: Messages = state["prompt"]
    reward: float | None = state.get("reward")
    return reward or 0.0
Common Patterns
Checking Completion
if state["is_completed"]:
    # Rollout finished successfully
    reward = state["reward"]
else:
    # Rollout was interrupted
    error = state.get("error")

# Read optional dataset metadata, falling back to defaults
info = state.get("info", {})
custom_field = info.get("custom_field", "default")
Iterating Trajectory
for i, step in enumerate(state["trajectory"]):
    print(f"Turn {i}:")
    print(f"  Prompt: {step['prompt']}")
    print(f"  Completion: {step['completion']}")
    print(f"  Reward: {step.get('reward', 0.0)}")
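Note that step.get('reward', 0.0) only falls back to 0.0 when the key is absent; if a step stores reward as None, None is returned. When aggregating per-step rewards it is safer to handle both cases. The sketch below assumes steps are dict-like and that unscored steps may hold None; `total_step_reward` is a hypothetical helper.

```python
from typing import Any


def total_step_reward(trajectory: list[dict[str, Any]]) -> float:
    """Sum per-step rewards, counting missing or None rewards as 0.0."""
    total = 0.0
    for step in trajectory:
        reward = step.get("reward")
        total += 0.0 if reward is None else float(reward)
    return total


# Made-up steps: scored, explicitly None, key missing, scored.
steps = [{"reward": 0.5}, {"reward": None}, {}, {"reward": 1.0}]
total = total_step_reward(steps)
```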
See Also