State

The State type is a specialized dict subclass that holds all runtime information for a rollout, including inputs, outputs, trajectory, and metadata.

Overview

State provides a unified interface for accessing both:

Input fields: From the dataset row (prompt, answer, task, info, example_id)
Runtime fields: Created during rollout execution (trajectory, reward, completion, etc.)

Reads and writes of input fields are forwarded to the nested state["input"] dict, so state["prompt"] and state["input"]["prompt"] refer to the same value.

Type Definition

class State(dict):
    INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]
    
    # Rollout inputs (dataset row plus client, model, and sampling settings)
    input: RolloutInput
    client: Client
    model: str
    sampling_args: SamplingArgs | None
    
    # Created during rollout
    is_completed: bool
    is_truncated: bool
    stop_condition: str | None
    tool_defs: list[Tool]
    trajectory: list[TrajectoryStep]
    completion: Messages | None
    reward: float | None
    advantage: float | None
    metrics: dict[str, float] | None
    timing: RolloutTiming | None
    error: Error | None
    usage: TokenUsage | None
    usage_tracker: object

Input Fields

These fields come from the dataset and are stored in state["input"]:

prompt

Messages

The input prompt as a list of messages.

answer

str | Any

Ground truth answer or reference data for scoring.

task

str

Task identifier (e.g., “math”, “coding”, “gsm8k”).

info

Info

Additional metadata from the dataset (arbitrary dict).

example_id

int

Unique integer ID for this example.

Runtime Fields

client

Client

The API client instance for model calls.

model

str

Model identifier (e.g., “gpt-4”, “claude-3-5-sonnet-20241022”).

sampling_args

SamplingArgs | None

Sampling parameters (temperature, top_p, etc.).

is_completed

bool

Whether the rollout completed successfully.

is_truncated

bool

Whether the rollout was truncated (max turns, length limit, etc.).

stop_condition

str | None

Name of the stop condition that ended the rollout.

tool_defs

list[Tool]

Tool definitions available during this rollout.

trajectory

list[TrajectoryStep]

Complete turn-by-turn trajectory (prompts, completions, rewards).

completion

Messages | None

Final completion (last assistant message or concatenated messages).

reward

float | None

Total reward for this rollout.

advantage

float | None

Advantage value (for group scoring).
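As an illustration of group scoring, one common convention defines each rollout's advantage as its reward minus the mean reward of its group. Whether the library computes advantages exactly this way is an assumption. The helper name assign_group_advantages is hypothetical, and plain dicts stand in for State objects.

```python
from statistics import mean


def assign_group_advantages(states: list[dict]) -> None:
    """Set state["advantage"] to reward minus the group mean (assumed convention)."""
    # Treat a missing or None reward as 0.0 so the baseline is always defined.
    rewards = [s["reward"] if s.get("reward") is not None else 0.0 for s in states]
    baseline = mean(rewards) if rewards else 0.0
    for state, reward in zip(states, rewards):
        state["advantage"] = reward - baseline


group = [{"reward": 1.0}, {"reward": 0.0}, {"reward": 0.5}]
assign_group_advantages(group)
advantages = [s["advantage"] for s in group]
```

Under this convention the advantages within a group sum to zero, so a rollout is only credited for doing better than its siblings.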

metrics

dict[str, float] | None

Named metric scores (e.g., {"correctness": 1.0, "length": 0.8}).

timing

RolloutTiming | None

Timing information (start_time, generation_ms, scoring_ms, total_ms).

error

Error | None

Error object if rollout failed.

usage

TokenUsage | None

Token usage (input_tokens, output_tokens).
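The runtime fields above lend themselves to simple aggregate reporting. The following is a minimal sketch, assuming usage behaves like a dict with input_tokens and output_tokens keys. The helper name summarize_rollouts and the stop condition value "max_turns_reached" are illustrative, and plain dicts stand in for State objects.

```python
from collections import Counter


def summarize_rollouts(states: list[dict]) -> dict:
    """Aggregate stop conditions, truncation, and token usage across rollouts."""
    # A missing or None stop_condition is reported as "unknown".
    stops = Counter(s.get("stop_condition") or "unknown" for s in states)
    # usage may be None, so fall back to an empty dict.
    usages = [s.get("usage") or {} for s in states]
    return {
        "stop_conditions": dict(stops),
        "truncated": sum(1 for s in states if s.get("is_truncated")),
        "input_tokens": sum(u.get("input_tokens", 0) for u in usages),
        "output_tokens": sum(u.get("output_tokens", 0) for u in usages),
    }


rollouts = [
    {"stop_condition": "max_turns_reached", "is_truncated": True,
     "usage": {"input_tokens": 120, "output_tokens": 40}},
    {"stop_condition": None, "is_truncated": False, "usage": None},
]
summary = summarize_rollouts(rollouts)
```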

Special Behavior

Input Field Forwarding

Accessing input fields automatically looks in state["input"]:

state = State({
    "input": {
        "prompt": [{"role": "user", "content": "Hello"}],
        "answer": "42",
        "task": "qa",
        "example_id": 0
    }
})

# These are equivalent:
state["prompt"]          # Returns the prompt
state["input"]["prompt"] # Same result

# Setting also forwards:
state["answer"] = "43"
assert state["input"]["answer"] == "43"
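To make the forwarding behavior concrete, here is a minimal sketch of how a dict subclass could implement it. This is a toy under stated assumptions, not the library's actual code: the class name ForwardingDict is hypothetical, and it forwards only when an "input" key is present.

```python
class ForwardingDict(dict):
    """Toy dict that forwards selected keys to a nested "input" dict."""

    INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]

    def __getitem__(self, key):
        # Read input fields from the nested dict when it exists.
        if key in self.INPUT_FIELDS and "input" in self:
            return super().__getitem__("input")[key]
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        # Write input fields into the nested dict when it exists.
        if key in self.INPUT_FIELDS and "input" in self:
            super().__getitem__("input")[key] = value
            return
        super().__setitem__(key, value)


toy = ForwardingDict({"input": {"answer": "42", "task": "qa"}})
toy["answer"] = "43"  # forwarded into toy["input"]
toy["reward"] = 1.0   # not an input field, stored at the top level
```

In this sketch the forwarded key never appears at the top level, so a membership test such as "answer" in toy is False even though toy["answer"] succeeds.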

get() Method

def get(self, key: str, default: Any = None) -> Any

Safe access with default fallback:

reward = state.get("reward", 0.0)  # Returns 0.0 if not set
task = state.get("task")            # Returns None if not set
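A dict subclass that overrides __getitem__ generally needs its own get() as well, because the built-in dict.get does not call an overridden __getitem__. The sketch below shows one way to route get() through the forwarding lookup. The class name ForwardingDict is hypothetical and this is not presented as the library's implementation.

```python
class ForwardingDict(dict):
    INPUT_FIELDS = ["prompt", "answer", "task", "info", "example_id"]

    def __getitem__(self, key):
        # Read input fields from the nested "input" dict when it exists.
        if key in self.INPUT_FIELDS and "input" in self:
            return super().__getitem__("input")[key]
        return super().__getitem__(key)

    def get(self, key, default=None):
        # Route through self[key] so forwarded keys are found.
        try:
            return self[key]
        except KeyError:
            return default


toy = ForwardingDict({"input": {"task": "qa"}})
task = toy.get("task")           # found via the nested input dict
reward = toy.get("reward", 0.0)  # missing, so the default is returned
builtin = dict.get(toy, "task")  # the built-in lookup sees only top-level keys
```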

Example Usage

Basic Access

import verifiers as vf

# In a reward function
def reward_fn(state: vf.State) -> float:
    # Access input fields
    answer = state["answer"]
    task = state["task"]
    
    # Access runtime fields
    completion = state["completion"]
    trajectory = state["trajectory"]
    
    # Check completion
    if not state["is_completed"]:
        return 0.0
    
    # Compute reward (answer may not be a str, so convert before the substring test)
    return 1.0 if str(answer) in str(completion) else 0.0

Trajectory Inspection

def analyze_trajectory(state: vf.State) -> dict:
    """Extract statistics from trajectory."""
    trajectory = state["trajectory"]
    
    return {
        "num_turns": len(trajectory),
        # Counts completion tokens only; prompt tokens are not included
        "total_tokens": sum(
            len(step["tokens"]["completion_ids"])
            for step in trajectory
            if step.get("tokens")
        ),
        "tool_calls": sum(
            1 for step in trajectory
            if step["response"]["message"].get("tool_calls")
        ),
    }

Custom State Keys

Environments can add custom keys:

class CustomEnv(vf.MultiTurnEnv):
    async def setup_state(self, state: vf.State) -> vf.State:
        state = await super().setup_state(state)
        
        # Add custom fields
        state["custom_data"] = {"foo": "bar"}
        state["attempt_count"] = 0
        
        return state
    
    async def env_response(
        self,
        messages: vf.Messages,
        state: vf.State,
        **kwargs
    ) -> vf.Messages:
        # Access custom fields
        state["attempt_count"] += 1
        
        if state["attempt_count"] > 3:
            return [{"role": "user", "content": "Too many attempts!"}]
        
        return [{"role": "user", "content": "Try again"}]

Safe Error Access

def handle_errors(state: vf.State):
    error = state.get("error")
    
    if error:
        print(f"Error type: {type(error).__name__}")
        print(f"Error message: {error}")
        
        # Check error type
        if isinstance(error, vf.SandboxError):
            print("Sandbox operation failed")
        elif isinstance(error, vf.InfraError):
            print("Infrastructure error")
    else:
        print("No errors")

Serialization

State can be converted to RolloutOutput for serialization:

import json

# During environment.generate()
output: vf.RolloutOutput = serialize_state(state)

# RolloutOutput is JSON-serializable:
json.dumps(output)  # Works
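To illustrate why a conversion step is needed before JSON encoding, the sketch below copies a whitelist of fields out of a state-like dict and drops live objects such as the client. The function to_jsonable and the SERIALIZABLE_KEYS list are hypothetical; the fields a real RolloutOutput keeps may differ. A plain dict stands in for State.

```python
import json

# Hypothetical whitelist; the fields a real RolloutOutput keeps may differ.
SERIALIZABLE_KEYS = ["prompt", "completion", "answer", "task", "reward", "metrics"]


def to_jsonable(state: dict) -> dict:
    """Copy whitelisted keys that are present, leaving out everything else."""
    return {key: state[key] for key in SERIALIZABLE_KEYS if key in state}


state = {
    "prompt": [{"role": "user", "content": "2 + 2?"}],
    "completion": [{"role": "assistant", "content": "4"}],
    "answer": "4",
    "reward": 1.0,
    "client": object(),  # not JSON-serializable, so it must be dropped
}
payload = json.dumps(to_jsonable(state))
```

Passing the original dict straight to json.dumps would raise a TypeError because of the client object.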

Type Annotations

from verifiers.types import State, Messages

def my_reward(state: State) -> float:
    # Annotating the values documents the expected field types
    prompt: Messages = state["prompt"]
    reward: float | None = state.get("reward")
    return reward or 0.0

Common Patterns

Checking Completion

if state["is_completed"]:
    # Rollout finished successfully
    reward = state["reward"]
else:
    # Rollout was interrupted
    error = state.get("error")

Accessing Metadata

info = state.get("info", {})
custom_field = info.get("custom_field", "default")

Iterating Trajectory

for i, step in enumerate(state["trajectory"]):
    print(f"Turn {i}:")
    print(f"  Prompt: {step['prompt']}")
    print(f"  Completion: {step['completion']}")
    print(f"  Reward: {step.get('reward', 0.0)}")
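When aggregating over steps, note that step.get("reward", 0.0) still returns None if the key is present with a None value. The sketch below handles both cases. The helper name total_step_reward is hypothetical, and the step dicts are simplified stand-ins for TrajectoryStep.

```python
def total_step_reward(trajectory: list[dict]) -> float:
    """Sum per-step rewards, counting missing or None rewards as 0.0."""
    return sum(step.get("reward") or 0.0 for step in trajectory)


steps = [
    {"prompt": "q1", "completion": "a1", "reward": 0.5},
    {"prompt": "q2", "completion": "a2", "reward": None},  # key present, no value
    {"prompt": "q3", "completion": "a3"},                   # key missing
]
total = total_step_reward(steps)
```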
