RubricGroup

Overview

RubricGroup is a class for combining multiple Rubric instances into a single rubric. It aggregates reward functions and metrics from all child rubrics, allowing you to compose complex evaluation strategies from simpler components.

Class Signature

class RubricGroup(Rubric):
    def __init__(self, rubrics: list[Rubric], **kwargs)

Parameters

rubrics

list[Rubric]

required

List of rubric instances to aggregate. Must contain at least one rubric, otherwise a ValueError is raised.

**kwargs

Any

Additional keyword arguments passed to the parent Rubric class.

Attributes

rubrics

list[Rubric]

The list of child rubrics being aggregated.

Methods

add_reward_func

Adds a reward function to the first rubric in the group.

def add_reward_func(self, func: RewardFunc, weight: float = 1.0)

func

RewardFunc

required

The reward function to add. Should accept parameters like completion, answer, prompt, etc.

weight

float

default: 1.0

The weight to apply to this reward function when calculating the total reward.

This method adds the reward function to the first rubric only, not all rubrics. A warning is logged when this method is called.
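The delegation described above can be illustrated with a minimal, library-free sketch. `TinyRubric` and `TinyGroup` are hypothetical stand-ins written for this example, not the verifiers classes, and the warning text is an assumption; only the "first rubric only, with a logged warning" behavior comes from this page.

```python
import logging

logger = logging.getLogger("rubric_group_sketch")


class TinyRubric:
    """Hypothetical stand-in for Rubric: stores funcs and their weights."""

    def __init__(self):
        self.funcs = []
        self.weights = []

    def add_reward_func(self, func, weight=1.0):
        self.funcs.append(func)
        self.weights.append(weight)


class TinyGroup:
    """Hypothetical stand-in for RubricGroup."""

    def __init__(self, rubrics):
        if not rubrics:
            raise ValueError("at least one rubric is required")
        self.rubrics = rubrics

    def add_reward_func(self, func, weight=1.0):
        # Assumed warning text; the page only says that a warning is logged.
        logger.warning("Adding reward function to the first rubric in the group.")
        self.rubrics[0].add_reward_func(func, weight=weight)


def always_one(completion, **kwargs):
    return 1.0


first, second = TinyRubric(), TinyRubric()
group = TinyGroup([first, second])

group.add_reward_func(always_one, weight=0.5)  # lands on `first` only
print(len(first.funcs), len(second.funcs))     # 1 0

# To target a specific child, go through the `rubrics` attribute instead.
group.rubrics[1].add_reward_func(always_one, weight=2.0)
print(len(first.funcs), len(second.funcs))     # 1 1
```

Going through `rubrics[index]` avoids any ambiguity about which child receives the function.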

add_metric

Adds a metric (zero-weight reward function) to the first rubric in the group.

def add_metric(self, func: RewardFunc, weight: float = 0.0)

func

RewardFunc

required

The metric function to add. Should accept parameters like completion, answer, prompt, etc.

weight

float

default: 0.0

The weight for this metric (typically 0.0 for tracking purposes only).

This method adds the metric to the first rubric only. A warning is logged when this method is called.

add_class_object

Adds a class object (like a parser) to the first rubric in the group.

def add_class_object(self, name: str, obj: Any)

name

str

required

The name to use when referencing this object in reward functions.

obj

Any

required

The object to add (e.g., a parser, validator, or other helper class).

This method adds the object to the first rubric only. A warning is logged when this method is called.
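This page states only that `name` is how reward functions reference the object. One plausible mechanism, assumed here and not confirmed by this page, is that the object is passed as a keyword argument to any reward function that declares a parameter with that name. The sketch below is library-free; `call_with_matching_args`, `UpperParser`, and `parsed_match` are hypothetical names.

```python
import inspect


def call_with_matching_args(func, available):
    """Call func with only the entries of `available` that it can accept."""
    params = inspect.signature(func).parameters
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_var_kwargs:
        return func(**available)
    return func(**{k: v for k, v in available.items() if k in params})


class UpperParser:
    """Hypothetical helper object registered under the name 'parser'."""

    def parse(self, text):
        return text.strip().upper()


def parsed_match(completion, answer, parser):
    # `parser` is filled in by name from the registered class objects.
    return 1.0 if parser.parse(completion) == answer else 0.0


class_objects = {"parser": UpperParser()}
rollout_fields = {"completion": " yes ", "answer": "YES", "prompt": "Agree?"}

score = call_with_matching_args(parsed_match, {**rollout_fields, **class_objects})
print(score)  # 1.0
```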

score_rollout

Evaluates all reward functions for a single rollout.

async def score_rollout(self, state: State)

state

State

required

The rollout state to score. The state is modified in place: its reward and metrics are overwritten with the aggregated values.

Behavior:

Iterates through each child rubric
Calls score_rollout on each rubric
Aggregates rewards (summed across rubrics)
Aggregates metrics (summed across rubrics)
Updates the state with total reward and combined metrics

State Modifications:

state["reward"]: Set to the sum of all rubric rewards
state["metrics"]: Set to the combined metrics from all rubrics
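The behavior listed above can be sketched without the library. This is a simplified model under stated assumptions, not the actual implementation: `WeightedRubric` is a hypothetical stand-in whose `score_rollout` writes its own weighted reward and unweighted metrics into the state, and `score_rollout_sketch` plays the role of the group.

```python
import asyncio


class WeightedRubric:
    """Hypothetical child rubric: writes reward and metrics into the state."""

    def __init__(self, funcs, weights):
        self.funcs = funcs
        self.weights = weights

    async def score_rollout(self, state):
        metrics = {f.__name__: float(f(completion=state["completion"])) for f in self.funcs}
        state["metrics"] = metrics
        state["reward"] = sum(
            metrics[f.__name__] * w for f, w in zip(self.funcs, self.weights)
        )


async def score_rollout_sketch(rubrics, state):
    total_reward = 0.0
    combined_metrics = {}
    for rubric in rubrics:
        await rubric.score_rollout(state)  # child overwrites reward/metrics
        total_reward += state["reward"]
        for name, value in state["metrics"].items():
            combined_metrics[name] = combined_metrics.get(name, 0.0) + value
    # Write the aggregated values back, as described under State Modifications.
    state["reward"] = total_reward
    state["metrics"] = combined_metrics


def always_one(completion, **kwargs):
    return 1.0


def always_half(completion, **kwargs):
    return 0.5


state = {"completion": "4"}
rubrics = [WeightedRubric([always_one], [1.0]), WeightedRubric([always_half], [0.8])]
asyncio.run(score_rollout_sketch(rubrics, state))

print(state["reward"])   # 1.4
print(state["metrics"])  # {'always_one': 1.0, 'always_half': 0.5}
```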

score_group

Evaluates all reward functions for a group of rollouts.

async def score_group(self, states: list[State])

states

list[State]

required

List of rollout states to score. Each state is modified in place: its reward and metrics are overwritten with the aggregated values.

Behavior:

Iterates through each child rubric
Calls score_group on each rubric
Aggregates rewards across all rubrics for each state
Aggregates metrics across all rubrics for each state
Updates each state with total reward and combined metrics
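As a rough illustration of the per-state aggregation above, the following library-free sketch keeps one running total and one metrics dictionary per state. `SingleFuncRubric` and `score_group_sketch` are hypothetical names, and the assumption that each child writes its result into the state before the group reads it back is not confirmed by this page.

```python
import asyncio


class SingleFuncRubric:
    """Hypothetical child rubric with one reward function and one weight."""

    def __init__(self, func, weight):
        self.func, self.weight = func, weight

    async def score_group(self, states):
        for state in states:
            value = float(self.func(completion=state["completion"]))
            state["metrics"] = {self.func.__name__: value}
            state["reward"] = value * self.weight


async def score_group_sketch(rubrics, states):
    totals = [0.0] * len(states)
    metrics = [{} for _ in states]
    for rubric in rubrics:
        await rubric.score_group(states)
        for i, state in enumerate(states):
            totals[i] += state["reward"]
            for name, value in state["metrics"].items():
                metrics[i][name] = metrics[i].get(name, 0.0) + value
    # Each state receives its own total and its own combined metrics.
    for i, state in enumerate(states):
        state["reward"] = totals[i]
        state["metrics"] = metrics[i]


def exact_a(completion, **kwargs):
    return 1.0 if completion == "a" else 0.0


def char_count(completion, **kwargs):
    return len(completion)


states = [{"completion": "a"}, {"completion": "abc"}]
rubrics = [SingleFuncRubric(exact_a, 1.0), SingleFuncRubric(char_count, 0.25)]
asyncio.run(score_group_sketch(rubrics, states))

print([s["reward"] for s in states])  # [1.25, 0.75]
print(states[1]["metrics"])           # {'exact_a': 0.0, 'char_count': 3.0}
```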

Internal Methods

These methods aggregate information from child rubrics:

_get_reward_func_names(): Returns all reward function names from all rubrics
_get_reward_funcs(): Returns all reward functions from all rubrics
_get_reward_weights(): Returns all reward weights from all rubrics
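A minimal sketch of what "all ... from all rubrics" could look like, assuming each child exposes its functions and weights as lists named `funcs` and `weights` (an assumption; the attribute names are not given on this page). `NamedRubric` and the two helper functions are hypothetical.

```python
from itertools import chain


class NamedRubric:
    """Hypothetical stand-in exposing funcs and weights as plain lists."""

    def __init__(self, funcs, weights):
        self.funcs, self.weights = funcs, weights


def all_reward_func_names(rubrics):
    # Flatten in rubric order, then in each rubric's own function order.
    return [f.__name__ for f in chain.from_iterable(r.funcs for r in rubrics)]


def all_reward_weights(rubrics):
    return list(chain.from_iterable(r.weights for r in rubrics))


def correctness_check(completion, answer, **kwargs):
    return 1.0 if completion == answer else 0.0


def length_penalty(completion, **kwargs):
    return -0.1 if len(completion) > 1000 else 0.0


children = [
    NamedRubric([correctness_check], [1.0]),
    NamedRubric([length_penalty], [0.5]),
]
print(all_reward_func_names(children))  # ['correctness_check', 'length_penalty']
print(all_reward_weights(children))     # [1.0, 0.5]
```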

Usage Examples

Basic Usage

import verifiers as vf

def correctness_check(completion, answer, **kwargs):
    return 1.0 if completion == answer else 0.0

def length_penalty(completion, **kwargs):
    # Penalize very long completions
    return -0.1 if len(completion) > 1000 else 0.0

# Create separate rubrics
correctness_rubric = vf.Rubric(
    funcs=[correctness_check],
    weights=[1.0]
)

quality_rubric = vf.Rubric(
    funcs=[length_penalty],
    weights=[0.5]
)

# Combine into a group
rubric_group = vf.RubricGroup(
    rubrics=[correctness_rubric, quality_rubric]
)

# Use in an environment
env = vf.SingleTurnEnv(
    dataset=my_dataset,
    rubric=rubric_group
)

Combining Domain-Specific Rubrics

import json

import verifiers as vf

# Math-specific evaluation
math_rubric = vf.MathRubric(
    extract_answer=True,
    normalize=True
)

# Format validation
def check_json_format(completion, **kwargs):
    # Return 1.0 when the completion parses as JSON, 0.0 otherwise
    try:
        json.loads(completion)
        return 1.0
    except (json.JSONDecodeError, TypeError):
        return 0.0

format_rubric = vf.Rubric(
    funcs=[check_json_format],
    weights=[0.2]
)

# Combine both
combined_rubric = vf.RubricGroup(
    rubrics=[math_rubric, format_rubric]
)

Scoring a Single Rollout

import verifiers as vf
from verifiers.types import State

# Create rubrics
rubric1 = vf.Rubric(funcs=[lambda completion, **kw: 1.0], weights=[1.0])
rubric2 = vf.Rubric(funcs=[lambda completion, **kw: 0.5], weights=[0.8])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Create a state
state: State = {
    "prompt": [{"role": "user", "content": "What is 2+2?"}],
    "completion": [{"role": "assistant", "content": "4"}],
    "task": "math",
    "timing": {"generation_ms": 100, "total_ms": 100, "scoring_ms": 0},
    "trajectory": [],
    "responses": [],
    "turn": 0
}

# Score the rollout (call from inside an async function or an async-aware REPL)
await group.score_rollout(state)

print(state["reward"])   # 1.4 (1.0 * 1.0 + 0.5 * 0.8)
print(state["metrics"])  # Combined metrics from both rubrics

Scoring Multiple Rollouts

import verifiers as vf

def reward_func1(completion, **kwargs):
    return 1.0

def reward_func2(completion, **kwargs):
    return 0.5

rubric1 = vf.Rubric(funcs=[reward_func1], weights=[1.0])
rubric2 = vf.Rubric(funcs=[reward_func2], weights=[0.8])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Create multiple states (create_state is a placeholder for your own State factory)
states = [create_state() for _ in range(10)]

# Score all states together (call from inside an async function)
await group.score_group(states)

# Each state now has aggregated rewards
for state in states:
    print(f"Reward: {state['reward']}, Metrics: {state['metrics']}")

Adding Reward Functions

import verifiers as vf

rubric1 = vf.Rubric(funcs=[], weights=[])
rubric2 = vf.Rubric(funcs=[], weights=[])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Add a reward function (goes to first rubric)
def new_reward(completion, **kwargs):
    return 1.0 if "correct" in completion else 0.0

group.add_reward_func(new_reward, weight=0.5)
# This adds new_reward to rubric1 only

# Add a metric (goes to first rubric)
def token_count(completion, **kwargs):
    return len(completion.split())

group.add_metric(token_count, weight=0.0)
# This adds token_count to rubric1 only

Reward and Metric Aggregation

How Rewards are Aggregated

Each child rubric produces its own weighted reward, and the group reward is the sum of those per-rubric rewards. In pseudocode:

total_reward = sum(rubric.reward for rubric in rubrics)

How Metrics are Aggregated

Metrics with the same name are summed across rubrics:

# If rubric1 has {"accuracy": 0.8} and rubric2 has {"accuracy": 0.2}
# The combined metrics will be {"accuracy": 1.0}
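The same-name rule can be reproduced with plain dictionaries. `merge_metrics` is a hypothetical helper written for this illustration, not a library function, and the `length` metric is invented for the example.

```python
def merge_metrics(per_rubric_metrics):
    """Sum metric values key by key across a list of metric dicts."""
    combined = {}
    for metrics in per_rubric_metrics:
        for name, value in metrics.items():
            combined[name] = combined.get(name, 0.0) + value
    return combined


combined = merge_metrics([
    {"accuracy": 0.8},
    {"accuracy": 0.2, "length": 12.0},
])
print(combined)  # {'accuracy': 1.0, 'length': 12.0}
```

Because same-named metrics add up, giving reward functions distinct names across child rubrics keeps each rubric's value separately readable in the combined result.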

Example

import verifiers as vf

def func1(completion, **kwargs):
    return 2.0

def func2(completion, **kwargs):
    return 3.0

rubric1 = vf.Rubric(funcs=[func1], weights=[1.0])
rubric2 = vf.Rubric(funcs=[func2], weights=[0.5])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

state = create_state()  # placeholder for your own State factory
await group.score_rollout(state)

print(state["reward"])  # 3.5 (2.0 * 1.0 + 3.0 * 0.5)
print(state["metrics"]) # {"func1": 2.0, "func2": 3.0}

When to Use RubricGroup

Use RubricGroup when:

You want to combine multiple evaluation criteria from different rubrics
You have domain-specific rubrics that should be evaluated together
You need to compose complex evaluation strategies from simpler components
You want to weight different aspects of evaluation differently

Common scenarios:

Combining a MathRubric with custom format validation
Aggregating task-specific rubrics with general quality metrics
Creating modular evaluation pipelines
Reusing rubrics across different environments

Important Notes

Empty rubrics list: A ValueError is raised if you try to create a RubricGroup with an empty list of rubrics.

Methods that modify rubrics: add_reward_func, add_metric, and add_class_object only affect the first rubric in the group. If you need to add functions to specific rubrics, access them directly via rubric_group.rubrics[index].

Inheritance: RubricGroup inherits from Rubric, so it can be used anywhere a Rubric is expected (e.g., in environments).
