
Overview

RubricGroup combines multiple Rubric instances into a single rubric. It aggregates the reward functions and metrics of all child rubrics, so you can compose complex evaluation strategies from simpler components.

Class Signature

class RubricGroup(Rubric):
    def __init__(self, rubrics: list[Rubric], **kwargs)

Parameters

rubrics
list[Rubric]
required
List of rubric instances to aggregate. Must contain at least one rubric, otherwise a ValueError is raised.
**kwargs
Any
Additional keyword arguments passed to the parent Rubric class.
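
The empty-list check described above can be pictured with a minimal stand-in. SketchRubricGroup and try_build are hypothetical names used only for illustration; this is not the library's implementation, and the error message text is an assumption.

```python
class SketchRubricGroup:
    """Hypothetical stand-in that mirrors only the documented empty-list check."""

    def __init__(self, rubrics: list, **kwargs):
        if not rubrics:
            # Assumed message; the source only states that ValueError is raised.
            raise ValueError("RubricGroup requires at least one rubric")
        self.rubrics = rubrics


def try_build(rubrics: list) -> str:
    """Return 'ok' if construction succeeds, otherwise the error type name."""
    try:
        SketchRubricGroup(rubrics)
        return "ok"
    except ValueError as exc:
        return type(exc).__name__


print(try_build([object()]))  # ok
print(try_build([]))          # ValueError
```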

Attributes

rubrics
list[Rubric]
The list of child rubrics being aggregated.

Methods

add_reward_func

Adds a reward function to the first rubric in the group.
def add_reward_func(self, func: RewardFunc, weight: float = 1.0)
func
RewardFunc
required
The reward function to add. Should accept parameters like completion, answer, prompt, etc.
weight
float
default:"1.0"
The weight to apply to this reward function when calculating the total reward.
This method adds the reward function to the first rubric only, not all rubrics. A warning is logged when this method is called.

add_metric

Adds a metric (zero-weight reward function) to the first rubric in the group.
def add_metric(self, func: RewardFunc, weight: float = 0.0)
func
RewardFunc
required
The metric function to add. Should accept parameters like completion, answer, prompt, etc.
weight
float
default:"0.0"
The weight for this metric (typically 0.0 for tracking purposes only).
This method adds the metric to the first rubric only. A warning is logged when this method is called.

add_class_object

Adds a class object (like a parser) to the first rubric in the group.
def add_class_object(self, name: str, obj: Any)
name
str
required
The name to use when referencing this object in reward functions.
obj
Any
required
The object to add (e.g., a parser, validator, or other helper class).
This method adds the object to the first rubric only. A warning is logged when this method is called.
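
The three add_* methods above share one pattern: log a warning, then forward the call to the first child. A minimal sketch of that pattern follows, using the hypothetical stand-ins SketchRubric and SketchGroup rather than the real classes; the warning text is an assumption.

```python
import logging

logger = logging.getLogger("sketch")


class SketchRubric:
    """Hypothetical minimal rubric holding parallel lists of functions and weights."""

    def __init__(self):
        self.funcs = []
        self.weights = []

    def add_reward_func(self, func, weight: float = 1.0):
        self.funcs.append(func)
        self.weights.append(weight)


class SketchGroup:
    """Hypothetical group that forwards additions to its first child only."""

    def __init__(self, rubrics):
        self.rubrics = rubrics

    def add_reward_func(self, func, weight: float = 1.0):
        logger.warning("Adding reward function to the first rubric in the group.")
        self.rubrics[0].add_reward_func(func, weight)


first, second = SketchRubric(), SketchRubric()
group = SketchGroup([first, second])
group.add_reward_func(lambda completion, **kw: 1.0, weight=0.5)
print(len(first.funcs), len(second.funcs))  # 1 0

# To target another child, add to it directly instead of going through the group.
group.rubrics[1].add_reward_func(lambda completion, **kw: 0.0, weight=0.2)
print(second.weights)  # [0.2]
```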

score_rollout

Evaluates all reward functions for a single rollout.
async def score_rollout(self, state: State)
state
State
required
The rollout state to score. This is modified in-place with aggregated rewards and metrics.
Behavior:
  1. Iterates through each child rubric
  2. Calls score_rollout on each rubric
  3. Aggregates rewards (summed across rubrics)
  4. Aggregates metrics (values with the same name are summed across rubrics)
  5. Updates the state with total reward and combined metrics
State Modifications:
  • state["reward"]: Set to the sum of all rubric rewards
  • state["metrics"]: Set to the combined metrics from all rubrics
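
The steps above can be sketched without the library. ConstantRubric and score_rollout_sketch are hypothetical stand-ins, and the assumption that each child writes its own reward and metrics into the state before the group sums them is illustrative rather than a description of the actual internals.

```python
import asyncio


class ConstantRubric:
    """Hypothetical child rubric that writes a fixed weighted reward and one metric."""

    def __init__(self, name: str, value: float, weight: float):
        self.name, self.value, self.weight = name, value, weight

    async def score_rollout(self, state: dict) -> None:
        state["reward"] = self.value * self.weight
        state["metrics"] = {self.name: self.value}


async def score_rollout_sketch(rubrics, state: dict) -> None:
    """Score with each child in turn, then write the summed results back."""
    total_reward = 0.0
    combined: dict[str, float] = {}
    for rubric in rubrics:
        await rubric.score_rollout(state)
        total_reward += state["reward"]
        for key, value in state["metrics"].items():
            combined[key] = combined.get(key, 0.0) + value
    state["reward"] = total_reward
    state["metrics"] = combined


rollout_state: dict = {}
children = [ConstantRubric("func1", 2.0, 1.0), ConstantRubric("func2", 3.0, 0.5)]
asyncio.run(score_rollout_sketch(children, rollout_state))
print(rollout_state)  # {'reward': 3.5, 'metrics': {'func1': 2.0, 'func2': 3.0}}
```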

score_group

Evaluates all reward functions for a group of rollouts.
async def score_group(self, states: list[State])
states
list[State]
required
List of rollout states to score. Each state is modified in-place with aggregated rewards and metrics.
Behavior:
  1. Iterates through each child rubric
  2. Calls score_group on each rubric
  3. Aggregates rewards across all rubrics for each state
  4. Aggregates metrics across all rubrics for each state
  5. Updates each state with total reward and combined metrics
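
For a batch, the same idea applies per state: one running total is kept for each state while every child scores the whole list. The sketch below assumes that shape; ScaledLengthRubric and score_group_sketch are hypothetical names, and only rewards are tracked to keep it short.

```python
import asyncio


class ScaledLengthRubric:
    """Hypothetical child rubric: reward is weight * len(state["completion"])."""

    def __init__(self, weight: float):
        self.weight = weight

    async def score_group(self, states: list[dict]) -> None:
        for state in states:
            state["reward"] = self.weight * len(state["completion"])


async def score_group_sketch(rubrics, states: list[dict]) -> None:
    """Keep one running total per state while each child scores the whole batch."""
    totals = [0.0] * len(states)
    for rubric in rubrics:
        await rubric.score_group(states)
        for i, state in enumerate(states):
            totals[i] += state["reward"]
    for state, total in zip(states, totals):
        state["reward"] = total


batch = [{"completion": "ab"}, {"completion": "abcd"}]
asyncio.run(score_group_sketch([ScaledLengthRubric(1.0), ScaledLengthRubric(0.5)], batch))
print([s["reward"] for s in batch])  # [3.0, 6.0]
```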

Internal Methods

These methods aggregate information from child rubrics:
  • _get_reward_func_names(): Returns all reward function names from all rubrics
  • _get_reward_funcs(): Returns all reward functions from all rubrics
  • _get_reward_weights(): Returns all reward weights from all rubrics
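
One way to picture these helpers is as a flattening step over per-rubric lists. The sketch below assumes results are concatenated in rubric order, which the source does not state; flatten and the sample names and weights are hypothetical.

```python
from itertools import chain


def flatten(per_rubric: list[list]) -> list:
    """Concatenate per-rubric lists in rubric order (assumed ordering)."""
    return list(chain.from_iterable(per_rubric))


# Hypothetical function names and weights for two child rubrics.
names_by_rubric = [["correctness_check"], ["length_penalty", "check_json_format"]]
weights_by_rubric = [[1.0], [0.5, 0.2]]

all_names = flatten(names_by_rubric)
all_weights = flatten(weights_by_rubric)

print(all_names)    # ['correctness_check', 'length_penalty', 'check_json_format']
print(all_weights)  # [1.0, 0.5, 0.2]
```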

Usage Examples

Basic Usage

import verifiers as vf

def correctness_check(completion, answer, **kwargs):
    return 1.0 if completion == answer else 0.0

def length_penalty(completion, **kwargs):
    # Penalize very long completions
    return -0.1 if len(completion) > 1000 else 0.0

# Create separate rubrics
correctness_rubric = vf.Rubric(
    funcs=[correctness_check],
    weights=[1.0]
)

quality_rubric = vf.Rubric(
    funcs=[length_penalty],
    weights=[0.5]
)

# Combine into a group
rubric_group = vf.RubricGroup(
    rubrics=[correctness_rubric, quality_rubric]
)

# Use in an environment (my_dataset is a placeholder for your own dataset)
env = vf.SingleTurnEnv(
    dataset=my_dataset,
    rubric=rubric_group
)

Combining Domain-Specific Rubrics

import json

import verifiers as vf

# Math-specific evaluation
math_rubric = vf.MathRubric(
    extract_answer=True,
    normalize=True
)

# Format validation
def check_json_format(completion, **kwargs):
    try:
        json.loads(completion)
        return 1.0
    except (TypeError, ValueError):  # invalid JSON or non-string input
        return 0.0

format_rubric = vf.Rubric(
    funcs=[check_json_format],
    weights=[0.2]
)

# Combine both
combined_rubric = vf.RubricGroup(
    rubrics=[math_rubric, format_rubric]
)

Scoring a Single Rollout

import verifiers as vf
from verifiers.types import State

# Create rubrics
rubric1 = vf.Rubric(funcs=[lambda completion, **kw: 1.0], weights=[1.0])
rubric2 = vf.Rubric(funcs=[lambda completion, **kw: 0.5], weights=[0.8])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Create a state
state: State = {
    "prompt": [{"role": "user", "content": "What is 2+2?"}],
    "completion": [{"role": "assistant", "content": "4"}],
    "task": "math",
    "timing": {"generation_ms": 100, "total_ms": 100, "scoring_ms": 0},
    "trajectory": [],
    "responses": [],
    "turn": 0
}

# Score the rollout (await requires an async function or a REPL with top-level await)
await group.score_rollout(state)

print(state["reward"])   # 1.4 (1.0 * 1.0 + 0.5 * 0.8)
print(state["metrics"])  # Combined metrics from both rubrics

Scoring Multiple Rollouts

import verifiers as vf

def reward_func1(completion, **kwargs):
    return 1.0

def reward_func2(completion, **kwargs):
    return 0.5

rubric1 = vf.Rubric(funcs=[reward_func1], weights=[1.0])
rubric2 = vf.Rubric(funcs=[reward_func2], weights=[0.8])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Create multiple states (create_state is a placeholder for your own State builder)
states = [create_state() for _ in range(10)]

# Score all states together (call from within an async function)
await group.score_group(states)

# Each state now has aggregated rewards
for state in states:
    print(f"Reward: {state['reward']}, Metrics: {state['metrics']}")

Adding Reward Functions

import verifiers as vf

rubric1 = vf.Rubric(funcs=[], weights=[])
rubric2 = vf.Rubric(funcs=[], weights=[])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# Add a reward function (goes to first rubric)
def new_reward(completion, **kwargs):
    return 1.0 if "correct" in completion else 0.0

group.add_reward_func(new_reward, weight=0.5)
# This adds new_reward to rubric1 only

# Add a metric (goes to first rubric)
def token_count(completion, **kwargs):
    return len(completion.split())

group.add_metric(token_count, weight=0.0)
# This adds token_count to rubric1 only

Reward and Metric Aggregation

How Rewards are Aggregated

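Each rubric's reward is the weighted sum of its reward functions, and the group reward is the sum of those per-rubric rewards:
# Pseudocode: rubric_rewards holds the reward each child rubric produced
total_reward = sum(rubric_rewards)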

How Metrics are Aggregated

Metrics with the same name are summed across rubrics:
# If rubric1 has {"accuracy": 0.8} and rubric2 has {"accuracy": 0.2}
# The combined metrics will be {"accuracy": 1.0}
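
The name-collision rule can be written as a small merge function. merge_metrics is a hypothetical helper for illustration, not part of the library; it sums values that share a name and leaves other names untouched.

```python
def merge_metrics(metric_dicts: list[dict[str, float]]) -> dict[str, float]:
    """Sum values that share a name; keep distinct names side by side."""
    combined: dict[str, float] = {}
    for metrics in metric_dicts:
        for name, value in metrics.items():
            combined[name] = combined.get(name, 0.0) + value
    return combined


merged = merge_metrics([
    {"accuracy": 0.8, "token_count": 12.0},
    {"accuracy": 0.2},
])
print(merged)  # {'accuracy': 1.0, 'token_count': 12.0}
```

Because shared names produce a sum rather than an average, giving metric functions distinct names in each rubric keeps their values individually readable.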

Example

import verifiers as vf

def func1(completion, **kwargs):
    return 2.0

def func2(completion, **kwargs):
    return 3.0

rubric1 = vf.Rubric(funcs=[func1], weights=[1.0])
rubric2 = vf.Rubric(funcs=[func2], weights=[0.5])

group = vf.RubricGroup(rubrics=[rubric1, rubric2])

# create_state is a placeholder for your own State builder
state = create_state()
await group.score_rollout(state)  # call from within an async function

print(state["reward"])  # 3.5 (2.0 * 1.0 + 3.0 * 0.5)
print(state["metrics"]) # {"func1": 2.0, "func2": 3.0}

When to Use RubricGroup

Use RubricGroup when:
  • You want to combine multiple evaluation criteria from different rubrics
  • You have domain-specific rubrics that should be evaluated together
  • You need to compose complex evaluation strategies from simpler components
  • You want to weight different aspects of evaluation differently
Common scenarios:
  • Combining a MathRubric with custom format validation
  • Aggregating task-specific rubrics with general quality metrics
  • Creating modular evaluation pipelines
  • Reusing rubrics across different environments

Important Notes

Empty rubrics list: A ValueError is raised if you try to create a RubricGroup with an empty list of rubrics.
Methods that modify rubrics: add_reward_func, add_metric, and add_class_object only affect the first rubric in the group. If you need to add functions to specific rubrics, access them directly via rubric_group.rubrics[index].
Inheritance: RubricGroup inherits from Rubric, so it can be used anywhere a Rubric is expected (e.g., in environments).

