PassRate reports the proportion of evaluated examples that score at or above a
defined quality threshold. Use it to answer “what percentage of responses were
good enough?”
Constructor parameters
Minimum score required for an example to count as a pass.
An example passes when
score >= threshold.Name of the score key to evaluate. Must match a key emitted by an evaluator —
for example
"f1", "exact_match", or "mc_accuracy".Return value
compute() returns a dict[str, float] with the following key:
Fraction of examples where
score_field >= threshold. Range [0.0, 1.0].
Returns 0.0 when the row list is empty.Usage
When it is enabled
PassRate is included in every CLI run. The --threshold flag controls the
threshold value (default: 0.7) and --score-field controls which field is
read (default: f1).
