Scorers API
Scorers evaluate the quality of LLM outputs. All scorers implement the Scorer type.
Import
Types
Scorer
ScorerArgs
ScorerResult
Deterministic Scorers
exactMatch
Strict string equality:
- 1.0 if output === String(expected)
- 0.0 otherwise, with a reason explaining the mismatch
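The rule above can be sketched as follows. This is an illustration only, not the library's implementation: the `ScorerArgs` and `ScorerResult` shapes are assumptions inferred from this page, and `exactMatchSketch` is a hypothetical name.

```typescript
// Assumed shapes, inferred from this page; the library's real types may differ.
type ScorerResult = { score: number; reason?: string };
type ScorerArgs = { output: string; expected?: unknown };

// Hypothetical helper: strict equality after coercing `expected` to a string.
function exactMatchSketch({ output, expected }: ScorerArgs): ScorerResult {
  const target = String(expected);
  if (output === target) {
    return { score: 1.0 };
  }
  // On a mismatch, explain what was expected and what was received.
  return { score: 0.0, reason: `Expected "${target}" but got "${output}"` };
}
```

Because `expected` is coerced with `String(...)`, a numeric expected value such as `42` matches the output `"42"`, while any difference in case or whitespace scores 0.0.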
includes
Substring check:
- 1.0 if output.includes(String(expected))
- 0.0 otherwise
regex(pattern)
Regular expression test:
- 1.0 if pattern.test(output)
- 0.0 otherwise
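A minimal sketch of a pattern-based scorer factory, assuming the result shape described on this page. `regexSketch` is a hypothetical name, and the `lastIndex` reset is a precaution added here, not something this page says the library does.

```typescript
// Assumed result shape, inferred from this page.
type ScorerResult = { score: number; reason?: string };

// Hypothetical factory: returns a function that scores output against `pattern`.
function regexSketch(pattern: RegExp): (output: string) => ScorerResult {
  return (output) => {
    // RegExp.prototype.test is stateful for global/sticky patterns,
    // so reset lastIndex to get the same answer on every call.
    pattern.lastIndex = 0;
    return pattern.test(output) ? { score: 1.0 } : { score: 0.0 };
  };
}
```

Without the reset, a pattern created with the `g` flag can alternate between matching and not matching the same string across calls, which would make scores non-deterministic.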
levenshtein
Normalized edit distance similarity:
- 1.0 for exact match
- 0.0 for completely different strings
- Decimal between 0 and 1 for partial similarity
- Includes reason and metadata from autoevals
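To make "normalized edit distance similarity" concrete, here is a self-contained sketch. It assumes the distance is normalized by the length of the longer string; the scorer on this page delegates to autoevals, whose exact normalization may differ. Both function names are hypothetical.

```typescript
// Classic dynamic-programming edit distance (insert, delete, substitute = cost 1).
function editDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 - distance / length of the longer string.
function levenshteinSimilarity(output: string, expected: string): number {
  const maxLen = Math.max(output.length, expected.length);
  if (maxLen === 0) return 1; // two empty strings are identical
  return 1 - editDistance(output, expected) / maxLen;
}
```

Under this normalization, "kitten" versus "sitting" has an edit distance of 3 over a maximum length of 7, giving a similarity of 4/7 (about 0.57).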
jsonMatch
Deep structural equality for JSON:
- 1.0 if JSON structures are deeply equal
- 0.0 if structures differ or JSON is invalid
- Object key order doesn’t matter
- Array order matters
- expected can be a string or an object
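The comparison rules above can be sketched as below. This is not the library's implementation: `deepEqual` and `jsonMatchSketch` are hypothetical names, and parsing a string `expected` as JSON is an assumption based on the note that it can be a string or an object.

```typescript
// Structural equality: object key order is ignored, array order is significant.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((item, i) => deepEqual(item, b[i]));
  }
  const objA = a as Record<string, unknown>;
  const objB = b as Record<string, unknown>;
  const keysA = Object.keys(objA);
  return (
    keysA.length === Object.keys(objB).length &&
    keysA.every(
      (k) => Object.prototype.hasOwnProperty.call(objB, k) && deepEqual(objA[k], objB[k])
    )
  );
}

// Returns 1.0 on deep equality, 0.0 on any difference or parse failure.
function jsonMatchSketch(output: string, expected: unknown): number {
  try {
    const actual: unknown = JSON.parse(output);
    // Assumption: a string `expected` is itself JSON text and gets parsed first.
    const target: unknown = typeof expected === "string" ? JSON.parse(expected) : expected;
    return deepEqual(actual, target) ? 1.0 : 0.0;
  } catch {
    return 0.0; // invalid JSON on either side
  }
}
```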
LLM-Based Scorers
factuality(config)
Checks if output is factually correct:
- 1.0 if output is factually consistent with expected
- 0.0 if output contradicts expected
- Decimal between 0 and 1 for partial correctness
- reason field contains LLM’s explanation
- metadata includes additional details from autoevals
Requires:
- OPENAI_API_KEY environment variable
- OpenAI-compatible API endpoint
Combinators
all(...scorers)
Weakest-link (minimum score):
- score: Minimum score of all scorers
- reason: Concatenated reasons from all scorers (semicolon-separated)
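A sketch of the weakest-link combination, assuming the scorer shapes described on this page. `combineMin` and `allSketch` are hypothetical names, and the exact separator (`"; "`) is an assumption; the page only says the reasons are semicolon-separated.

```typescript
// Assumed shapes, inferred from this page.
type ScorerResult = { score: number; reason?: string };
type ScorerArgs = { output: string; expected?: unknown };
type Scorer = (args: ScorerArgs) => Promise<ScorerResult>;

// Pure reducer: minimum score, with all non-empty reasons joined together.
function combineMin(results: ScorerResult[]): ScorerResult {
  if (results.length === 0) throw new Error("need at least one result");
  const reasons = results
    .map((r) => r.reason)
    .filter((r): r is string => typeof r === "string" && r.length > 0);
  return {
    score: Math.min(...results.map((r) => r.score)),
    reason: reasons.join("; "),
  };
}

// Combinator: run every scorer on the same args, then reduce.
function allSketch(...scorers: Scorer[]): Scorer {
  return async (args) => combineMin(await Promise.all(scorers.map((s) => s(args))));
}
```

Taking the minimum means a single failing scorer drags the combined score down, which suits checks that must all hold at once.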
any(...scorers)
Best-of (maximum score):
- score: Maximum score of all scorers
- reason: Reason from the highest-scoring scorer
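The best-of rule reduces to picking the single highest-scoring result, as in this sketch. `pickBest` is a hypothetical name, and keeping the first result on a tie is an assumption; the page does not say how ties are resolved.

```typescript
// Assumed result shape, inferred from this page.
type ScorerResult = { score: number; reason?: string };

// Returns the highest-scoring result unchanged, so its reason is carried along.
function pickBest(results: ScorerResult[]): ScorerResult {
  if (results.length === 0) throw new Error("need at least one result");
  // Strict '>' keeps the earliest result when scores tie.
  return results.reduce((best, r) => (r.score > best.score ? r : best));
}
```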
weighted(config)
Weighted average:
- score: Weighted average, sum(score * weight) / sum(weight)
- reason: Lists all scorer scores and weights
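The formula above can be worked through with a small sketch. The `WeightedEntry` shape and the reason format are hypothetical; the actual `config` argument accepted by `weighted` is not shown on this page.

```typescript
// Hypothetical input: one already-computed score per scorer, with its weight.
type WeightedEntry = { name: string; score: number; weight: number };

// Computes sum(score * weight) / sum(weight) and a reason listing each entry.
function weightedAverage(entries: WeightedEntry[]): { score: number; reason: string } {
  const totalWeight = entries.reduce((sum, e) => sum + e.weight, 0);
  if (totalWeight <= 0) throw new Error("weights must sum to a positive number");
  const weightedSum = entries.reduce((sum, e) => sum + e.score * e.weight, 0);
  const reason = entries
    .map((e) => `${e.name}: ${e.score} (weight ${e.weight})`)
    .join(", ");
  return { score: weightedSum / totalWeight, reason };
}
```

For example, a score of 1.0 with weight 3 and a score of 0.5 with weight 1 combine to (1.0 * 3 + 0.5 * 1) / (3 + 1) = 0.875.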
Custom Scorers
Create custom scorers by implementing the Scorer type:
- Return a Promise<ScorerResult>
- Score must be between 0 and 1
- Optionally include reason and metadata
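As an illustration of these three requirements, here is a sketch of a custom scorer that penalizes overly long outputs. The type definitions are assumptions inferred from this page (the library's real `Scorer`, `ScorerArgs`, and `ScorerResult` may carry more fields), and `scoreWordCount` and `maxWordsScorer` are hypothetical names.

```typescript
// Assumed shapes, inferred from this page.
type ScorerResult = { score: number; reason?: string; metadata?: Record<string, unknown> };
type ScorerArgs = { output: string; expected?: unknown };
type Scorer = (args: ScorerArgs) => Promise<ScorerResult>;

// Full marks up to `maxWords` (assumed positive), then the score decays as maxWords / words.
function scoreWordCount(output: string, maxWords: number): ScorerResult {
  const words = output.trim().split(/\s+/).filter((w) => w.length > 0).length;
  const raw = words <= maxWords ? 1 : maxWords / words;
  // Clamp defensively so the score always stays within [0, 1].
  const score = Math.min(1, Math.max(0, raw));
  return {
    score,
    reason: `${words} words (limit ${maxWords})`,
    metadata: { words, maxWords },
  };
}

// Wrap the synchronous logic so the scorer returns a Promise<ScorerResult>.
function maxWordsScorer(maxWords: number): Scorer {
  return async ({ output }) => scoreWordCount(output, maxWords);
}
```

Keeping the scoring logic in a plain synchronous function makes it easy to unit test, while the thin async wrapper satisfies the Promise-returning contract.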
Examples
Using Multiple Scorers
>= threshold.
Combining Scorers
Custom Scorer
Next Steps
Evaluate API
Learn about the evaluate() function
Scorers Guide
Scorer usage guide