Endpoint

Method: GET
Path: /v1/evaluator/{evaluatorId}/stats

Authentication

This endpoint requires API key authentication. Include your API key in the request headers:
Authorization: Bearer YOUR_API_KEY

Path Parameters

evaluatorId (string, required): The unique identifier of the evaluator

Response

data (object): Evaluator statistics
    averageScore (number): Average score across all evaluations
    totalUses (number): Total number of times this evaluator has been used
    recentTrend (string): Recent trend direction: "up", "down", or "stable"
    scoreDistribution (array): Distribution of scores across ranges
        range (string): Score range (e.g., "0-20", "20-40")
        count (number): Number of evaluations in this range
    timeSeriesData (array): Historical data over time
        date (string): Date in ISO format
        value (number): Average score for that date
error (string | null): Error message if the request failed, null otherwise
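
The response fields above can be mirrored in client code as typed records. The following is a minimal Python sketch, not an official client: the class names (ScoreBucket, TimePoint, EvaluatorStats) and the parse_stats helper are hypothetical, and it assumes only that the decoded body has the data and error fields described in this section.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ScoreBucket:
    range: str    # score range label, e.g. "0-2"
    count: int    # number of evaluations in this range


@dataclass
class TimePoint:
    date: str     # date string in ISO format
    value: float  # average score for that date


@dataclass
class EvaluatorStats:
    average_score: float
    total_uses: int
    recent_trend: str  # "up", "down", or "stable"
    score_distribution: List[ScoreBucket]
    time_series: List[TimePoint]


def parse_stats(body: Dict[str, Any]) -> EvaluatorStats:
    """Turn a decoded JSON body into EvaluatorStats (hypothetical helper).

    Raises RuntimeError when the documented `error` field is not null.
    """
    error = body.get("error")
    if error is not None:
        raise RuntimeError(f"stats request failed: {error}")

    data = body["data"]
    return EvaluatorStats(
        average_score=data["averageScore"],
        total_uses=data["totalUses"],
        recent_trend=data["recentTrend"],
        score_distribution=[
            ScoreBucket(range=b["range"], count=b["count"])
            for b in data["scoreDistribution"]
        ],
        time_series=[
            TimePoint(date=p["date"], value=p["value"])
            for p in data["timeSeriesData"]
        ],
    )
```

Checking error before touching data keeps the failure path explicit; this section does not specify what data contains when a request fails, so the sketch does not read it in that case.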

Example Request

curl -X GET https://api.helicone.ai/v1/evaluator/eval_abc123/stats \
  -H "Authorization: Bearer YOUR_API_KEY"
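
The same request can be issued from Python using only the standard library. This is a rough sketch under stated assumptions: build_stats_request and fetch_stats are hypothetical helper names, and the 10-second timeout is an arbitrary choice. The base URL, path, and Authorization header are taken from the curl example above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.helicone.ai"  # host used in the curl example


def build_stats_request(evaluator_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for an evaluator's stats."""
    # Percent-encode the ID so unusual characters cannot alter the path.
    path_id = urllib.parse.quote(evaluator_id, safe="")
    url = f"{BASE_URL}/v1/evaluator/{path_id}/stats"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_stats(evaluator_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body (performs network I/O)."""
    request = build_stats_request(evaluator_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_stats("eval_abc123", "YOUR_API_KEY") would perform the same call as the curl command; urlopen raises urllib.error.HTTPError for non-2xx responses, which callers may want to catch.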

Example Response

{
  "data": {
    "averageScore": 7.5,
    "totalUses": 1250,
    "recentTrend": "up",
    "scoreDistribution": [
      {
        "range": "0-2",
        "count": 50
      },
      {
        "range": "2-4",
        "count": 100
      },
      {
        "range": "4-6",
        "count": 200
      },
      {
        "range": "6-8",
        "count": 450
      },
      {
        "range": "8-10",
        "count": 450
      }
    ],
    "timeSeriesData": [
      {
        "date": "2024-01-01",
        "value": 7.2
      },
      {
        "date": "2024-01-02",
        "value": 7.3
      },
      {
        "date": "2024-01-03",
        "value": 7.4
      },
      {
        "date": "2024-01-04",
        "value": 7.6
      },
      {
        "date": "2024-01-05",
        "value": 7.5
      }
    ]
  },
  "error": null
}
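
As an illustration of working with this payload, the sketch below derives each bucket's share of evaluations and the net change across the time series. The function names bucket_shares and series_change are hypothetical. In the example response the bucket counts happen to sum to totalUses (1250); this section does not say whether the API guarantees that, so the code normalizes by the bucket total instead.

```python
from typing import Dict, List


def bucket_shares(distribution: List[dict]) -> Dict[str, float]:
    """Map each score range to its fraction of all counted evaluations."""
    total = sum(bucket["count"] for bucket in distribution)
    if total == 0:
        return {}
    return {bucket["range"]: bucket["count"] / total for bucket in distribution}


def series_change(series: List[dict]) -> float:
    """Return last value minus first value, ordered by date.

    Assumes dates share one ISO format (as in the example), so sorting
    the strings also sorts them chronologically.
    """
    ordered = sorted(series, key=lambda point: point["date"])
    if len(ordered) < 2:
        return 0.0
    return ordered[-1]["value"] - ordered[0]["value"]


# Data copied from the example response above.
example_distribution = [
    {"range": "0-2", "count": 50},
    {"range": "2-4", "count": 100},
    {"range": "4-6", "count": 200},
    {"range": "6-8", "count": 450},
    {"range": "8-10", "count": 450},
]
example_series = [
    {"date": "2024-01-01", "value": 7.2},
    {"date": "2024-01-02", "value": 7.3},
    {"date": "2024-01-03", "value": 7.4},
    {"date": "2024-01-04", "value": 7.6},
    {"date": "2024-01-05", "value": 7.5},
]

shares = bucket_shares(example_distribution)
change = series_change(example_series)
```

For the example data, the two highest buckets each hold 36% of evaluations, and the average score rises by roughly 0.3 between the first and last dates.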

Use Cases

  • Monitor evaluator performance over time
  • Identify trends in evaluation scores
  • Understand score distribution patterns
  • Track usage frequency of different evaluators
  • Optimize evaluation criteria based on historical data