
Endpoint

POST /v1/evaluator/

Authentication

This endpoint requires API key authentication. Include your API key in the request headers:
Authorization: Bearer YOUR_API_KEY

Request Body

name
string
required
Name for the evaluator
scoring_type
string
required
Type of scoring. Options:
  • LLM-BOOLEAN: Binary true/false evaluation
  • LLM-CHOICE: Multiple choice selection
  • LLM-RANGE: Numeric range scoring
  • PYTHON: Custom Python code evaluation
  • LASTMILE: LastMile framework integration
llm_template
object
LLM configuration (required for LLM-based scoring types)
  model
  string
  Model to use for evaluation (e.g., "gpt-4")
  prompt
  string
  Evaluation prompt template
  min
  number
  Minimum score (for LLM-RANGE)
  max
  number
  Maximum score (for LLM-RANGE)
  choices
  array
  Available choices (for LLM-CHOICE)
code_template
object
Python code configuration (required for PYTHON scoring type)
  code
  string
  Python code that evaluates the response
last_mile_config
object
LastMile configuration (required for LASTMILE scoring type)

Response

data
object
Created evaluator details
  id
  string
  Unique identifier for the evaluator
  name
  string
  Evaluator name
  scoring_type
  string
  Type of scoring
  llm_template
  object
  LLM configuration
  code_template
  object
  Code configuration
  last_mile_config
  object
  LastMile configuration
  organization_id
  string
  Organization ID
  created_at
  string
  Creation timestamp
  updated_at
  string
  Last update timestamp
error
string | null
Error message if the request failed, null otherwise

Example Request - LLM Boolean Evaluator

curl -X POST https://api.helicone.ai/v1/evaluator/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Hallucination Detector",
    "scoring_type": "LLM-BOOLEAN",
    "llm_template": {
      "model": "gpt-4",
      "prompt": "Analyze if the response contains hallucinations or false information based on the context provided. Return true if hallucinations are detected, false otherwise."
    }
  }'

Example Request - LLM Range Evaluator

curl -X POST https://api.helicone.ai/v1/evaluator/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Relevance Score",
    "scoring_type": "LLM-RANGE",
    "llm_template": {
      "model": "gpt-4",
      "prompt": "Rate how relevant this response is to the user question on a scale of 1-10, where 1 is completely irrelevant and 10 is perfectly relevant.",
      "min": 1,
      "max": 10
    }
  }'
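
Example Request - LLM Choice Evaluator

An LLM-CHOICE evaluator follows the same pattern, supplying the documented choices array inside llm_template. The payload below is a sketch: the evaluator name, prompt, and choice values are illustrative, and the exact format expected for choices should be confirmed against the request body reference above.

curl -X POST https://api.helicone.ai/v1/evaluator/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Tone Classifier",
    "scoring_type": "LLM-CHOICE",
    "llm_template": {
      "model": "gpt-4",
      "prompt": "Classify the tone of this response as one of the available choices.",
      "choices": ["professional", "casual", "hostile"]
    }
  }'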

Example Request - Python Evaluator

curl -X POST https://api.helicone.ai/v1/evaluator/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Response Length Check",
    "scoring_type": "PYTHON",
    "code_template": {
      "code": "def evaluate(input, output):\n    word_count = len(output.split())\n    return 1.0 if word_count >= 50 else 0.0"
    }
  }'
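
Because the Python code is embedded as a JSON string, a syntax error only surfaces when the evaluator runs. A quick local sanity check (a hypothetical snippet, not a Helicone API call) lets you verify the function before creating the evaluator:

```python
# Same logic as the "Response Length Check" code_template above:
# pass (1.0) when the output is at least 50 words, otherwise fail (0.0).
def evaluate(input, output):
    word_count = len(output.split())
    return 1.0 if word_count >= 50 else 0.0

# Exercise both branches locally before embedding the code in the request.
print(evaluate("", "Too short."))             # 0.0
print(evaluate("", " ".join(["word"] * 60)))  # 1.0
```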

Example Response

{
  "data": {
    "id": "eval_abc123",
    "name": "Hallucination Detector",
    "scoring_type": "LLM-BOOLEAN",
    "llm_template": {
      "model": "gpt-4",
      "prompt": "Analyze if the response contains hallucinations..."
    },
    "code_template": null,
    "last_mile_config": null,
    "organization_id": "org_xyz789",
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:30:00Z"
  },
  "error": null
}

Notes

  • LLM evaluators use AI models to score responses automatically
  • Python evaluators give you full control with custom code
  • LastMile evaluators integrate with the LastMile evaluation framework
  • You can test evaluators before creating them using the /v1/evaluator/llm/test or /v1/evaluator/python/test endpoints
