This guide will walk you through setting up a Verifiers workspace, creating your first environment, and running evaluations.

Prerequisites

Before starting, ensure you have Python 3.10 or later installed.

Set Up Your Workspace

1. Install uv and the Prime CLI

First, install uv (Python package manager) and the prime CLI tool:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the Prime CLI
uv tool install prime

# Log in to Prime Intellect
prime login

2. Initialize Your Workspace

Set up a new workspace for developing environments:
# Navigate to your development directory
cd ~/dev/my-lab

# Set up the workspace
prime lab setup
This command:
  • Creates a Python project (if needed)
  • Installs verifiers
  • Creates the recommended workspace structure
  • Downloads starter configuration files
Your workspace structure will look like:
configs/
├── endpoints.toml      # API endpoint configuration
├── rl/                 # Training configs
├── eval/               # Evaluation configs
└── gepa/               # Prompt optimization configs
environments/
└── AGENTS.md           # AI agent documentation

3. Add to Existing Project (Optional)

If you already have a Python project, add Verifiers without reinitializing:
uv add verifiers && prime lab setup --skip-install

Create Your First Environment

1. Initialize Environment Template

Create a new environment from the template:
prime env init my-env
This creates a new module in ./environments/my_env/ with:
environments/my_env/
├── my_env.py           # Main implementation
├── pyproject.toml      # Dependencies and metadata
└── README.md           # Documentation

2. Implement Your Environment

Edit environments/my_env/my_env.py with your environment logic:
import verifiers as vf
from datasets import Dataset

def load_environment(dataset_name: str = 'gsm8k') -> vf.Environment:
    # Load or create your dataset
    dataset = vf.load_example_dataset(dataset_name)
    
    # Define reward function
    async def correct_answer(completion, answer) -> float:
        # Exact match against the target answer (strip trailing whitespace)
        completion_ans = completion[-1]['content'].strip()
        return 1.0 if completion_ans == answer else 0.0
    
    # Create rubric with reward functions
    rubric = vf.Rubric(funcs=[correct_answer])
    
    # Return environment instance
    env = vf.SingleTurnEnv(dataset=dataset, rubric=rubric)
    return env
The load_environment function is the entry point for your environment. It must return an Environment instance and can accept custom arguments.
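
Reward functions receive the completion in chat format. Below is a stdlib-only sketch of the data shape a function like correct_answer sees; the exact fields Verifiers passes may vary, and the real hook is async (shown sync here for brevity):

```python
# Hypothetical chat-format completion, newest message last (illustrative only)
completion = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]
answer = "4"

def correct_answer(completion, answer):
    # The final message holds the model's response
    return 1.0 if completion[-1]["content"].strip() == answer else 0.0

print(correct_answer(completion, answer))  # 1.0
```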

3. Install Your Environment

Install the environment module into your project:
prime env install my-env
This makes your environment importable and runnable.

Run Your First Evaluation

1. Run Local Evaluation

Evaluate your environment with any OpenAI-compatible model:
prime eval run my-env -m gpt-5-nano
This will:
  • Load your environment
  • Run rollouts with the specified model
  • Calculate rewards and metrics
  • Save results locally
By default, evaluations use Prime Inference. Configure custom API endpoints in ./configs/endpoints.toml.

2. View Results

Open the terminal UI to explore your evaluation results:
prime eval tui
Navigate through:
  • Rollout samples
  • Reward distributions
  • Model completions
  • Metrics and statistics

Working with Existing Environments

1. Install from Environments Hub

Install any environment from the community hub:
prime env install primeintellect/math-python

2. Run Hub Environment

Evaluate it directly:
prime eval run primeintellect/math-python -m gpt-4.1-mini

Environment Types

Verifiers supports multiple environment patterns:

SingleTurnEnv

Simple Q&A tasks with a single model response
vf.SingleTurnEnv(dataset=dataset, rubric=rubric)

ToolEnv

Environments with stateless Python function tools
vf.ToolEnv(
    dataset=dataset,
    tools=[calculator, search],
    rubric=rubric
)

StatefulToolEnv

Tools requiring per-rollout state (sandboxes, sessions)
vf.StatefulToolEnv(
    dataset=dataset,
    tools=[file_ops],
    rubric=rubric
)

MultiTurnEnv

Custom multi-turn interactions, games, agents
class GameEnv(vf.MultiTurnEnv):
    async def env_response(self, messages, state):
        # Custom game logic
        pass
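
The env_response hook above is where the environment replies between model turns. A stdlib-only sketch of the alternating loop this implies (the names and the (reply, done) return shape are illustrative, not the Verifiers internals):

```python
# Stdlib sketch of the loop a multi-turn environment drives: the model and
# the environment exchange messages until the environment ends the episode.
def run_episode(env_response, model_reply, max_turns=5):
    messages = []
    state = {"turns": 0}
    for _ in range(max_turns):
        messages.append({"role": "assistant", "content": model_reply(messages)})
        reply, done = env_response(messages, state)
        messages.append({"role": "user", "content": reply})
        if done:
            break
    return messages

# Toy environment: the episode ends once the model says "stop"
def env_response(messages, state):
    state["turns"] += 1
    return "ok", messages[-1]["content"] == "stop"

episode = run_episode(env_response, lambda msgs: "stop")
print(len(episode))  # 2
```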

Building Complex Environments

Adding Tools

Create tool-enabled environments for agent tasks:
import verifiers as vf

def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    try:
        result = eval(expression)  # Demo only: avoid eval() on untrusted input
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def load_environment():
    dataset = vf.load_example_dataset('gsm8k')
    
    async def correct_answer(completion, answer) -> float:
        response = completion[-1]['content']
        return 1.0 if answer in response else 0.0
    
    rubric = vf.Rubric(funcs=[correct_answer])
    
    return vf.ToolEnv(
        dataset=dataset,
        tools=[calculator],  # Pass Python functions
        rubric=rubric
    )
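
Since eval on model-supplied strings is risky outside a sandbox, a restricted arithmetic evaluator can stand in for the demo calculator. This is a stdlib-only sketch (not part of Verifiers) that walks the expression AST and allows only arithmetic nodes:

```python
import ast
import operator

# Map supported AST operator types to their implementations
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def _eval_node(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")

def safe_calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression without eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("2 * (3 + 4)"))  # 14
```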

Using Sandboxes

For code execution tasks, use sandboxed environments:
import verifiers as vf

def load_environment():
    dataset = vf.load_example_dataset('codegen')
    
    async def code_passes_tests(state, info) -> float:
        # Check if code execution succeeded
        return 1.0 if state.get('tests_passed') else 0.0
    
    rubric = vf.Rubric(funcs=[code_passes_tests])
    
    return vf.PythonEnv(
        dataset=dataset,
        rubric=rubric
    )
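
The reward function above reads from rollout state rather than from the completion. A stdlib sketch of that pattern with a mocked state dict (the tests_passed and stdout keys are assumptions for illustration; real state contents depend on the environment):

```python
import asyncio

# Mocked rollout states (in practice, populated by the environment)
passing_state = {"tests_passed": True, "stdout": "3 passed"}
failing_state = {"tests_passed": False, "stdout": "1 failed"}

async def code_passes_tests(state, info):
    # Read the outcome the environment recorded during execution
    return 1.0 if state.get("tests_passed") else 0.0

print(asyncio.run(code_passes_tests(passing_state, {})))  # 1.0
print(asyncio.run(code_passes_tests(failing_state, {})))  # 0.0
```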

Publishing Your Environment

1. Test Locally

Ensure your environment works correctly:
prime eval run my-env -n 10 -m gpt-4.1-mini

2. Push to Hub

Publish to the Environments Hub:
prime env push --path ./environments/my_env
Your environment is now available to the community!

Next Steps

Environments Guide

Learn about datasets, rubrics, and custom protocols

Evaluation Guide

Deep dive into evaluation configurations

Training Guide

Train models with reinforcement learning

API Reference

Explore the complete API documentation

Common Patterns

Loading a custom dataset from the Hugging Face Hub:
from datasets import load_dataset

def load_environment():
    dataset = load_dataset('gsm8k', 'main', split='train')
    # ... configure environment

Combining multiple reward functions with a system prompt:
import verifiers as vf

async def accuracy(completion, answer) -> float:
    return 1.0 if answer in completion[-1]['content'] else 0.0

async def length_penalty(completion) -> float:
    length = len(completion[-1]['content'])
    return -0.01 * length  # Penalize longer responses

def load_environment():
    dataset = vf.load_example_dataset('gsm8k')
    rubric = vf.Rubric(funcs=[accuracy, length_penalty])
    return vf.SingleTurnEnv(
        dataset=dataset,
        system_prompt="You are a helpful math tutor. Show your work.",
        rubric=rubric
    )
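
A stdlib sketch of what each reward function above returns for one completion (how vf.Rubric weights and aggregates the scores is up to the library; this only shows the per-function values):

```python
import asyncio

async def accuracy(completion, answer) -> float:
    return 1.0 if answer in completion[-1]["content"] else 0.0

async def length_penalty(completion) -> float:
    return -0.01 * len(completion[-1]["content"])

# One illustrative completion in chat format
completion = [{"role": "assistant", "content": "The answer is 42."}]

async def score():
    return await accuracy(completion, "42"), await length_penalty(completion)

acc, pen = asyncio.run(score())
print(acc)            # 1.0
print(round(pen, 2))  # -0.17 (17 characters at -0.01 each)
```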

Validating required API keys before setup:
import verifiers as vf

def load_environment():
    # Validate required API keys
    vf.ensure_keys(['OPENAI_API_KEY', 'ANTHROPIC_API_KEY'])
    
    # ... rest of environment setup
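
A stdlib sketch of the same fail-fast idea, for readers who want to see the behavior (vf.ensure_keys is the helper shown above; the key name here is hypothetical):

```python
import os

def ensure_keys(names):
    # Raise early if any required environment variable is unset or empty
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")

os.environ["DEMO_API_KEY"] = "sk-demo"  # hypothetical key, for illustration
ensure_keys(["DEMO_API_KEY"])           # passes silently
```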

Troubleshooting

  • Environment not found: Make sure you ran prime env install <env-name> and that the environment has a valid load_environment function.
  • Endpoint or authentication errors: Configure your endpoints in ./configs/endpoints.toml. See the evaluation guide for details.
  • Import errors: Ensure all dependencies are listed in your environment’s pyproject.toml and installed.
