SingleTurnEnv
Environment for single-turn tasks where the model generates a single response to a prompt.
Overview
SingleTurnEnv is the simplest environment type, designed for Q&A tasks where:
- The model receives a prompt and generates a single response
- No multi-turn interaction is needed
- Scoring is based on the single response
This class inherits from MultiTurnEnv with max_turns=1 and disables env_response().
Inheritance
Environment
└── MultiTurnEnv
└── SingleTurnEnv
Constructor
Accepts all parameters from Environment constructor. The max_turns parameter is automatically set to 1.
Key Differences from MultiTurnEnv
- max_turns is fixed at 1 (cannot be overridden)
- env_response() raises NotImplementedError if called
- Rollout completes after a single model response
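The relationship between the two classes can be sketched in plain Python. This is an illustrative model only, not the library's actual source; the class names and defaults here are assumptions:

```python
# Illustrative sketch (not the library's actual code): a single-turn
# environment expressed as a multi-turn environment pinned to one turn.
class MultiTurnEnvSketch:
    def __init__(self, max_turns=10):
        self.max_turns = max_turns

class SingleTurnEnvSketch(MultiTurnEnvSketch):
    def __init__(self, **kwargs):
        kwargs.pop("max_turns", None)  # a caller-supplied value is discarded
        super().__init__(max_turns=1)

    def env_response(self, messages, state, **kwargs):
        # Never reached during a single-turn rollout.
        raise NotImplementedError("SingleTurnEnv has no environment turn")
```

Passing `max_turns` to the constructor has no effect: the subclass always pins it to 1.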
Methods
env_response
async def env_response(
    messages: vf.Messages,
    state: vf.State,
    **kwargs
) -> vf.Messages
This method raises NotImplementedError for SingleTurnEnv. It is not used in single-turn scenarios.
render_completion
async def render_completion(state: vf.State)
Renders the final completion from the trajectory. Inherited from MultiTurnEnv.
Example Usage
import verifiers as vf
from datasets import load_dataset
# Question-answering environment
class QAEnv(vf.SingleTurnEnv):
    pass  # No custom logic needed for basic Q&A

def load_environment():
    # Load GSM8K dataset
    dataset = load_dataset("gsm8k", "main", split="train")

    # Define reward function
    def correct_answer(answer: str, completion: vf.Messages) -> float:
        """Check if the answer appears in the completion."""
        completion_text = str(completion)
        return 1.0 if answer in completion_text else 0.0

    return QAEnv(
        dataset=dataset,
        rubric=vf.Rubric(correct_answer),
        system_prompt="Solve the following math problem."
    )
# Usage (inside an async context, since evaluate is a coroutine)
env = load_environment()
results = await env.evaluate(
    client=vf.ClientConfig(
        provider="openai",
        api_key="sk-..."
    ),
    model="gpt-4",
    num_examples=100,
    rollouts_per_example=1
)
print(f"Accuracy: {results['metadata']['avg_reward']:.2%}")
With a Custom Dataset
import verifiers as vf
def load_environment():
    # Create custom dataset
    dataset = vf.Environment.make_dataset(
        [
            {
                "question": "What is 2+2?",
                "answer": "4",
            },
            {
                "question": "What is the capital of France?",
                "answer": "Paris",
            },
        ]
    )

    def exact_match(answer: str, completion: vf.Messages) -> float:
        """Case-insensitive containment check for the expected answer."""
        completion_text = str(completion).strip()
        return 1.0 if answer.lower() in completion_text.lower() else 0.0

    return vf.SingleTurnEnv(
        dataset=dataset,
        rubric=vf.Rubric(exact_match),
        system_prompt="Answer the following question concisely."
    )
With Multiple Metrics
import verifiers as vf
from datasets import load_dataset
def load_environment():
    dataset = load_dataset("gsm8k", "main", split="train")

    def correctness(answer: str, completion: vf.Messages) -> float:
        """Primary reward: answer correctness."""
        return 1.0 if answer in str(completion) else 0.0

    def response_length(completion: vf.Messages) -> int:
        """Metric: response length in characters."""
        return len(str(completion))

    def has_explanation(completion: vf.Messages) -> float:
        """Metric: whether response includes explanation keywords."""
        text = str(completion).lower()
        keywords = ["because", "therefore", "since", "so"]
        return 1.0 if any(kw in text for kw in keywords) else 0.0

    return vf.SingleTurnEnv(
        dataset=dataset,
        rubric=vf.Rubric(
            correctness,  # Primary reward
            response_length,
            has_explanation,
        ),
        system_prompt="Solve the problem and explain your reasoning."
    )
With Few-Shot Examples
import verifiers as vf
from datasets import load_dataset
def load_environment():
    dataset = load_dataset("gsm8k", "main", split="train")

    # Few-shot examples
    few_shot = [
        {"role": "user", "content": "What is 10 + 5?"},
        {"role": "assistant", "content": "The answer is 15."},
        {"role": "user", "content": "What is 20 - 3?"},
        {"role": "assistant", "content": "The answer is 17."},
    ]

    def correct_answer(answer: str, completion: vf.Messages) -> float:
        return 1.0 if answer in str(completion) else 0.0

    return vf.SingleTurnEnv(
        dataset=dataset,
        rubric=vf.Rubric(correct_answer),
        system_prompt="Solve the following math problem.",
        few_shot=few_shot
    )
Common Patterns
Basic Q&A
Use SingleTurnEnv directly with a dataset and rubric:
env = vf.SingleTurnEnv(
    dataset=dataset,
    rubric=vf.Rubric(reward_fn),
    system_prompt="..."
)
Custom Scoring
Define reward functions that accept fields from your dataset:
def reward_fn(answer: str, completion: vf.Messages) -> float:
    # Access dataset fields by parameter name
    return compute_score(answer, completion)

rubric = vf.Rubric(reward_fn)
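The by-name matching can be pictured with a small sketch. This is an assumed model of the dispatch, not the library's actual implementation; `call_reward_fn` is a hypothetical helper:

```python
import inspect

def call_reward_fn(fn, completion, example):
    # Hedged sketch of by-name injection: each dataset field whose name
    # matches a parameter of the reward function is passed as a keyword
    # argument, alongside the model's completion.
    params = inspect.signature(fn).parameters
    kwargs = {name: example[name] for name in params if name in example}
    return fn(completion=completion, **kwargs)

def reward_fn(answer, completion):
    return 1.0 if answer in completion else 0.0

# The "answer" field of the example is routed to the "answer" parameter;
# "question" is ignored because reward_fn has no matching parameter.
score = call_reward_fn(reward_fn, "The answer is 4.",
                       {"question": "2+2?", "answer": "4"})
```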
Multiple Rollouts per Example
Generate multiple responses per question for majority voting:
results = await env.evaluate(
    client=client,
    model="gpt-4",
    num_examples=100,
    rollouts_per_example=5,  # Generate 5 responses per question
    sampling_args={"temperature": 0.7}
)
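Once the rollouts are collected, aggregating them by majority vote is straightforward. A minimal sketch (the extraction of an answer string from each rollout is assumed to have happened already):

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer across rollouts.

    Ties break toward the earliest-seen answer, which is how
    Counter.most_common orders equal counts.
    """
    return Counter(answers).most_common(1)[0][0]

# Five rollouts for one question, three of which agree.
majority_vote(["4", "4", "5", "4", "3"])  # -> "4"
```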
When to Use
Use SingleTurnEnv for:
- Question answering
- Text classification
- Summarization
- Translation
- Any task requiring a single model response
For tasks requiring multiple turns of interaction, use MultiTurnEnv instead.
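For contrast, a multi-turn environment is one where env_response() actually does work: it produces the environment's reply between model turns. A minimal sketch of such a hook, with illustrative names rather than the library's API:

```python
def grading_env_response(messages, state):
    # Sketch of an env_response-style hook: inspect the model's latest
    # message and return the environment's next user turn.
    last = messages[-1]["content"]
    if state.get("answer", "") in last:
        return [{"role": "user", "content": "Correct."}]
    return [{"role": "user", "content": "Not quite; try again."}]
```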
See Also