TextArenaEnv
Wrapper environment for TextArena text-based games.
Overview
TextArenaEnv wraps TextArena game environments for multi-turn interaction with language models. It automatically converts TextArena games into Verifiers datasets and handles game state management.
Key features:
- Automatic dataset generation from TextArena word lists
- Efficient memory sharing for parallel rollouts
- Custom feedback transformation via
feedback_fn
- Built-in XML parser for structured responses
Installation
Install with TextArena support:
Or when developing in the verifiers repo:
See the TextArena integration guide for setup details.
Inheritance
Environment
└── MultiTurnEnv
└── TextArenaEnv
Constructor
TextArenaEnv(
game: str = "Wordle-v0",
num_train_examples: int = 1000,
num_eval_examples: int = 0,
system_prompt: str | None = None,
parser: vf.XMLParser | None = None,
rubric: vf.Rubric | None = None,
feedback_fn: Callable[[str], str] = lambda x: x,
seed: int = 0,
**kwargs
)
Parameters
TextArena game ID (e.g., “Wordle-v0”, “TwentyQuestions-v0”).
Number of training examples to generate.
Number of evaluation examples to generate.
System prompt for the model. If None, uses default from MultiTurnEnv.
parser
vf.XMLParser | None
default:"None"
Parser for model responses. If None, uses XMLParser(fields=["think", "guess"], answer_field="guess").
rubric
vf.Rubric | None
default:"None"
Rubric for scoring. If None, uses default rubric.
feedback_fn
Callable[[str], str]
default:"lambda x: x"
Function to transform TextArena observations before presenting to the model. Use this to filter or reformat game state messages.
Random seed for dataset generation.
Key Methods
setup_state
async def setup_state(
state: vf.State,
**kwargs
) -> vf.State
Initialize TextArena environment for this rollout.
Implementation details:
- Creates a deep copy of the TextArena environment with memory sharing optimization
- Sets the secret word from
state["answer"]
- Stores environment in
state["ta_env"]
env_response
async def env_response(
messages: vf.Messages,
state: vf.State,
**kwargs
) -> vf.Messages
Process model’s guess and return game feedback.
Flow:
- Parse guess from latest message using
parser.parse_answer()
- Step the TextArena environment with the guess
- If game is done, set
state["final_env_response"] and return terminal message
- Otherwise, get observation and apply
feedback_fn before returning
cleanup_ta_env
@vf.cleanup
async def cleanup_ta_env(state: vf.State)
Clean up TextArena environment after rollout by removing ta_env from state.
Example Usage
Basic Wordle Environment
import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv
def load_environment():
return TextArenaEnv(
game="Wordle-v0",
num_train_examples=1000,
num_eval_examples=100,
seed=0,
)
Custom Feedback Function
import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv
def simplify_feedback(observation: str) -> str:
"""Transform TextArena observation to simpler format."""
# TextArena often returns full game history,
# but we only want the latest feedback
if "Correct!" in observation:
return "Correct!"
elif "letters in correct positions" in observation:
# Extract just the color hints
lines = observation.split("\n")
return lines[-1] # Return just the hint line
return observation
def load_environment():
return TextArenaEnv(
game="Wordle-v0",
feedback_fn=simplify_feedback,
num_train_examples=1000,
system_prompt="You are playing Wordle. Guess 5-letter words. Use <think> tags for reasoning and <guess> for your answer.",
)
Custom Parser and Rubric
import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv
def load_environment():
parser = vf.XMLParser(
fields=["reasoning", "guess"],
answer_field="guess"
)
def success_reward(state: vf.State) -> float:
"""Reward winning in fewer guesses."""
num_guesses = len(state["trajectory"])
if state.get("final_env_response"):
# Won - reward fewer guesses
return 1.0 / num_guesses
return 0.0
rubric = vf.Rubric(success_reward)
return TextArenaEnv(
game="Wordle-v0",
parser=parser,
rubric=rubric,
num_train_examples=1000,
)
TwentyQuestions Game
import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv
def load_environment():
parser = vf.XMLParser(
fields=["think", "question"],
answer_field="question"
)
return TextArenaEnv(
game="TwentyQuestions-v0",
parser=parser,
num_train_examples=500,
num_eval_examples=100,
system_prompt="You are playing 20 Questions. Ask yes/no questions to guess the secret word. Use <think> for strategy and <question> for your question.",
)
Memory Optimization
TextArenaEnv uses build_shared_memo() to share immutable data across environment copies:
- Problem: TextArena’s EnglishDictionary holds ~430K strings in 4 sets (~38MB). Without sharing, each rollout copies this data (~120ms + 38MB per copy).
- Solution: The shared memo dict allows deep copying to share these immutable objects, saving significant memory and time during parallel rollouts.
This optimization is automatic and requires no user configuration.
Available Games
Some popular TextArena games:
Wordle-v0 - Classic word guessing game
TwentyQuestions-v0 - 20 questions game
Poker-v0 - Poker game
- Many more available in the TextArena repository
Check the TextArena documentation for the full list of available games.
See Also