TextArenaEnv

Wrapper environment for TextArena text-based games.

Overview

TextArenaEnv wraps TextArena game environments for multi-turn interaction with language models. It automatically converts TextArena games into Verifiers datasets and handles game state management. Key features:

Automatic dataset generation from TextArena word lists
Efficient memory sharing for parallel rollouts
Custom feedback transformation via feedback_fn
Built-in XML parser for structured responses

Installation

Install with TextArena support:

uv add 'verifiers[ta]'

Or when developing in the verifiers repo:

uv sync --extra ta

See the TextArena integration guide for setup details.

Inheritance

Environment
└── MultiTurnEnv
    └── TextArenaEnv

Constructor

TextArenaEnv(
    game: str = "Wordle-v0",
    num_train_examples: int = 1000,
    num_eval_examples: int = 0,
    system_prompt: str | None = None,
    parser: vf.XMLParser | None = None,
    rubric: vf.Rubric | None = None,
    feedback_fn: Callable[[str], str] = lambda x: x,
    seed: int = 0,
    **kwargs
)

Parameters

game

str

default:"Wordle-v0"

TextArena game ID (e.g., “Wordle-v0”, “TwentyQuestions-v0”).

num_train_examples

int

default:"1000"

Number of training examples to generate.

num_eval_examples

int

default:"0"

Number of evaluation examples to generate.

system_prompt

str | None

default:"None"

System prompt for the model. If None, uses default from MultiTurnEnv.

parser

vf.XMLParser | None

default:"None"

Parser for model responses. If None, uses XMLParser(fields=["think", "guess"], answer_field="guess").

rubric

vf.Rubric | None

default:"None"

Rubric for scoring. If None, uses default rubric.

feedback_fn

Callable[[str], str]

default:"lambda x: x"

Function to transform TextArena observations before presenting to the model. Use this to filter or reformat game state messages.

seed

int

default:"0"

Random seed for dataset generation.

**kwargs

Any

Additional arguments passed to MultiTurnEnv.

Key Methods

setup_state

async def setup_state(
    state: vf.State,
    **kwargs
) -> vf.State

Initialize TextArena environment for this rollout. Implementation details:

Creates a deep copy of the TextArena environment with memory sharing optimization
Sets the secret word from state["answer"]
Stores environment in state["ta_env"]

env_response

async def env_response(
    messages: vf.Messages,
    state: vf.State,
    **kwargs
) -> vf.Messages

Process model’s guess and return game feedback. Flow:

Parse guess from latest message using parser.parse_answer()
Step the TextArena environment with the guess
If game is done, set state["final_env_response"] and return terminal message
Otherwise, get observation and apply feedback_fn before returning

cleanup_ta_env

@vf.cleanup
async def cleanup_ta_env(state: vf.State)

Clean up TextArena environment after rollout by removing ta_env from state.

Example Usage

Basic Wordle Environment

import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv

def load_environment():
    return TextArenaEnv(
        game="Wordle-v0",
        num_train_examples=1000,
        num_eval_examples=100,
        seed=0,
    )

Custom Feedback Function

import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv

def simplify_feedback(observation: str) -> str:
    """Transform TextArena observation to simpler format."""
    # TextArena often returns full game history,
    # but we only want the latest feedback
    if "Correct!" in observation:
        return "Correct!"
    elif "letters in correct positions" in observation:
        # Extract just the color hints
        lines = observation.split("\n")
        return lines[-1]  # Return just the hint line
    return observation

def load_environment():
    return TextArenaEnv(
        game="Wordle-v0",
        feedback_fn=simplify_feedback,
        num_train_examples=1000,
        system_prompt="You are playing Wordle. Guess 5-letter words. Use <think> tags for reasoning and <guess> for your answer.",
    )

Custom Parser and Rubric

import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv

def load_environment():
    parser = vf.XMLParser(
        fields=["reasoning", "guess"],
        answer_field="guess"
    )
    
    def success_reward(state: vf.State) -> float:
        """Reward winning in fewer guesses."""
        num_guesses = len(state["trajectory"])
        if state.get("final_env_response"):
            # Won - reward fewer guesses
            return 1.0 / num_guesses
        return 0.0
    
    rubric = vf.Rubric(success_reward)
    
    return TextArenaEnv(
        game="Wordle-v0",
        parser=parser,
        rubric=rubric,
        num_train_examples=1000,
    )

TwentyQuestions Game

import verifiers as vf
from verifiers.envs.integrations.textarena_env import TextArenaEnv

def load_environment():
    parser = vf.XMLParser(
        fields=["think", "question"],
        answer_field="question"
    )
    
    return TextArenaEnv(
        game="TwentyQuestions-v0",
        parser=parser,
        num_train_examples=500,
        num_eval_examples=100,
        system_prompt="You are playing 20 Questions. Ask yes/no questions to guess the secret word. Use <think> for strategy and <question> for your question.",
    )

Memory Optimization

TextArenaEnv uses build_shared_memo() to share immutable data across environment copies:

Problem: TextArena’s EnglishDictionary holds ~430K strings in 4 sets (~38MB). Without sharing, each rollout copies this data (~120ms + 38MB per copy).
Solution: The shared memo dict allows deep copying to share these immutable objects, saving significant memory and time during parallel rollouts.

This optimization is automatic and requires no user configuration.

Available Games

Some popular TextArena games:

Wordle-v0 - Classic word guessing game
TwentyQuestions-v0 - 20 questions game
Poker-v0 - Poker game
Many more available in the TextArena repository

Check the TextArena documentation for the full list of available games.

Environment Classes

Rubrics & Parsers

Clients

Integration Classes

Experimental

Data Types

Utilities

TextArenaEnv

TextArenaEnv

Overview

Installation

Inheritance

Constructor

Parameters

Key Methods

setup_state

env_response

cleanup_ta_env

Example Usage

Basic Wordle Environment

Custom Feedback Function

Custom Parser and Rubric

TwentyQuestions Game

Memory Optimization

Available Games

See Also

Build docs developers (and LLMs) love

Environment Classes

Rubrics & Parsers

Clients

Integration Classes

Experimental

Data Types

Utilities

​TextArenaEnv

​Overview

​Installation

​Inheritance

​Constructor

​Parameters

​Key Methods

​setup_state

​env_response

​cleanup_ta_env

​Example Usage

​Basic Wordle Environment

​Custom Feedback Function

​Custom Parser and Rubric

​TwentyQuestions Game

​Memory Optimization

​Available Games

​See Also

Build docs developers (and LLMs) love

TextArenaEnv

Overview

Installation

Inheritance

Constructor

Parameters

Key Methods

setup_state

env_response

cleanup_ta_env

Example Usage

Basic Wordle Environment

Custom Feedback Function

Custom Parser and Rubric

TwentyQuestions Game

Memory Optimization

Available Games

See Also