The `OpenEnvEnv` integration lets you run OpenEnv environments (supporting both gym-style and MCP contracts) inside Prime Sandboxes using prebuilt container images.
OpenEnv environments use seed-based episode generation and support both step-based (gym) and tool-based (MCP) interaction protocols.
## Features

- **Gym and MCP contracts** - Support for both step-based and tool-based environments
- **Prime Sandboxes integration** - Runs in isolated containers
- **Seed-based episodes** - Deterministic episode generation via seeds
- **Automatic image building** - Build and register Docker images with `vf-build`
- **Custom prompt rendering** - Convert observations to chat messages
## Installation

Install with OpenEnv support:

```bash
uv add 'verifiers[openenv]'
```

This installs:

- `openenv-core` - OpenEnv client library
- `prime-sandboxes` - Prime Sandboxes SDK
## Quick Start

### Create OpenEnv project

Create an OpenEnv environment project with a Dockerfile:

```text
environments/my_openenv/
├── my_openenv.py        # Verifiers environment wrapper
├── pyproject.toml       # Package metadata
├── proj/                # OpenEnv project
│   ├── Dockerfile       # Environment container
│   ├── server.py        # OpenEnv server implementation
│   └── requirements.txt
└── README.md
```
### Build and register image

Build the Docker image and register it with Prime:

```bash
uv run vf-build my-openenv
```

This creates `.build.json` with image metadata:

```json
{
  "image": "registry.prime.box/my-openenv:latest",
  "port": 8000,
  "start_command": "python server.py",
  "contract": "gym"
}
```
### Create Verifiers wrapper

Wrap your OpenEnv environment:

```python
import verifiers as vf
from verifiers.envs.integrations.openenv_env import OpenEnvEnv

def render_prompt(observation, **kwargs):
    """Convert observation to chat messages."""
    return [{"role": "user", "content": str(observation)}]

def load_environment():
    return OpenEnvEnv(
        openenv_project="./proj",
        prompt_renderer=render_prompt,
        num_train_examples=100,
        num_eval_examples=50,
        seed=0,
    )
```
### Evaluate

Run an evaluation:

```bash
prime eval run my-openenv -m openai/gpt-4.1-mini -n 5
```
## Gym Contract

For gym-style environments with `reset()` and `step(action)` methods:
### Server Implementation

```python
# proj/server.py
from fastapi import FastAPI
import gymnasium as gym

app = FastAPI()
env = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "string"},
        "action": {
            "type": "object",
            "properties": {
                "action": {"type": "string"}
            },
            "required": ["action"]
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global env
    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=seed)
    return {
        "observation": str(obs),
        "info": info,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    obs, reward, terminated, truncated, info = env.step(
        int(action["action"])
    )
    return {
        "observation": str(obs),
        "reward": reward,
        "done": terminated or truncated,
        "info": info
    }
```
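The contract above reduces to a reset-then-step loop driven by the client. A minimal in-process sketch of that interaction pattern (stub environment, no HTTP; all names here are illustrative, not part of the OpenEnv API):

```python
class StubGymEnv:
    """Tiny stand-in for the HTTP server: reset() once, then step() until done."""

    def __init__(self):
        self.t = 0

    def reset(self, seed=0):
        self.t = 0
        return {"observation": f"state(seed={seed})", "done": False}

    def step(self, action):
        self.t += 1
        # This toy episode ends after three steps
        return {"observation": f"state(t={self.t})", "reward": 1.0, "done": self.t >= 3}


env = StubGymEnv()
resp = env.reset(seed=42)
total_reward = 0.0
while not resp["done"]:
    resp = env.step({"action": "0"})
    total_reward += resp["reward"]
print(total_reward)  # 3.0
```

The real client performs the same loop over `/reset` and `/step`, which is why the `done` flag must be set reliably: it is the only signal that ends the episode.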
### Dockerfile

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY server.py .
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build Manifest
{
"image": "registry.prime.box/my-gym-env:latest",
"port": 8000,
"start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
"contract": "gym"
}
### Prompt Renderer for Gym

```python
def render_gym_prompt(observation, context, **kwargs):
    """Render gym observations as messages.

    Args:
        observation: The observation from reset/step
        context: 'reset' or 'step'
    """
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Environment reset. Observation: {observation}"
        }]
    else:
        return [{
            "role": "user",
            "content": f"Observation: {observation}"
        }]
```
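If observations are structured (e.g. dicts rather than strings), the same pattern extends naturally. A minimal sketch, assuming JSON-serializable observations (the function name is illustrative):

```python
import json

def render_dict_prompt(observation, context, **kwargs):
    """Pretty-print dict observations as JSON; fall back to str() otherwise."""
    if isinstance(observation, dict):
        body = json.dumps(observation, indent=2)
    else:
        body = str(observation)
    prefix = "Environment reset. Observation:" if context == "reset" else "Observation:"
    return [{"role": "user", "content": f"{prefix}\n{body}"}]

msgs = render_dict_prompt({"position": 0.1, "velocity": -0.2}, "reset")
```

Formatting structured state legibly tends to matter more than the exact wording: the model only sees what the renderer puts in the message content.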
## MCP Contract

For tool-based environments using the Model Context Protocol:
### Server Implementation

```python
# proj/server.py
from fastapi import FastAPI

app = FastAPI()
game_state = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "object"},
        "action": {
            "type": "object",
            "properties": {
                "type": {"enum": ["call_tool"]},
                "tool_name": {"type": "string"},
                "arguments": {"type": "object"}
            }
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global game_state
    game_state = initialize_game(seed)  # game-specific setup (not shown)
    return {
        "observation": game_state,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    tool_name = action["tool_name"]
    args = action["arguments"]
    if tool_name == "make_move":
        result = make_move(game_state, args["move"])  # game-specific logic (not shown)
        reward = result["reward"]
        done = result["game_over"]
        return {
            "observation": result,
            "reward": reward,
            "done": done
        }
    return {
        "observation": {"error": "Unknown tool"},
        "reward": 0.0,
        "done": False
    }

@app.post("/mcp")
def mcp(request: dict):
    """MCP protocol endpoint."""
    method = request["data"]["method"]
    if method == "tools/list":
        return {
            "data": {
                "jsonrpc": "2.0",
                "id": request["data"]["id"],
                "result": {
                    "tools": [
                        {
                            "name": "make_move",
                            "description": "Make a move in the game",
                            "inputSchema": {
                                "type": "object",
                                "properties": {
                                    "move": {"type": "string"}
                                },
                                "required": ["move"]
                            }
                        }
                    ]
                }
            }
        }
```
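The `/step` handler above branches on `tool_name`; as more tools accumulate, a lookup table keeps the dispatch flat. A standalone sketch of that pattern (the helper and tool names are illustrative):

```python
def dispatch_tool(action: dict, tools: dict) -> dict:
    """Route an MCP-style call_tool action to a registered handler."""
    handler = tools.get(action.get("tool_name"))
    if handler is None:
        return {"observation": {"error": "Unknown tool"}, "reward": 0.0, "done": False}
    return handler(**action.get("arguments", {}))


TOOLS = {
    # Each handler receives the action's arguments as keyword arguments
    "make_move": lambda move: {"observation": {"last_move": move}, "reward": 1.0, "done": False},
}

result = dispatch_tool(
    {"type": "call_tool", "tool_name": "make_move", "arguments": {"move": "e2e4"}},
    TOOLS,
)
```

New tools then only need an entry in the table plus a matching schema in the `tools/list` response.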
### Build Manifest for MCP

```json
{
  "image": "registry.prime.box/my-mcp-env:latest",
  "port": 8000,
  "start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
  "contract": "mcp"
}
```
### Prompt Renderer for MCP

MCP environments automatically inject tool definitions, so the prompt renderer only needs to format observations:

```python
def render_mcp_prompt(observation, context, **kwargs):
    """Render MCP observations as messages."""
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Game started. State: {observation}"
        }]
    else:
        # Tool responses are handled automatically
        return []
```
## OpenEnvEnv Configuration

```python
OpenEnvEnv(
    openenv_project="./proj",            # Path to OpenEnv project
    prompt_renderer=render_prompt,       # Required: observation -> messages
    num_train_examples=100,              # Training seeds
    num_eval_examples=50,                # Evaluation seeds
    seed=0,                              # Base seed
    max_turns=10,                        # Max turns per episode
    rubric=custom_rubric,                # Optional custom rubric
    startup_timeout_seconds=30,          # Container startup timeout
    startup_poll_interval_seconds=1.0,   # Health check interval
)
```
## Custom Rubrics

By default, `OpenEnvEnv` uses `OpenEnvEpisodicSumRubric`, which sums per-step rewards. Create custom rubrics for different scoring:

```python
import verifiers as vf

async def final_score(state: vf.State) -> float:
    """Use only the final step reward."""
    trajectory = state.get("trajectory", [])
    if not trajectory:
        return 0.0
    return float(trajectory[-1].get("reward", 0.0))

custom_rubric = vf.Rubric(funcs=[final_score])

env = OpenEnvEnv(
    openenv_project="./proj",
    prompt_renderer=render_prompt,
    rubric=custom_rubric,
)
```
## Examples

See the example OpenEnv integrations in the Verifiers repository.
## Building Images

The `vf-build` command:

- Builds the Docker image from `proj/Dockerfile`
- Tags it with your environment name
- Pushes it to the Prime registry
- Creates `.build.json` with metadata

```bash
# Build from default path
uv run vf-build my-openenv

# Build from custom path
uv run vf-build my-openenv -p /path/to/environments

# Rebuild without cache
uv run vf-build my-openenv --no-cache
```
## Best Practices

The `prompt_renderer` is required and must return non-empty chat messages. OpenEnv makes no assumptions about how observations should be presented to the model.

- **Health checks** - Implement a `/health` endpoint that returns 200 when ready
- **Schema validation** - Return a proper JSON schema from `/schema`
- **Error handling** - Return errors in the `observation` field, not as HTTP errors
- **Deterministic resets** - Use the `seed` parameter for reproducible episodes
- **Action validation** - Validate that actions match your schema before processing
## Troubleshooting

### Sandbox Not Starting

Check logs:

```bash
uv run vf-build my-openenv --verbose
```

Common issues:

- Missing `/health` endpoint
- Server not binding to `0.0.0.0`
- Port mismatch in `.build.json`
### Contract Mismatch

If you see "contract mismatch" errors:

- Verify that `.build.json` has the correct `contract` field
- For gym: the action schema should not have a `tool_name` field
- For MCP: the action schema should have `tool_name` and `arguments` fields
### Prompt Renderer Errors

The prompt renderer must:

- Accept `observation` as its first argument
- Return a list of message dicts with `role` and `content`
- Return a non-empty list
- Handle both `reset` and `step` contexts
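These requirements can be checked mechanically before wiring a renderer into `OpenEnvEnv`; a small helper sketch (the name is illustrative, not part of the library):

```python
def check_renderer_output(messages) -> bool:
    """True if messages is a non-empty list of dicts that have role and content."""
    return (
        isinstance(messages, list)
        and len(messages) > 0
        and all(
            isinstance(m, dict) and "role" in m and "content" in m
            for m in messages
        )
    )
```

Running a check like this on your renderer's output for both the reset and step contexts catches most renderer errors before an evaluation run.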