The OpenEnvEnv integration lets you run OpenEnv environments inside Prime Sandboxes using prebuilt container images. OpenEnv environments generate episodes deterministically from seeds and support two interaction contracts: step-based (gym) and tool-based (MCP).

Features

  • Gym and MCP contracts - Support for both step-based and tool-based environments
  • Prime Sandboxes integration - Runs in isolated containers
  • Seed-based episodes - Deterministic episode generation via seeds
  • Automatic image building - Build and register Docker images with vf-build
  • Custom prompt rendering - Convert observations to chat messages

Installation

Install with OpenEnv support:
uv add 'verifiers[openenv]'
This installs:
  • openenv-core - OpenEnv client library
  • prime-sandboxes - Prime Sandboxes SDK

Quick Start

Step 1: Create OpenEnv project

Create an OpenEnv environment project with a Dockerfile:
environments/my_openenv/
├── my_openenv.py       # Verifiers environment wrapper
├── pyproject.toml      # Package metadata
├── proj/               # OpenEnv project
│   ├── Dockerfile      # Environment container
│   ├── server.py       # OpenEnv server implementation
│   └── requirements.txt
└── README.md
Step 2: Build and register image

Build the Docker image and register it with Prime:
uv run vf-build my-openenv
This creates .build.json with image metadata:
{
  "image": "registry.prime.box/my-openenv:latest",
  "port": 8000,
  "start_command": "python server.py",
  "contract": "gym"
}
Step 3: Create Verifiers wrapper

Wrap your OpenEnv environment:
import verifiers as vf
from verifiers.envs.integrations.openenv_env import OpenEnvEnv

def render_prompt(observation, **kwargs):
    """Convert observation to chat messages."""
    return [{"role": "user", "content": str(observation)}]

def load_environment():
    return OpenEnvEnv(
        openenv_project="./proj",
        prompt_renderer=render_prompt,
        num_train_examples=100,
        num_eval_examples=50,
        seed=0,
    )
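Because render_prompt is plain Python, you can sanity-check its output shape before wiring it into the environment; the observation dict below is purely illustrative:

```python
def render_prompt(observation, **kwargs):
    """Convert an observation to chat messages (same renderer as above)."""
    return [{"role": "user", "content": str(observation)}]

# Hypothetical observation payload, used only to check the message shape.
messages = render_prompt({"board": "...", "turn": 1})
assert messages, "renderer must return a non-empty list"
assert messages[0]["role"] == "user"
```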
Step 4: Evaluate

Run an evaluation:
prime eval run my-openenv -m openai/gpt-4.1-mini -n 5

Gym Contract

For gym-style environments with reset() and step(action) methods:

Server Implementation

# proj/server.py
from fastapi import FastAPI
import gymnasium as gym

app = FastAPI()
env = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "string"},
        "action": {
            "type": "object",
            "properties": {
                "action": {"type": "string"}
            },
            "required": ["action"]
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global env
    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=seed)
    return {
        "observation": str(obs),
        "info": info,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    obs, reward, terminated, truncated, info = env.step(
        int(action["action"])
    )
    return {
        "observation": str(obs),
        "reward": reward,
        "done": terminated or truncated,
        "info": info
    }

Dockerfile

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY server.py .

CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]

Build Manifest

{
  "image": "registry.prime.box/my-gym-env:latest",
  "port": 8000,
  "start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
  "contract": "gym"
}

Prompt Renderer for Gym

def render_gym_prompt(observation, context, **kwargs):
    """Render gym observations as messages.
    
    Args:
        observation: The observation from reset/step
        context: 'reset' or 'step'
    """
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Environment reset. Observation: {observation}"
        }]
    else:
        return [{
            "role": "user",
            "content": f"Observation: {observation}"
        }]
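The two contexts are easy to exercise directly; this snippet condenses the renderer above so it runs standalone:

```python
def render_gym_prompt(observation, context, **kwargs):
    """Condensed version of the renderer above."""
    prefix = "Environment reset. " if context == "reset" else ""
    return [{"role": "user", "content": f"{prefix}Observation: {observation}"}]

assert render_gym_prompt("[0.0]", "reset")[0]["content"].startswith("Environment reset.")
assert render_gym_prompt("[0.0]", "step")[0]["content"] == "Observation: [0.0]"
```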

MCP Contract

For tool-based environments using the Model Context Protocol:

Server Implementation

# proj/server.py
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
game_state = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "object"},
        "action": {
            "type": "object",
            "properties": {
                "type": {"enum": ["call_tool"]},
                "tool_name": {"type": "string"},
                "arguments": {"type": "object"}
            }
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global game_state
    game_state = initialize_game(seed)  # your game's setup logic (not shown)
    return {
        "observation": game_state,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    tool_name = action["tool_name"]
    args = action["arguments"]
    
    if tool_name == "make_move":
        result = make_move(game_state, args["move"])
        reward = result["reward"]
        done = result["game_over"]
        
        return {
            "observation": result,
            "reward": reward,
            "done": done
        }
    
    return {
        "observation": {"error": "Unknown tool"},
        "reward": 0.0,
        "done": False
    }

@app.post("/mcp")
def mcp(request: dict):
    """MCP protocol endpoint."""
    method = request["data"]["method"]
    
    if method == "tools/list":
        return {
            "data": {
                "jsonrpc": "2.0",
                "id": request["data"]["id"],
                "result": {
                    "tools": [
                        {
                            "name": "make_move",
                            "description": "Make a move in the game",
                            "inputSchema": {
                                "type": "object",
                                "properties": {
                                    "move": {"type": "string"}
                                },
                                "required": ["move"]
                            }
                        }
                    ]
                }
            }
        }

Build Manifest for MCP

{
  "image": "registry.prime.box/my-mcp-env:latest",
  "port": 8000,
  "start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
  "contract": "mcp"
}

Prompt Renderer for MCP

MCP environments inject tool definitions automatically, so the prompt renderer only needs to format observations:
def render_mcp_prompt(observation, context, **kwargs):
    """Render MCP observations as messages."""
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Game started. State: {observation}"
        }]
    else:
        # Tool responses are handled automatically
        return []

OpenEnvEnv Configuration

OpenEnvEnv(
    openenv_project="./proj",              # Path to OpenEnv project
    prompt_renderer=render_prompt,          # Required: observation -> messages
    num_train_examples=100,                 # Training seeds
    num_eval_examples=50,                   # Evaluation seeds
    seed=0,                                 # Base seed
    max_turns=10,                           # Max turns per episode
    rubric=custom_rubric,                   # Optional custom rubric
    startup_timeout_seconds=30,             # Container startup timeout
    startup_poll_interval_seconds=1.0,      # Health check interval
)

Custom Rubrics

By default, OpenEnvEnv uses OpenEnvEpisodicSumRubric which sums per-step rewards. Create custom rubrics for different scoring:
import verifiers as vf

async def final_score(state: vf.State) -> float:
    """Use only the final step reward."""
    trajectory = state.get("trajectory", [])
    if not trajectory:
        return 0.0
    return float(trajectory[-1].get("reward", 0.0))

custom_rubric = vf.Rubric(funcs=[final_score])

env = OpenEnvEnv(
    openenv_project="./proj",
    prompt_renderer=render_prompt,
    rubric=custom_rubric,
)
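Another common scoring choice is the mean per-step reward. The scoring function is just an async function over the trajectory, so it can be checked against a hand-built state dict (the fake below stands in for a real State):

```python
import asyncio

async def mean_reward(state) -> float:
    """Average per-step reward across the trajectory (0.0 if empty)."""
    trajectory = state.get("trajectory", [])
    if not trajectory:
        return 0.0
    return sum(float(s.get("reward", 0.0)) for s in trajectory) / len(trajectory)

# Hand-built stand-in for a real State, just for the check.
fake = {"trajectory": [{"reward": 1.0}, {"reward": 0.0}, {"reward": 0.5}]}
assert asyncio.run(mean_reward(fake)) == 0.5
```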

Examples

See example OpenEnv integrations in the Verifiers repository.

Building Images

The vf-build command:
  1. Builds the Docker image from proj/Dockerfile
  2. Tags it with your environment name
  3. Pushes to Prime registry
  4. Creates .build.json with metadata
# Build from default path
uv run vf-build my-openenv

# Build from custom path
uv run vf-build my-openenv -p /path/to/environments

# Rebuild without cache
uv run vf-build my-openenv --no-cache

Best Practices

The prompt_renderer is required and must return non-empty chat messages. OpenEnv makes no assumptions about how observations should be presented to the model.
  • Health checks - Implement /health endpoint that returns 200 when ready
  • Schema validation - Return proper JSON schema from /schema
  • Error handling - Return errors in observation field, not as HTTP errors
  • Deterministic resets - Use the seed parameter for reproducible episodes
  • Action validation - Validate actions match your schema before processing
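The deterministic-resets point is worth verifying directly: the same seed must always yield the same initial observation. A minimal sketch of the pattern (illustrative, not from the OpenEnv codebase):

```python
import random

def reset(seed: int = 0) -> dict:
    """Deterministic reset: derive all initial state from a seeded RNG."""
    rng = random.Random(seed)
    obs = [round(rng.uniform(-0.05, 0.05), 4) for _ in range(4)]
    return {"observation": str(obs), "info": {}, "done": False}

assert reset(seed=7) == reset(seed=7)   # same seed, same episode
assert reset(seed=7) != reset(seed=8)   # different seed, different episode
```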

Troubleshooting

Sandbox Not Starting

Check logs:
uv run vf-build my-openenv --verbose
Common issues:
  • Missing /health endpoint
  • Server not binding to 0.0.0.0
  • Port mismatch in .build.json
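When debugging startup, you can reproduce the health polling the integration performs (per startup_timeout_seconds and startup_poll_interval_seconds) with a few lines of stdlib Python; this is a sketch, and the real client's behavior may differ:

```python
import time
import urllib.request

def wait_healthy(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll a /health endpoint until it answers 200, or give up at timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # not up yet: connection refused, DNS failure, timeout, ...
        time.sleep(interval)
    return False
```

For example, wait_healthy("http://localhost:8000/health") mirrors the defaults shown in the configuration above.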

Contract Mismatch

If you see “contract mismatch” errors:
  • Verify .build.json has correct contract field
  • For gym: action schema should not have tool_name field
  • For MCP: action schema should have tool_name and arguments fields
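The distinction in the last two bullets can be checked mechanically against your /schema response. A heuristic sketch (not an official validator):

```python
def infer_contract(action_schema: dict) -> str:
    """Heuristic: MCP action schemas carry tool_name/arguments; gym ones do not."""
    props = action_schema.get("properties", {})
    if "tool_name" in props and "arguments" in props:
        return "mcp"
    return "gym"

assert infer_contract({"properties": {"action": {"type": "string"}}}) == "gym"
assert infer_contract({"properties": {"tool_name": {}, "arguments": {}}}) == "mcp"
```

Compare the result against the contract field in your .build.json.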

Prompt Renderer Errors

Prompt renderer must:
  • Accept observation as first argument
  • Return list of message dicts with role and content
  • Return a non-empty list (exception: MCP step contexts may return [], since tool responses are injected automatically)
  • Handle both reset and step contexts
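These requirements can be encoded as a quick self-check to run before deploying a renderer. This helper is a convenience sketch, not part of the library, and applies to gym-style renderers (MCP step contexts may legitimately return an empty list):

```python
def check_renderer(renderer) -> None:
    """Assert a renderer meets the contract above, for both contexts."""
    for context in ("reset", "step"):
        msgs = renderer("dummy observation", context=context)
        assert isinstance(msgs, list) and msgs, f"empty output for {context}"
        for m in msgs:
            assert "role" in m and "content" in m, f"malformed message: {m}"

def my_renderer(observation, context=None, **kwargs):
    return [{"role": "user", "content": str(observation)}]

check_renderer(my_renderer)  # passes silently; raises AssertionError on a violation
```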
