The `OpenEnvEnv` integration lets you run OpenEnv environments (supporting both gym-style and MCP contracts) inside Prime Sandboxes using prebuilt container images.
OpenEnv environments use seed-based episode generation and support both step-based (gym) and tool-based (MCP) interaction protocols.
## Features

- **Gym and MCP contracts** - Support for both step-based and tool-based environments
- **Prime Sandboxes integration** - Runs in isolated containers
- **Seed-based episodes** - Deterministic episode generation via seeds
- **Automatic image building** - Build and register Docker images with `vf-build`
- **Custom prompt rendering** - Convert observations to chat messages
## Installation

Install with OpenEnv support:

```bash
uv add 'verifiers[openenv]'
```

This installs:

- `openenv-core` - OpenEnv client library
- `prime-sandboxes` - Prime Sandboxes SDK
## Quick Start

### Create OpenEnv project

Create an OpenEnv environment project with a Dockerfile:

```text
environments/my_openenv/
├── my_openenv.py        # Verifiers environment wrapper
├── pyproject.toml       # Package metadata
├── proj/                # OpenEnv project
│   ├── Dockerfile       # Environment container
│   ├── server.py        # OpenEnv server implementation
│   └── requirements.txt
└── README.md
```
### Build and register image

Build the Docker image and register it with Prime:

```bash
uv run vf-build my-openenv
```

This creates `.build.json` with image metadata:

```json
{
  "image": "registry.prime.box/my-openenv:latest",
  "port": 8000,
  "start_command": "python server.py",
  "contract": "gym"
}
```
### Create Verifiers wrapper

Wrap your OpenEnv environment:

```python
import verifiers as vf
from verifiers.envs.integrations.openenv_env import OpenEnvEnv

def render_prompt(observation, **kwargs):
    """Convert observation to chat messages."""
    return [{"role": "user", "content": str(observation)}]

def load_environment():
    return OpenEnvEnv(
        openenv_project="./proj",
        prompt_renderer=render_prompt,
        num_train_examples=100,
        num_eval_examples=50,
        seed=0,
    )
```
### Evaluate

Run an evaluation:

```bash
prime eval run my-openenv -m openai/gpt-4.1-mini -n 5
```
## Gym Contract

For gym-style environments with `reset()` and `step(action)` methods:
### Server Implementation

```python
# proj/server.py
from fastapi import FastAPI
import gymnasium as gym

app = FastAPI()
env = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "string"},
        "action": {
            "type": "object",
            "properties": {
                "action": {"type": "string"}
            },
            "required": ["action"]
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global env
    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=seed)
    return {
        "observation": str(obs),
        "info": info,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    obs, reward, terminated, truncated, info = env.step(
        int(action["action"])
    )
    return {
        "observation": str(obs),
        "reward": reward,
        "done": terminated or truncated,
        "info": info
    }
```
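The contract above reduces to a reset-then-step loop driven by the client. A minimal in-process sketch of that interaction pattern (stub environment, no HTTP; all names here are illustrative, not part of the OpenEnv API):

```python
class StubGymEnv:
    """Tiny stand-in for the HTTP server: reset() once, then step() until done."""

    def __init__(self):
        self.t = 0

    def reset(self, seed=0):
        self.t = 0
        return {"observation": f"state(seed={seed})", "done": False}

    def step(self, action):
        self.t += 1
        # This toy episode ends after three steps
        return {"observation": f"state(t={self.t})", "reward": 1.0, "done": self.t >= 3}


env = StubGymEnv()
resp = env.reset(seed=42)
total_reward = 0.0
while not resp["done"]:
    resp = env.step({"action": "0"})
    total_reward += resp["reward"]
print(total_reward)  # 3.0
```

The real client performs the same loop over `/reset` and `/step`, which is why the `done` flag must be set reliably: it is the only signal that ends the episode.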
### Dockerfile

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY server.py .
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build Manifest
{
"image": "registry.prime.box/my-gym-env:latest",
"port": 8000,
"start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
"contract": "gym"
}
### Prompt Renderer for Gym

```python
def render_gym_prompt(observation, context, **kwargs):
    """Render gym observations as messages.

    Args:
        observation: The observation from reset/step
        context: 'reset' or 'step'
    """
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Environment reset. Observation: {observation}"
        }]
    else:
        return [{
            "role": "user",
            "content": f"Observation: {observation}"
        }]
```
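If observations are structured (e.g. dicts rather than strings), the same pattern extends naturally. A minimal sketch, assuming JSON-serializable observations (the function name is illustrative):

```python
import json

def render_dict_prompt(observation, context, **kwargs):
    """Pretty-print dict observations as JSON; fall back to str() otherwise."""
    if isinstance(observation, dict):
        body = json.dumps(observation, indent=2)
    else:
        body = str(observation)
    prefix = "Environment reset. Observation:" if context == "reset" else "Observation:"
    return [{"role": "user", "content": f"{prefix}\n{body}"}]

msgs = render_dict_prompt({"position": 0.1, "velocity": -0.2}, "reset")
```

Formatting structured state legibly tends to matter more than the exact wording: the model only sees what the renderer puts in the message content.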
## MCP Contract

For tool-based environments using the Model Context Protocol:
### Server Implementation

```python
# proj/server.py
from fastapi import FastAPI

app = FastAPI()
game_state = None

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/schema")
def schema():
    return {
        "observation": {"type": "object"},
        "action": {
            "type": "object",
            "properties": {
                "type": {"enum": ["call_tool"]},
                "tool_name": {"type": "string"},
                "arguments": {"type": "object"}
            }
        }
    }

@app.post("/reset")
def reset(seed: int = 0):
    global game_state
    game_state = initialize_game(seed)  # game-specific setup (not shown)
    return {
        "observation": game_state,
        "done": False
    }

@app.post("/step")
def step(action: dict):
    tool_name = action["tool_name"]
    args = action["arguments"]
    if tool_name == "make_move":
        result = make_move(game_state, args["move"])  # game-specific logic (not shown)
        reward = result["reward"]
        done = result["game_over"]
        return {
            "observation": result,
            "reward": reward,
            "done": done
        }
    return {
        "observation": {"error": "Unknown tool"},
        "reward": 0.0,
        "done": False
    }

@app.post("/mcp")
def mcp(request: dict):
    """MCP protocol endpoint."""
    method = request["data"]["method"]
    if method == "tools/list":
        return {
            "data": {
                "jsonrpc": "2.0",
                "id": request["data"]["id"],
                "result": {
                    "tools": [
                        {
                            "name": "make_move",
                            "description": "Make a move in the game",
                            "inputSchema": {
                                "type": "object",
                                "properties": {
                                    "move": {"type": "string"}
                                },
                                "required": ["move"]
                            }
                        }
                    ]
                }
            }
        }
```
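The `/step` handler above branches on `tool_name`; as more tools accumulate, a lookup table keeps the dispatch flat. A standalone sketch of that pattern (the helper and tool names are illustrative):

```python
def dispatch_tool(action: dict, tools: dict) -> dict:
    """Route an MCP-style call_tool action to a registered handler."""
    handler = tools.get(action.get("tool_name"))
    if handler is None:
        return {"observation": {"error": "Unknown tool"}, "reward": 0.0, "done": False}
    return handler(**action.get("arguments", {}))


TOOLS = {
    # Each handler receives the action's arguments as keyword arguments
    "make_move": lambda move: {"observation": {"last_move": move}, "reward": 1.0, "done": False},
}

result = dispatch_tool(
    {"type": "call_tool", "tool_name": "make_move", "arguments": {"move": "e2e4"}},
    TOOLS,
)
```

New tools then only need an entry in the table plus a matching schema in the `tools/list` response.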
### Build Manifest for MCP

```json
{
  "image": "registry.prime.box/my-mcp-env:latest",
  "port": 8000,
  "start_command": "uvicorn server:app --host 0.0.0.0 --port 8000",
  "contract": "mcp"
}
```
### Prompt Renderer for MCP

MCP environments automatically inject tool definitions, so the prompt renderer only needs to format observations:

```python
def render_mcp_prompt(observation, context, **kwargs):
    """Render MCP observations as messages."""
    if context == "reset":
        return [{
            "role": "user",
            "content": f"Game started. State: {observation}"
        }]
    else:
        # Tool responses are handled automatically
        return []
```
## OpenEnvEnv Configuration

```python
OpenEnvEnv(
    openenv_project="./proj",            # Path to OpenEnv project
    prompt_renderer=render_prompt,       # Required: observation -> messages
    num_train_examples=100,              # Training seeds
    num_eval_examples=50,                # Evaluation seeds
    seed=0,                              # Base seed
    max_turns=10,                        # Max turns per episode
    rubric=custom_rubric,                # Optional custom rubric
    startup_timeout_seconds=30,          # Container startup timeout
    startup_poll_interval_seconds=1.0,   # Health check interval
)
```
## Custom Rubrics

By default, `OpenEnvEnv` uses `OpenEnvEpisodicSumRubric`, which sums per-step rewards. Create custom rubrics for different scoring:

```python
import verifiers as vf

async def final_score(state: vf.State) -> float:
    """Use only the final step reward."""
    trajectory = state.get("trajectory", [])
    if not trajectory:
        return 0.0
    return float(trajectory[-1].get("reward", 0.0))

custom_rubric = vf.Rubric(funcs=[final_score])

env = OpenEnvEnv(
    openenv_project="./proj",
    prompt_renderer=render_prompt,
    rubric=custom_rubric,
)
```
## Examples

See the example OpenEnv integrations in the Verifiers repository.
## Building Images

The `vf-build` command:

- Builds the Docker image from `proj/Dockerfile`
- Tags it with your environment name
- Pushes it to the Prime registry
- Creates `.build.json` with metadata

```bash
# Build from default path
uv run vf-build my-openenv

# Build from custom path
uv run vf-build my-openenv -p /path/to/environments

# Rebuild without cache
uv run vf-build my-openenv --no-cache
```
## Best Practices

The `prompt_renderer` is required and must return non-empty chat messages. OpenEnv makes no assumptions about how observations should be presented to the model.

- **Health checks** - Implement a `/health` endpoint that returns 200 when ready
- **Schema validation** - Return a proper JSON schema from `/schema`
- **Error handling** - Return errors in the `observation` field, not as HTTP errors
- **Deterministic resets** - Use the `seed` parameter for reproducible episodes
- **Action validation** - Validate that actions match your schema before processing
## Troubleshooting

### Sandbox Not Starting

Check logs:

```bash
uv run vf-build my-openenv --verbose
```

Common issues:

- Missing `/health` endpoint
- Server not binding to `0.0.0.0`
- Port mismatch in `.build.json`
### Contract Mismatch

If you see "contract mismatch" errors:

- Verify that `.build.json` has the correct `contract` field
- For gym: the action schema should not have a `tool_name` field
- For MCP: the action schema should have `tool_name` and `arguments` fields
### Prompt Renderer Errors

The prompt renderer must:

- Accept `observation` as its first argument
- Return a list of message dicts with `role` and `content`
- Return a non-empty list
- Handle both `reset` and `step` contexts
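These requirements can be checked mechanically before wiring a renderer into `OpenEnvEnv`; a small helper sketch (the name is illustrative, not part of the library):

```python
def check_renderer_output(messages) -> bool:
    """True if messages is a non-empty list of dicts that have role and content."""
    return (
        isinstance(messages, list)
        and len(messages) > 0
        and all(
            isinstance(m, dict) and "role" in m and "content" in m
            for m in messages
        )
    )
```

Running a check like this on your renderer's output for both the reset and step contexts catches most renderer errors before an evaluation run.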