MCPEnv

An environment that exposes MCP server tools to language models using the official MCP SDK.

MCPEnv is experimental: its API may change in breaking ways in future releases.

Overview

MCPEnv connects to MCP servers and exposes their tools to the model as callable functions. It manages:

MCP server lifecycle (connection, tool discovery, cleanup)
Persistent background event loops for server processes
Tool call routing and error handling
Concurrent multi-server support

MCPEnv is designed for globally available, read-only MCP servers where the same toolset can be shared across all rollouts. For per-rollout, stateful servers with mutable task-specific data, consider StatefulToolEnv or a custom environment.

Inheritance

Environment
└── MultiTurnEnv
    └── ToolEnv
        └── MCPEnv

Constructor

MCPEnv(
    mcp_servers: list[MCPServerConfig | dict] = [],
    max_turns: int = 10,
    error_formatter: Callable[[Exception], str] = lambda e: f"Error: {str(e)}",
    **kwargs
)

mcp_servers

list[MCPServerConfig | dict]

default:"[]"

List of MCP server configurations. Each entry can be an MCPServerConfig object or a dict with the keys name, command, args, env, and description.

max_turns

int

default:"10"

Maximum turns per rollout. Inherited from ToolEnv.

error_formatter

Callable[[Exception], str]

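Function that formats tool execution errors into the message shown to the model. The default formatter returns "Error: " followed by the exception message.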

**kwargs

Additional arguments passed to ToolEnv (dataset, rubric, system_prompt, etc.).

MCPServerConfig

@dataclass
class MCPServerConfig:
    name: str
    command: str
    args: list[str] | None = None
    env: dict[str, str] | None = None
    description: str = ""

name

str

required

Unique identifier for the server.

command

str

required

Executable command to start the MCP server (e.g., "uvx", "npx", "python").

args

list[str] | None

Command-line arguments for the server.

env

dict[str, str] | None

Environment variables to pass to the server process.

description

str

Human-readable description of the server’s purpose.
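
Since mcp_servers accepts both forms, a dict entry can be thought of as the keyword arguments of the dataclass. The following is a minimal sketch of that equivalence, not the library's own code: it re-declares a stand-in copy of MCPServerConfig so it runs without verifiers installed, and normalize_server is a hypothetical helper introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class MCPServerConfig:
    """Stand-in copy of the dataclass above, so the sketch runs without verifiers."""
    name: str
    command: str
    args: Optional[list] = None
    env: Optional[dict] = None
    description: str = ""


def normalize_server(entry: Union[MCPServerConfig, dict]) -> MCPServerConfig:
    """Hypothetical helper: accept either form and return a config object."""
    if isinstance(entry, MCPServerConfig):
        return entry
    # Unknown or missing keys raise TypeError, which surfaces typos early.
    return MCPServerConfig(**entry)


servers = [
    {"name": "web", "command": "uvx", "args": ["mcp-server-fetch"]},
    MCPServerConfig(name="memory", command="npx"),
]
configs = [normalize_server(s) for s in servers]
```

Under this assumption, omitted dict keys simply fall back to the dataclass defaults (None for args and env, an empty string for description).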

Example Usage

Basic Setup

import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPServerConfig

def load_environment():
    # Configure MCP servers
    servers = [
        MCPServerConfig(
            name="filesystem",
            command="npx",
            args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
            description="File system operations"
        ),
        MCPServerConfig(
            name="search",
            command="uvx",
            args=["mcp-server-brave-search"],
            env={"BRAVE_API_KEY": "your-key"},
            description="Web search via Brave"
        ),
    ]
    
    # Create dataset
    dataset = vf.Environment.make_dataset([
        {"question": "Search for recent news about AI"},
        {"question": "List files in /tmp"},
    ])
    
    def task_completed(completion: vf.Messages) -> float:
        """Simple completion reward."""
        return 1.0 if len(completion) > 0 else 0.0
    
    return vf.MCPEnv(
        mcp_servers=servers,
        dataset=dataset,
        rubric=vf.Rubric(task_completed),
        max_turns=5,
    )

Using Dict Configs

import verifiers as vf

def load_environment():
    servers = [
        {
            "name": "github",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": "ghp_..."},
        },
    ]
    
    dataset = vf.Environment.make_dataset([
        {"question": "List issues in repository owner/repo"},
    ])
    
    return vf.MCPEnv(
        mcp_servers=servers,
        dataset=dataset,
        rubric=vf.Rubric(lambda **kw: 1.0),
    )

Tool Discovery

MCPEnv automatically:

Connects to each server via stdio
Calls list_tools() to discover available tools
Wraps each tool in an MCPToolWrapper instance
Converts MCP tool schemas to vf.Tool format
Registers tools with the environment

Tools are available to the model immediately after initialization.
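
The discovery steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not MCPEnv's implementation: FakeSession is a hypothetical stand-in that mimics only the list_tools() and call_tool() calls of an MCP client session, and discover is an invented helper that plays the role of the wrapping and registration steps.

```python
import asyncio
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for an MCP client session (no real server involved)."""

    async def list_tools(self):
        tool = SimpleNamespace(
            name="read_file",
            description="Read file contents",
            inputSchema={"type": "object", "properties": {"path": {"type": "string"}}},
        )
        return SimpleNamespace(tools=[tool])

    async def call_tool(self, name, arguments):
        return f"{name} called with {arguments}"


async def discover(session):
    """Build a name -> async callable registry from the tools a server advertises."""
    listing = await session.list_tools()
    registry = {}
    for tool in listing.tools:
        # Bind the tool name as a default so each wrapper calls its own tool.
        async def wrapper(_name=tool.name, **arguments):
            return await session.call_tool(_name, arguments)

        wrapper.__name__ = tool.name
        wrapper.__doc__ = tool.description
        registry[tool.name] = wrapper
    return registry


registry = asyncio.run(discover(FakeSession()))
result = asyncio.run(registry["read_file"](path="/tmp/notes.txt"))
```

Keeping the registry keyed by tool name is one simple way to route a model's tool call to the right wrapper; how MCPEnv handles name collisions across servers is not covered here.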

Tool Call Flow

Error Handling

Tool execution errors are caught, formatted with error_formatter, and returned to the model as error messages:

import verifiers as vf

# Custom error formatting
def format_error(e: Exception) -> str:
    return f"Tool failed: {type(e).__name__}: {str(e)}"

env = vf.MCPEnv(
    mcp_servers=[...],
    error_formatter=format_error,
)
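
To make the effect of error_formatter concrete, here is a self-contained sketch of the catch-and-format pattern. It assumes nothing about MCPEnv internals: call_tool_safely and the failing read_file tool are hypothetical names used only for this example.

```python
import asyncio
from typing import Awaitable, Callable


def format_error(e: Exception) -> str:
    return f"Tool failed: {type(e).__name__}: {e}"


async def call_tool_safely(
    tool: Callable[..., Awaitable[str]],
    arguments: dict,
    error_formatter: Callable[[Exception], str],
) -> str:
    """Hypothetical helper: run a tool and turn any exception into a message."""
    try:
        return await tool(**arguments)
    except Exception as e:
        return error_formatter(e)


async def read_file(path: str) -> str:
    """Toy tool that always fails, to exercise the error path."""
    raise ValueError(f"no such file: {path}")


message = asyncio.run(
    call_tool_safely(read_file, {"path": "/tmp/missing.txt"}, format_error)
)
print(message)  # Tool failed: ValueError: no such file: /tmp/missing.txt
```

Including the exception type in the message, as format_error does, gives the model a better chance of correcting its next call than a bare error string.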

Lifecycle Management

MCPEnv runs MCP servers in a persistent background event loop that starts during __init__ and automatically cleans up on exit.
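
The general pattern of a persistent background event loop can be sketched with the standard library alone. This is an assumption-laden illustration rather than MCPEnv's code: BackgroundLoop and connect are hypothetical names, and the sleep stands in for starting a server process and listing its tools.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical helper: one event loop running forever on a daemon thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def run(self, coro, timeout=10.0):
        """Submit a coroutine from synchronous code and block for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=timeout)

    def stop(self):
        """Stop the loop, wait for the thread to finish, then release the loop."""
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join(timeout=5.0)
        self.loop.close()


async def connect(name: str) -> str:
    await asyncio.sleep(0.01)  # stands in for starting a server and listing tools
    return f"{name} ready"


bg = BackgroundLoop()
statuses = [bg.run(connect(n)) for n in ("web", "memory")]
bg.stop()
```

Blocking on future.result() is what lets a synchronous constructor wait for asynchronous connections, and a single shared loop keeps every server connection alive between tool calls.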

Server Connection

Servers connect during environment initialization (blocking)
Connection failures raise immediately
Tools are registered once servers are ready

Cleanup

Cleanup is automatic via atexit hooks, and can also be triggered explicitly:

import verifiers as vf

# Servers are disconnected when:
# 1. Python process exits
# 2. Environment is garbage collected
# 3. Manually via await env.cleanup()

env = vf.MCPEnv(mcp_servers=[...])
# ... use environment ...
await env.cleanup()  # Optional: explicit cleanup; must run inside an async function

Multi-Server Example

import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPServerConfig

def load_environment():
    servers = [
        MCPServerConfig(
            name="web",
            command="uvx",
            args=["mcp-server-fetch"],
            description="Web page fetching"
        ),
        MCPServerConfig(
            name="memory",
            command="npx",
            args=["-y", "@modelcontextprotocol/server-memory"],
            description="Knowledge graph memory"
        ),
        MCPServerConfig(
            name="postgres",
            command="npx",
            args=["-y", "@modelcontextprotocol/server-postgres"],
            env={"POSTGRES_URL": "postgresql://..."},
            description="Database queries"
        ),
    ]
    
    dataset = vf.Environment.make_dataset([
        {"question": "Research topic X and store findings in memory"},
        {"question": "Query the database for recent entries"},
    ])
    
    return vf.MCPEnv(
        mcp_servers=servers,
        dataset=dataset,
        rubric=vf.Rubric(lambda **kw: 1.0),
        max_turns=15,
    )

Tool Schema Conversion

MCP tools are automatically converted to Verifiers tool format:

# MCP tool schema
{
    "name": "read_file",
    "description": "Read file contents",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string"}
        },
        "required": ["path"]
    }
}

# Converted to vf.Tool
Tool(
    name="read_file",
    description="Read file contents",
    parameters={
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
    }
)
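
The mapping shown above amounts to renaming inputSchema to parameters and copying the other fields. A minimal sketch of such a converter follows; the Tool dataclass here is a stand-in for illustration rather than the real vf.Tool, and convert_tool and its fallback to an empty object schema are assumptions, not documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """Stand-in for the converted tool shown above; not the real vf.Tool class."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)


def convert_tool(mcp_tool: dict) -> Tool:
    """Hypothetical converter: rename inputSchema to parameters, copy the rest."""
    return Tool(
        name=mcp_tool["name"],
        description=mcp_tool.get("description") or "",
        # Assumed fallback: a tool without a schema takes no arguments.
        parameters=mcp_tool.get("inputSchema") or {"type": "object", "properties": {}},
    )


mcp_tool = {
    "name": "read_file",
    "description": "Read file contents",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}
tool = convert_tool(mcp_tool)
```

Because MCP input schemas are already JSON Schema objects, the conversion needs no per-property translation in this sketch.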

Debugging

Enable detailed MCP logging:

import logging

import verifiers as vf

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("verifiers.envs.experimental.mcp_env")
logger.setLevel(logging.DEBUG)

env = vf.MCPEnv(mcp_servers=[...])
# Logs server connections, tool registrations, and calls

Limitations

Global servers only: Not designed for per-rollout stateful servers
Stdio only: Uses stdio transport (not SSE or other protocols)
No streaming: Tool results are returned as complete strings
Single event loop: All servers share one background event loop

When to Use

Use MCPEnv when:

You have existing MCP servers with read-only tools
Tools can be shared across all rollouts
You need multi-server tool composition

For stateful, per-rollout tools, use StatefulToolEnv instead.
