
Overview

The MCPAdapter enables GEPA to optimize Model Context Protocol (MCP) tool usage. It supports:
  • Local servers via stdio (Python, Node.js)
  • Remote servers via SSE or StreamableHTTP
  • Multi-tool optimization (several tool descriptions refined simultaneously)
  • Two-pass workflow for better tool integration
  • Tool description optimization to improve model understanding
  • System prompt optimization for better tool usage guidance

Installation

pip install gepa mcp

Quick Start

Local Server (Ollama - No Cost)

import gepa
from gepa.adapters.mcp_adapter import MCPAdapter
from mcp import StdioServerParameters

# Configure local MCP filesystem server
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

# Create adapter with LOCAL Ollama model (FREE)
adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",  # Local model
    metric_fn=lambda item, output: 1.0 if item["reference_answer"] in output else 0.0
)

# Prepare dataset
dataset = [
    {
        "user_query": "What's in notes.txt?",
        "tool_arguments": {"path": "/tmp/notes.txt"},
        "reference_answer": "Meeting at 3pm",
        "additional_context": {}
    },
    # ... more examples
]

# Optimize with local models - $0.00 cost!
result = gepa.optimize(
    seed_candidate={"tool_description": "Read file contents from disk"},
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="ollama/llama3.1:8b",  # Larger local model for reflection
    max_metric_calls=150
)

print("Optimized tool description:", result.best_candidate["tool_description"])

OpenAI API

adapter = MCPAdapter(
    server_params=server_params,
    tool_names="read_file",
    task_model="openai/gpt-4o-mini",  # OpenAI model
    metric_fn=my_metric
)

result = gepa.optimize(
    seed_candidate={"tool_description": "Read file contents"},
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="openai/gpt-4",
    max_metric_calls=150
)

Class Signature

Defined in src/gepa/adapters/mcp_adapter/mcp_adapter.py:94:
class MCPAdapter(GEPAAdapter[MCPDataInst, MCPTrajectory, MCPOutput]):
    def __init__(
        self,
        tool_names: str | list[str],
        task_model: str | Callable,
        metric_fn: Callable[[MCPDataInst, str], float],
        # Local server configuration
        server_params: StdioServerParameters | None = None,
        # Remote server configuration
        remote_url: str | None = None,
        remote_transport: str = "sse",
        remote_headers: dict[str, str] | None = None,
        remote_timeout: float = 30,
        # Adapter configuration
        base_system_prompt: str = "You are a helpful assistant with access to tools.",
        enable_two_pass: bool = True,
        failure_score: float = 0.0,
    )

Parameters

tool_names
str | list[str]
required
Name(s) of tool(s) to optimize:
  • Single tool: "read_file"
  • Multiple tools: ["read_file", "write_file", "list_files"]
task_model
str | Callable
required
Model for task execution:
  • LiteLLM string: "openai/gpt-4o-mini", "ollama/llama3.2:1b"
  • Custom callable: (messages: list[dict]) -> str
metric_fn
Callable[[MCPDataInst, str], float]
required
Scoring function:
def metric(item: MCPDataInst, output: str) -> float:
    return 1.0 if item["reference_answer"] in output else 0.0
server_params
StdioServerParameters | None
default:"None"
Local MCP server configuration (stdio transport):
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="python",
    args=["my_server.py"]
)
Required if not using remote server.
remote_url
str | None
default:"None"
Remote MCP server URL:
  • SSE: "https://mcp-server.com/sse"
  • HTTP: "https://mcp-server.com/mcp"
Required if not using local server.
remote_transport
str
default:"'sse'"
Remote transport protocol:
  • "sse": Server-Sent Events (streaming)
  • "streamable_http": HTTP with session management
remote_headers
dict[str, str] | None
default:"None"
HTTP headers for remote servers:
remote_headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-API-Key": "your-key"
}
remote_timeout
float
default:"30"
Timeout for remote HTTP operations (seconds).
base_system_prompt
str
default:"'You are a helpful assistant with access to tools.'"
Base system prompt template. Can be optimized if included in seed_candidate.
enable_two_pass
bool
default:"True"
Use two-pass workflow:
  • Pass 1: Model calls tool
  • Pass 2: Model uses tool response to generate final answer
Disable for simpler single-pass evaluation.
failure_score
float
default:"0.0"
Score assigned when evaluation fails (tool errors, parsing errors).

Data Types

MCPDataInst

Input data structure (src/gepa/adapters/mcp_adapter/mcp_adapter.py:34):
class MCPDataInst(TypedDict):
    user_query: str                 # User's question or request
    tool_arguments: dict[str, Any]  # Expected tool arguments
    reference_answer: str | None    # Reference answer for scoring
    additional_context: dict[str, str]  # Optional context

MCPTrajectory

Execution trace (src/gepa/adapters/mcp_adapter/mcp_adapter.py:51):
class MCPTrajectory(TypedDict):
    user_query: str                 # Original query
    tool_names: list[str]           # Available tools
    selected_tool: str | None       # Tool selected by model
    tool_called: bool               # Whether tool was called
    tool_arguments: dict | None     # Arguments passed to tool
    tool_response: str | None       # Tool's response
    tool_description_used: str      # Tool description
    system_prompt_used: str         # System prompt
    model_first_pass_output: str    # Model's first response
    model_final_output: str         # Model's final answer
    score: float                    # Evaluation score

MCPOutput

Final output (src/gepa/adapters/mcp_adapter/mcp_adapter.py:72):
class MCPOutput(TypedDict):
    final_answer: str               # Final answer from model
    tool_called: bool               # Whether tool was called
    selected_tool: str | None       # Which tool was selected
    tool_response: str | None       # Tool's response

Multi-Tool Support

Single Tool

adapter = MCPAdapter(
    tool_names="read_file",  # Single tool
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params
)

seed_candidate = {
    "tool_description": "Read file contents from disk"
}

Multiple Tools

adapter = MCPAdapter(
    tool_names=["read_file", "write_file", "list_files"],  # Multiple tools
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params
)

seed_candidate = {
    "tool_description_read_file": "Read file contents",
    "tool_description_write_file": "Write content to file",
    "tool_description_list_files": "List files in directory"
}

# GEPA will optimize each tool description independently
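For larger tool sets, the per-tool keys can be built programmatically. A minimal sketch, following the `tool_description_<tool_name>` key convention shown above (the descriptions here are placeholders):

```python
# Starting descriptions for each tool (placeholders for illustration).
tool_descriptions = {
    "read_file": "Read file contents",
    "write_file": "Write content to file",
    "list_files": "List files in directory",
}

# Build the seed candidate using the "tool_description_<tool_name>" convention.
seed_candidate = {
    f"tool_description_{name}": description
    for name, description in tool_descriptions.items()
}
```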

Optimizable Components

Tool Description

Optimizes the tool’s description field:
seed_candidate = {
    "tool_description": "Read file contents"
}

# GEPA might evolve this to:
# "Read file contents from the filesystem. Use when user asks to view,
#  show, or display file contents. Returns the full text content of the
#  specified file path. Requires 'path' parameter with absolute or relative
#  file path."

System Prompt

Optimizes guidance for tool usage:
seed_candidate = {
    "tool_description": "Read file contents",
    "system_prompt": "You are a helpful assistant with file access."
}

# GEPA optimizes both jointly

Two-Pass Workflow

The adapter uses a two-pass approach:

Pass 1: Tool Call Decision

  1. Model receives user query and tool information
  2. Model decides whether to call tool
  3. If yes: Model outputs JSON with tool name and arguments
  4. Tool is executed, response captured

Pass 2: Final Answer Generation

  1. Model receives original query + tool response
  2. Model generates final answer incorporating tool results
  3. Final answer is evaluated against reference

Expected JSON Format

The model should respond with one of two JSON actions.
Call a tool:
{"action": "call_tool", "tool": "read_file", "arguments": {"path": "/tmp/notes.txt"}}
Answer directly:
{"action": "answer", "text": "The answer is..."}

Disabling Two-Pass

For simpler single-pass evaluation:
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params,
    enable_two_pass=False  # Single pass only
)

Methods

evaluate()

Evaluates candidate on batch using MCP tools.
def evaluate(
    self,
    batch: list[MCPDataInst],
    candidate: dict[str, str],
    capture_traces: bool = False,
) -> EvaluationBatch[MCPTrajectory, MCPOutput]
Implementation: src/gepa/adapters/mcp_adapter/mcp_adapter.py:187

Behavior

  1. Creates MCP client session (local or remote)
  2. Retrieves available tools from server
  3. For each example:
    • First pass: Model calls tool (if needed)
    • Second pass: Model generates final answer (if two-pass enabled)
  4. Scores outputs using metric_fn
  5. Captures trajectories if capture_traces=True
  6. Closes MCP session

make_reflective_dataset()

Builds reflective dataset for instruction refinement.
def make_reflective_dataset(
    self,
    candidate: dict[str, str],
    eval_batch: EvaluationBatch[MCPTrajectory, MCPOutput],
    components_to_update: list[str],
) -> dict[str, list[dict[str, Any]]]
Implementation: src/gepa/adapters/mcp_adapter/mcp_adapter.py:602

Returns

{
    "tool_description": [
        {
            "Inputs": {
                "user_query": "What's in config.json?",
                "tool_description": "Read file contents"
            },
            "Generated Outputs": {
                "tool_called": True,
                "selected_tool": "read_file",
                "tool_arguments": {"path": "config.json"},
                "final_answer": "The config file contains..."
            },
            "Feedback": "Good! Tool was used appropriately. Score: 0.85"
        },
        # ... more examples
    ]
}

Local Server Examples

Filesystem Server (Node.js)

from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",
    metric_fn=my_metric
)

Custom Python Server

Create my_server.py:
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("MyServer")

@mcp.tool()
def search_docs(query: str) -> str:
    """Search documentation."""
    # Your search logic
    return f"Results for: {query}"

if __name__ == "__main__":
    mcp.run()
Use in GEPA:
server_params = StdioServerParameters(
    command="python",
    args=["my_server.py"]
)

adapter = MCPAdapter(
    server_params=server_params,
    tool_names="search_docs",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric
)

Remote Server Examples

Public SSE Server

adapter = MCPAdapter(
    tool_names="search_web",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    remote_url="https://public-mcp.example.com/sse",
    remote_transport="sse"
)

Authenticated HTTP Server

adapter = MCPAdapter(
    tool_names=["analyze_data", "visualize_data"],
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    remote_url="https://internal-mcp.company.com/mcp",
    remote_transport="streamable_http",
    remote_headers={
        "Authorization": "Bearer YOUR_API_TOKEN",
        "X-Custom-Header": "value"
    },
    remote_timeout=60
)

Custom Metric Functions

Exact Match

def exact_match(item: MCPDataInst, output: str) -> float:
    return 1.0 if item["reference_answer"] in output else 0.0

Fuzzy Matching

from difflib import SequenceMatcher

def fuzzy_match(item: MCPDataInst, output: str) -> float:
    ratio = SequenceMatcher(
        None,
        item["reference_answer"],
        output
    ).ratio()
    return ratio  # 0.0 to 1.0
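When word order or phrasing varies, token-level overlap can be more forgiving than character-level similarity. A sketch of a token-F1 metric (not part of the adapter; `MCPDataInst` is used here only as a type hint):

```python
def token_f1(item: "MCPDataInst", output: str) -> float:
    """Token-level F1 between the reference answer and the model output."""
    ref_tokens = set(item["reference_answer"].lower().split())
    out_tokens = set(output.lower().split())
    if not ref_tokens or not out_tokens:
        return 0.0
    common = ref_tokens & out_tokens
    if not common:
        return 0.0
    precision = len(common) / len(out_tokens)
    recall = len(common) / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```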

LLM-as-Judge

import litellm

def llm_judge(item: MCPDataInst, output: str) -> float:
    messages = [{
        "role": "user",
        "content": f"Rate this answer (0-1):\n"
                   f"Question: {item['user_query']}\n"
                   f"Reference: {item['reference_answer']}\n"
                   f"Answer: {output}"
    }]
    response = litellm.completion(
        model="openai/gpt-4o",
        messages=messages
    )
    return float(response.choices[0].message.content)

Complete Example

import gepa
from gepa.adapters.mcp_adapter import MCPAdapter
from mcp import StdioServerParameters

# 1. Configure MCP server
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

# 2. Prepare dataset
dataset = [
    {
        "user_query": "What's in notes.txt?",
        "tool_arguments": {"path": "/tmp/notes.txt"},
        "reference_answer": "Meeting at 3pm",
        "additional_context": {}
    },
    {
        "user_query": "List files in the config directory",
        "tool_arguments": {"path": "/tmp/config"},
        "reference_answer": "config.json",
        "additional_context": {}
    },
    # ... 30+ examples
]

# 3. Create adapter
adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",  # FREE local model
    metric_fn=lambda item, output: 1.0 if item["reference_answer"] in output else 0.0,
    enable_two_pass=True
)

# 4. Optimize
result = gepa.optimize(
    seed_candidate={
        "tool_description_read_file": "Read file contents",
        "tool_description_write_file": "Write to file",
        "tool_description_list_files": "List files"
    },
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="ollama/llama3.1:8b",  # FREE local reflection
    max_metric_calls=150
)

# 5. Review results
print("\nOptimized Tool Descriptions:")
for tool in ["read_file", "write_file", "list_files"]:
    key = f"tool_description_{tool}"
    print(f"\n{tool}:")
    print(result.best_candidate[key])

print(f"\nValidation Score: {result.best_score:.2f}")

Best Practices

  1. Dataset Quality: Provide 20+ examples covering different tool usage scenarios
  2. Tool Names: Use descriptive tool names that hint at functionality
  3. Reference Answers: Include key information expected in final answers
  4. Multi-Tool: When optimizing multiple tools, ensure examples use different tools
  5. Local Development: Use Ollama for free local development
  6. Production: Use OpenAI/Anthropic for production workloads

Performance Notes

Subprocess Overhead

Each evaluate() call spawns a new MCP server process:
  • Startup time: ~100-500ms per evaluation
  • Total overhead for 150 evaluations: ~15-75 seconds
This is expected behavior in the current implementation. Future improvements planned:
  • Session pooling (reuse processes)
  • Background event loop (persistent session)
  • Async GEPA core (native async support)

Cost Optimization

Free (Ollama):
# Total cost: $0.00
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="ollama/llama3.2:1b",  # Local
    ...
)
result = gepa.optimize(
    ...,
    reflection_lm="ollama/llama3.1:8b"  # Local
)
Low Cost (OpenAI):
# ~$0.50 for 150 evaluations
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="openai/gpt-4o-mini",  # $0.15/1M tokens
    ...
)
result = gepa.optimize(
    ...,
    reflection_lm="openai/gpt-4"  # For proposal only
)

Troubleshooting

Tool Not Found

# Error: Tools ['my_tool'] not found
# Solution: Check tool name matches server

# List available tools
import asyncio
from gepa.adapters.mcp_adapter.mcp_client import create_mcp_client

async def list_tools():
    client = create_mcp_client(server_params=server_params)
    await client.start()
    await client.initialize()
    tools = await client.list_tools()
    print([t['name'] for t in tools])
    await client.close()

asyncio.run(list_tools())

Model Not Calling Tool

Check that the system prompt includes clear instructions:
adapter = MCPAdapter(
    tool_names="read_file",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params,
    base_system_prompt="You are a helpful assistant. When asked about file contents, use the read_file tool."
)

JSON Parsing Errors

The model might not follow the JSON format. Add an explicit instruction to the seed candidate:
seed_candidate = {
    "tool_description": "Read file contents",
    "system_prompt": "Always respond with valid JSON. No other text."
}
