Overview
The MCPAdapter enables GEPA to optimize Model Context Protocol (MCP) tool usage. It supports:
- Local servers via stdio (Python, Node.js)
- Remote servers via SSE or StreamableHTTP
- Multi-tool optimization: several tool descriptions improved simultaneously
- Two-pass workflow for better tool integration
- Tool description optimization to improve model understanding
- System prompt optimization for better tool usage guidance
Installation
Quick Start
Local Server (Ollama - No Cost)
```python
import gepa
from gepa.adapters.mcp_adapter import MCPAdapter
from mcp import StdioServerParameters

# Configure local MCP filesystem server
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

# Create adapter with LOCAL Ollama model (FREE)
adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",  # Local model
    metric_fn=lambda item, output: 1.0 if item["reference_answer"] in output else 0.0
)

# Prepare dataset
dataset = [
    {
        "user_query": "What's in notes.txt?",
        "tool_arguments": {"path": "/tmp/notes.txt"},
        "reference_answer": "Meeting at 3pm",
        "additional_context": {}
    },
    # ... more examples
]

# Optimize with local models - $0.00 cost!
result = gepa.optimize(
    seed_candidate={"tool_description": "Read file contents from disk"},
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="ollama/llama3.1:8b",  # Larger local model for reflection
    max_metric_calls=150
)

print("Optimized tool description:", result.best_candidate["tool_description"])
```
OpenAI API
```python
adapter = MCPAdapter(
    server_params=server_params,
    tool_names="read_file",
    task_model="openai/gpt-4o-mini",  # OpenAI model
    metric_fn=my_metric
)

result = gepa.optimize(
    seed_candidate={"tool_description": "Read file contents"},
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="openai/gpt-4",
    max_metric_calls=150
)
```
Class Signature
Defined in src/gepa/adapters/mcp_adapter/mcp_adapter.py:94:
```python
class MCPAdapter(GEPAAdapter[MCPDataInst, MCPTrajectory, MCPOutput]):
    def __init__(
        self,
        tool_names: str | list[str],
        task_model: str | Callable,
        metric_fn: Callable[[MCPDataInst, str], float],
        # Local server configuration
        server_params: StdioServerParameters | None = None,
        # Remote server configuration
        remote_url: str | None = None,
        remote_transport: str = "sse",
        remote_headers: dict[str, str] | None = None,
        remote_timeout: float = 30,
        # Adapter configuration
        base_system_prompt: str = "You are a helpful assistant with access to tools.",
        enable_two_pass: bool = True,
        failure_score: float = 0.0,
    )
```
Parameters

tool_names
str | list[str]
required
Name(s) of tool(s) to optimize:
- Single tool: "read_file"
- Multiple tools: ["read_file", "write_file", "list_files"]

task_model
str | Callable
required
Model for task execution:
- LiteLLM string: "openai/gpt-4o-mini", "ollama/llama3.2:1b"
- Custom callable: (messages: list[dict]) -> str

metric_fn
Callable[[MCPDataInst, str], float]
required
Scoring function:

```python
def metric(item: MCPDataInst, output: str) -> float:
    return 1.0 if item["reference_answer"] in output else 0.0
```

server_params
StdioServerParameters | None
default: None
Local MCP server configuration (stdio transport):

```python
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="python",
    args=["my_server.py"]
)
```

Required if not using a remote server.

remote_url
str | None
default: None
Remote MCP server URL:
- SSE: "https://mcp-server.com/sse"
- HTTP: "https://mcp-server.com/mcp"
Required if not using a local server.

remote_transport
str
default: "sse"
Remote transport protocol:
- "sse": Server-Sent Events (streaming)
- "streamable_http": HTTP with session management

remote_headers
dict[str, str] | None
default: None
HTTP headers for remote servers:

```python
remote_headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-API-Key": "your-key"
}
```

remote_timeout
float
default: 30
Timeout for remote HTTP operations (seconds).

base_system_prompt
str
default: "You are a helpful assistant with access to tools."
Base system prompt template. Can be optimized if included in seed_candidate.

enable_two_pass
bool
default: True
Use the two-pass workflow:
- Pass 1: Model calls the tool
- Pass 2: Model uses the tool response to generate the final answer
Disable for simpler single-pass evaluation.

failure_score
float
default: 0.0
Score assigned when evaluation fails (tool errors, parsing errors).
Data Types
MCPDataInst
Input data structure (src/gepa/adapters/mcp_adapter/mcp_adapter.py:34):
```python
class MCPDataInst(TypedDict):
    user_query: str                     # User's question or request
    tool_arguments: dict[str, Any]      # Expected tool arguments
    reference_answer: str | None        # Reference answer for scoring
    additional_context: dict[str, str]  # Optional context
```
MCPTrajectory
Execution trace (src/gepa/adapters/mcp_adapter/mcp_adapter.py:51):
```python
class MCPTrajectory(TypedDict):
    user_query: str                 # Original query
    tool_names: list[str]           # Available tools
    selected_tool: str | None       # Tool selected by model
    tool_called: bool               # Whether tool was called
    tool_arguments: dict | None     # Arguments passed to tool
    tool_response: str | None       # Tool's response
    tool_description_used: str      # Tool description
    system_prompt_used: str         # System prompt
    model_first_pass_output: str    # Model's first response
    model_final_output: str         # Model's final answer
    score: float                    # Evaluation score
```
MCPOutput
Final output (src/gepa/adapters/mcp_adapter/mcp_adapter.py:72):
```python
class MCPOutput(TypedDict):
    final_answer: str          # Final answer from model
    tool_called: bool          # Whether tool was called
    selected_tool: str | None  # Which tool was selected
    tool_response: str | None  # Tool's response
```
Single vs. Multiple Tools
Single tool:

```python
adapter = MCPAdapter(
    tool_names="read_file",  # Single tool
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params
)

seed_candidate = {
    "tool_description": "Read file contents from disk"
}
```

Multiple tools:

```python
adapter = MCPAdapter(
    tool_names=["read_file", "write_file", "list_files"],  # Multiple tools
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params
)

seed_candidate = {
    "tool_description_read_file": "Read file contents",
    "tool_description_write_file": "Write content to file",
    "tool_description_list_files": "List files in directory"
}

# GEPA will optimize each tool description independently
```
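For many tools, the seed candidate can be built programmatically from the `tool_description_<tool_name>` key convention. The helper below is illustrative only (it is not part of GEPA's API):

```python
# Illustrative helper (not part of GEPA): build a multi-tool seed candidate
# from a mapping of tool name -> initial description, using the
# "tool_description_<tool_name>" key convention.
def make_seed_candidate(tool_descriptions: dict[str, str]) -> dict[str, str]:
    return {
        f"tool_description_{name}": desc
        for name, desc in tool_descriptions.items()
    }

seed_candidate = make_seed_candidate({
    "read_file": "Read file contents",
    "write_file": "Write content to file",
    "list_files": "List files in directory",
})
```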
Optimizable Components
Tool Description
Optimizes the tool’s description field:

```python
seed_candidate = {
    "tool_description": "Read file contents"
}

# GEPA might evolve this to:
# "Read file contents from the filesystem. Use when user asks to view,
# show, or display file contents. Returns the full text content of the
# specified file path. Requires 'path' parameter with absolute or relative
# file path."
```
System Prompt
Optimizes guidance for tool usage:

```python
seed_candidate = {
    "tool_description": "Read file contents",
    "system_prompt": "You are a helpful assistant with file access."
}

# GEPA optimizes both jointly
```
Two-Pass Workflow
The adapter uses a two-pass approach:
Pass 1: Tool Calling
- Model receives the user query and tool information
- Model decides whether to call a tool
- If yes: model outputs JSON with the tool name and arguments
- The tool is executed and its response captured
Pass 2: Final Answer Generation
- Model receives original query + tool response
- Model generates final answer incorporating tool results
- Final answer is evaluated against reference
The model should respond with one of two JSON forms.
Call tool:

```json
{"action": "call_tool", "tool": "read_file", "arguments": {"path": "/tmp/notes.txt"}}
```

Direct answer:

```json
{"action": "answer", "text": "The answer is..."}
```
Disabling Two-Pass
For simpler single-pass evaluation:
```python
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params,
    enable_two_pass=False  # Single pass only
)
```
Methods
evaluate()
Evaluates candidate on batch using MCP tools.
```python
def evaluate(
    self,
    batch: list[MCPDataInst],
    candidate: dict[str, str],
    capture_traces: bool = False,
) -> EvaluationBatch[MCPTrajectory, MCPOutput]
```
Implementation: src/gepa/adapters/mcp_adapter/mcp_adapter.py:187
Behavior:
- Creates an MCP client session (local or remote)
- Retrieves available tools from the server
- For each example:
  - First pass: model calls the tool (if needed)
  - Second pass: model generates the final answer (if two-pass is enabled)
- Scores outputs using metric_fn
- Captures trajectories if capture_traces=True
- Closes the MCP session
make_reflective_dataset()
Builds reflective dataset for instruction refinement.
```python
def make_reflective_dataset(
    self,
    candidate: dict[str, str],
    eval_batch: EvaluationBatch[MCPTrajectory, MCPOutput],
    components_to_update: list[str],
) -> dict[str, list[dict[str, Any]]]
```
Implementation: src/gepa/adapters/mcp_adapter/mcp_adapter.py:602
Returns
```python
{
    "tool_description": [
        {
            "Inputs": {
                "user_query": "What's in config.json?",
                "tool_description": "Read file contents"
            },
            "Generated Outputs": {
                "tool_called": True,
                "selected_tool": "read_file",
                "tool_arguments": {"path": "config.json"},
                "final_answer": "The config file contains..."
            },
            "Feedback": "Good! Tool was used appropriately. Score: 0.85"
        },
        # ... more examples
    ]
}
```
Local Server Examples
Filesystem Server (Node.js)
```python
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",
    metric_fn=my_metric
)
```
Custom Python Server
Create my_server.py:
```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("MyServer")

@mcp.tool()
def search_docs(query: str) -> str:
    """Search documentation."""
    # Your search logic
    return f"Results for: {query}"

if __name__ == "__main__":
    mcp.run()
```
Use in GEPA:
```python
server_params = StdioServerParameters(
    command="python",
    args=["my_server.py"]
)

adapter = MCPAdapter(
    server_params=server_params,
    tool_names="search_docs",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric
)
```
Remote Server Examples
Public SSE Server
```python
adapter = MCPAdapter(
    tool_names="search_web",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    remote_url="https://public-mcp.example.com/sse",
    remote_transport="sse"
)
```
Authenticated HTTP Server
```python
adapter = MCPAdapter(
    tool_names=["analyze_data", "visualize_data"],
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    remote_url="https://internal-mcp.company.com/mcp",
    remote_transport="streamable_http",
    remote_headers={
        "Authorization": "Bearer YOUR_API_TOKEN",
        "X-Custom-Header": "value"
    },
    remote_timeout=60
)
```
Custom Metric Functions
Substring Match
A binary score: 1.0 if the reference answer appears anywhere in the output.

```python
def substring_match(item: MCPDataInst, output: str) -> float:
    return 1.0 if item["reference_answer"] in output else 0.0
```
Fuzzy Matching
```python
from difflib import SequenceMatcher

def fuzzy_match(item: MCPDataInst, output: str) -> float:
    ratio = SequenceMatcher(
        None,
        item["reference_answer"],
        output
    ).ratio()
    return ratio  # 0.0 to 1.0
```
LLM-as-Judge
```python
import litellm

def llm_judge(item: MCPDataInst, output: str) -> float:
    messages = [{
        "role": "user",
        "content": f"Rate this answer from 0 to 1. Respond with only the number.\n"
                   f"Question: {item['user_query']}\n"
                   f"Reference: {item['reference_answer']}\n"
                   f"Answer: {output}"
    }]
    response = litellm.completion(
        model="openai/gpt-4o",
        messages=messages
    )
    try:
        return float(response.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # Judge did not return a bare number
```
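Keyword Coverage
Another common pattern (not shipped with the adapter, shown here as a sketch) gives partial credit for how many reference-answer words appear in the output. It assumes `reference_answer` is a short phrase:

```python
def keyword_coverage(item, output: str) -> float:
    # Fraction of reference-answer words (case-insensitive) found in the output.
    # Rough sketch: plain substring checks, so short words like "at" can
    # match inside other words.
    words = item["reference_answer"].lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in output.lower())
    return hits / len(words)
```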
Complete Example
```python
import gepa
from gepa.adapters.mcp_adapter import MCPAdapter
from mcp import StdioServerParameters

# 1. Configure MCP server
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)

# 2. Prepare dataset
dataset = [
    {
        "user_query": "What's in notes.txt?",
        "tool_arguments": {"path": "/tmp/notes.txt"},
        "reference_answer": "Meeting at 3pm",
        "additional_context": {}
    },
    {
        "user_query": "List files in the config directory",
        "tool_arguments": {"path": "/tmp/config"},
        "reference_answer": "config.json",
        "additional_context": {}
    },
    # ... 30+ examples
]

# 3. Create adapter
adapter = MCPAdapter(
    server_params=server_params,
    tool_names=["read_file", "write_file", "list_files"],
    task_model="ollama/llama3.2:1b",  # FREE local model
    metric_fn=lambda item, output: 1.0 if item["reference_answer"] in output else 0.0,
    enable_two_pass=True
)

# 4. Optimize
result = gepa.optimize(
    seed_candidate={
        "tool_description_read_file": "Read file contents",
        "tool_description_write_file": "Write to file",
        "tool_description_list_files": "List files"
    },
    trainset=dataset[:20],
    valset=dataset[20:],
    adapter=adapter,
    reflection_lm="ollama/llama3.1:8b",  # FREE local reflection
    max_metric_calls=150
)

# 5. Review results
print("\nOptimized Tool Descriptions:")
for tool in ["read_file", "write_file", "list_files"]:
    key = f"tool_description_{tool}"
    print(f"\n{tool}:")
    print(result.best_candidate[key])

print(f"\nValidation Score: {result.best_score:.2f}")
```
Best Practices
- Dataset Quality: Provide 20+ examples covering different tool usage scenarios
- Tool Names: Use descriptive tool names that hint at functionality
- Reference Answers: Include key information expected in final answers
- Multi-Tool: When optimizing multiple tools, ensure examples use different tools
- Local Development: Use Ollama for free local development
- Production: Use OpenAI/Anthropic for production workloads
Subprocess Overhead
Each evaluate() call spawns a new MCP server process:
- Startup time: ~100-500ms per evaluation
- Total overhead for 150 evaluations: ~15-75 seconds
This is expected behavior in the current implementation. Future improvements planned:
- Session pooling (reuse processes)
- Background event loop (persistent session)
- Async GEPA core (native async support)
Cost Optimization
Free (Ollama):
```python
# Total cost: $0.00
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="ollama/llama3.2:1b",  # Local
    ...
)

result = gepa.optimize(
    ...,
    reflection_lm="ollama/llama3.1:8b"  # Local
)
```
Low Cost (OpenAI):
```python
# ~$0.50 for 150 evaluations
adapter = MCPAdapter(
    tool_names="my_tool",
    task_model="openai/gpt-4o-mini",  # $0.15/1M tokens
    ...
)

result = gepa.optimize(
    ...,
    reflection_lm="openai/gpt-4"  # For proposal only
)
```
Troubleshooting
```python
# Error: Tools ['my_tool'] not found
# Solution: check that the tool name matches what the server exposes

# List available tools
import asyncio
from gepa.adapters.mcp_adapter.mcp_client import create_mcp_client

async def list_tools():
    client = create_mcp_client(server_params=server_params)
    await client.start()
    await client.initialize()
    tools = await client.list_tools()
    print([t['name'] for t in tools])
    await client.close()

asyncio.run(list_tools())
```
If the model never calls the tool, check that the system prompt includes clear instructions:

```python
adapter = MCPAdapter(
    tool_names="read_file",
    task_model="openai/gpt-4o-mini",
    metric_fn=my_metric,
    server_params=server_params,
    base_system_prompt="You are a helpful assistant. When asked about file contents, use the read_file tool."
)
```
JSON Parsing Errors
Model might not follow JSON format. Add to seed candidate:
```python
seed_candidate = {
    "tool_description": "Read file contents",
    "system_prompt": "Always respond with valid JSON. No other text."
}
```
See Also