What are GEPA Adapters?

GEPA connects to your system through the GEPAAdapter interface, which lets it optimize any system whose behavior is controlled by textual parameters. An adapter bridges GEPA's evolutionary optimization engine and your task-specific implementation.

Core Concept

Adapters implement three key responsibilities:
  1. Program Construction & Evaluation (evaluate): Execute your system with candidate parameters and return scores
  2. Reflective Dataset Construction (make_reflective_dataset): Transform execution traces into training data for improvement
  3. Optional Custom Proposal (propose_new_texts): Override default instruction proposal logic

GEPAAdapter Protocol

The adapter protocol is defined in src/gepa/core/adapter.py:58:
class GEPAAdapter(Protocol[DataInst, Trajectory, RolloutOutput]):
    def evaluate(
        self,
        batch: list[DataInst],
        candidate: dict[str, str],
        capture_traces: bool = False,
    ) -> EvaluationBatch[Trajectory, RolloutOutput]:
        ...
    
    def make_reflective_dataset(
        self,
        candidate: dict[str, str],
        eval_batch: EvaluationBatch[Trajectory, RolloutOutput],
        components_to_update: list[str],
    ) -> Mapping[str, Sequence[Mapping[str, Any]]]:
        ...
    
    propose_new_texts: ProposalFn | None = None

Type Parameters

  • DataInst: User-defined input data type (e.g., questions, documents)
  • Trajectory: Execution trace capturing program steps
  • RolloutOutput: Final output from program execution

Key Concepts

Candidate

A dict[str, str] mapping component names to their text:
candidate = {
    "system_prompt": "You are a helpful assistant...",
    "tool_description": "Read file contents from disk"
}

Scores

Higher is better. GEPA uses:
  • Minibatch: sum(scores) for acceptance decisions
  • Full validation set: mean(scores) for tracking and Pareto fronts
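The two aggregations above can be computed directly from a batch's per-example scores:

```python
scores = [0.0, 1.0, 0.5, 1.0]  # per-example scores from one evaluation batch

# Minibatch objective: total score, used for accept/reject decisions.
minibatch_objective = sum(scores)

# Validation metric: mean score, used for tracking and Pareto fronts.
validation_metric = sum(scores) / len(scores)
```

Because both are simple aggregates, per-example scores must be on a scale where summation is meaningful (e.g., 0.0 to 1.0 per example).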

Trajectories

Trajectories are opaque to GEPA; only your make_reflective_dataset implementation consumes them. They should capture:
  • Inputs to each component
  • Generated outputs
  • Error messages or diagnostic info
  • Any context needed for feedback generation
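Since GEPA never inspects trajectories, any structure your `make_reflective_dataset` understands will do. One possible shape (the field names here are illustrative, not required by GEPA):

```python
# A hypothetical per-example trajectory covering the points above:
# component inputs, generated output, error info, and extra context.
trajectory = {
    "component_inputs": {"system_prompt": "You are a helpful assistant..."},
    "output": "The answer is 42.",
    "error": None,                                   # or an error message string
    "notes": "retrieved 3 documents before answering",
}
```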

Error Handling

Never raise for individual example failures:
  • Return valid EvaluationBatch with failure scores (e.g., 0.0)
  • Populate trajectories with error details when possible
  • Reserve exceptions for systemic failures (missing model, schema mismatch)
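A minimal sketch of this policy for a single example, assuming a user-supplied `run_system` function (not part of the GEPA API) and a placeholder success score:

```python
def evaluate_one(item, candidate, run_system):
    """Score one example without letting its failure abort the batch.

    run_system is a stand-in for your own execution logic; the success
    score of 1.0 is a placeholder for your real metric.
    """
    try:
        output = run_system(item, candidate)
        return output, 1.0, {"input": item, "output": output, "error": None}
    except Exception as exc:
        # Per-example failure: return a failure score and record the
        # error in the trajectory instead of raising.
        return None, 0.0, {"input": item, "output": None, "error": str(exc)}

def broken(item, candidate):
    raise ValueError("model unavailable for this input")

out, score, traj = evaluate_one("x", {}, broken)
```

A systemic problem (missing model, schema mismatch) should still raise before this loop ever runs.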

Built-in Adapters

DefaultAdapter

Single-turn LLM tasks with system prompt optimization

DSPy Adapter

Optimize DSPy module signature instructions

DSPy Full Program

Evolve entire DSPy programs including structure

Generic RAG

Vector store-agnostic RAG optimization

MCP Adapter

Optimize Model Context Protocol tool usage

TerminalBench

Optimize Terminus terminal-use agents

AnyMaths

Mathematical problem-solving optimization

Creating Custom Adapters

To create your own adapter:
  1. Define your DataInst, Trajectory, and RolloutOutput types
  2. Implement evaluate() to execute your system
  3. Implement make_reflective_dataset() to generate feedback
  4. Optionally implement propose_new_texts() for custom proposal logic

Example Skeleton

from typing import TypedDict
from gepa.core.adapter import GEPAAdapter, EvaluationBatch

class MyDataInst(TypedDict):
    input: str
    expected_output: str

class MyTrajectory(TypedDict):
    input: str
    output: str
    score: float

class MyOutput(TypedDict):
    result: str

class MyAdapter(GEPAAdapter[MyDataInst, MyTrajectory, MyOutput]):
    def evaluate(self, batch, candidate, capture_traces=False):
        # Execute your system with candidate parameters.
        # my_system, compute_score, and generate_feedback are placeholders
        # for your own execution, scoring, and feedback logic.
        outputs = []
        scores = []
        trajectories = [] if capture_traces else None
        
        for item in batch:
            # Run your system
            result = my_system.run(item['input'], candidate)
            score = compute_score(result, item['expected_output'])
            
            outputs.append({'result': result})
            scores.append(score)
            
            if capture_traces:
                trajectories.append({
                    'input': item['input'],
                    'output': result,
                    'score': score
                })
        
        return EvaluationBatch(
            outputs=outputs,
            scores=scores,
            trajectories=trajectories
        )
    
    def make_reflective_dataset(self, candidate, eval_batch, components_to_update):
        reflective_data = {}
        
        for component in components_to_update:
            examples = []
            for traj, score in zip(eval_batch.trajectories, eval_batch.scores):
                feedback = generate_feedback(traj, score)
                examples.append({
                    'Inputs': traj['input'],
                    'Generated Outputs': traj['output'],
                    'Feedback': feedback
                })
            reflective_data[component] = examples
        
        return reflective_data
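Step 4, a custom propose_new_texts, is not shown in the skeleton. The sketch below illustrates the general idea only; the argument names are assumptions for illustration, and the authoritative ProposalFn signature is defined in gepa.core.adapter. A real implementation would prompt a reflection LM rather than string-append.

```python
# Hypothetical custom proposal function. Argument names are assumed for
# illustration; check ProposalFn in gepa.core.adapter for the real shape.
def propose_new_texts(candidate, reflective_dataset, components_to_update):
    new_texts = {}
    for component in components_to_update:
        examples = reflective_dataset.get(component, [])
        # Placeholder for a reflection-LM call: here we just annotate the
        # existing text with how many feedback examples were reviewed.
        new_texts[component] = (
            candidate[component]
            + f"\n# revised after reviewing {len(examples)} example(s)"
        )
    return new_texts

updated = propose_new_texts(
    {"system_prompt": "You are a helpful assistant."},
    {"system_prompt": [{"Feedback": "be more concise"}]},
    ["system_prompt"],
)
```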

Best Practices

Evaluation

  • Keep the capture_traces=False path fast; it runs frequently during optimization
  • Make trajectories comprehensive when capture_traces=True
  • Return per-example scores that can be aggregated via summation
  • Use consistent failure scores (e.g., 0.0) for failed examples

Reflective Datasets

  • Keep examples concise but informative
  • Include both successful and failed cases
  • Provide actionable feedback mentioning specific issues
  • Use deterministic subsampling (seeded RNG) for reproducibility
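Deterministic subsampling is straightforward with a locally seeded RNG, so repeated runs build the same reflective dataset:

```python
import random

def subsample(examples, k, seed=0):
    """Deterministically pick at most k reflective examples.

    A seeded local Random instance keeps results reproducible without
    touching the global RNG state.
    """
    if len(examples) <= k:
        return list(examples)
    return random.Random(seed).sample(examples, k)

picked = subsample(list(range(100)), 5, seed=42)
```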

Performance

  • Batch API calls when possible
  • Cache expensive operations (embeddings, model loading)
  • Use parallel execution for independent examples
  • Avoid unnecessary deep copies
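For independent examples, a thread pool is a simple way to parallelize evaluation while keeping results aligned with the input batch. Here `run_one` stands in for your per-example scoring function (an assumption, not part of the GEPA API):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_parallel(batch, run_one, max_workers=8):
    """Score independent examples concurrently.

    ThreadPoolExecutor.map preserves input order, so the returned list
    stays index-aligned with batch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, batch))

scores = evaluate_parallel([1, 2, 3], lambda x: float(x * x))
```

Threads suit I/O-bound work such as remote LLM calls; for CPU-bound scoring, a process pool may be a better fit.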

Integration with GEPA

Use your adapter with gepa.optimize():
import gepa

result = gepa.optimize(
    seed_candidate={'system_prompt': 'Initial prompt...'},
    trainset=train_data,
    valset=val_data,
    adapter=MyAdapter(),
    max_metric_calls=150,
    reflection_lm='openai/gpt-4'
)

print(result.best_candidate)

Reference Implementation

See the DSPy adapter for a comprehensive reference implementation supporting:
  • Multiple predictors
  • Tool optimization
  • Custom proposal logic
  • Complex trace handling
