Overview

GEPA integrates with MLflow to provide experiment tracking and logging capabilities. The integration automatically logs metrics, parameters, and optimization progress to your MLflow tracking server.

Setup

Install MLflow:
pip install mlflow

Basic Usage

Enable MLflow tracking by setting use_mlflow=True in your optimization call:
import gepa

result = gepa.optimize(
    seed_candidate={"system_prompt": "You are a helpful assistant."},
    trainset=trainset,
    valset=valset,
    task_lm="openai/gpt-4o-mini",
    reflection_lm="openai/gpt-4o",
    max_metric_calls=100,
    use_mlflow=True,  # Enable MLflow tracking
)

Configuration Options

The gepa.optimize() function provides several MLflow-specific parameters:

use_mlflow

  • Type: bool
  • Default: False
  • Description: Enable MLflow experiment tracking

mlflow_tracking_uri

  • Type: str | None
  • Default: None
  • Description: URI of the MLflow tracking server. If not specified, MLflow uses the default tracking URI.
result = gepa.optimize(
    ...,
    use_mlflow=True,
    mlflow_tracking_uri="http://localhost:5000",
)

mlflow_experiment_name

  • Type: str | None
  • Default: None
  • Description: Name of the MLflow experiment. If not specified, logs are saved to the default experiment.
result = gepa.optimize(
    ...,
    use_mlflow=True,
    mlflow_experiment_name="prompt-optimization-experiment",
)

Complete Example

import gepa

# Sample data
trainset = [
    {"question": "What is 2+2?", "answer": "4"},
    {"question": "What is 5*3?", "answer": "15"},
]

valset = [
    {"question": "What is 10-3?", "answer": "7"},
]

# Optimize with MLflow tracking
result = gepa.optimize(
    seed_candidate={
        "system_prompt": "You are a math tutor. Answer questions clearly."
    },
    trainset=trainset,
    valset=valset,
    task_lm="openai/gpt-4o-mini",
    reflection_lm="openai/gpt-4o",
    max_metric_calls=50,
    use_mlflow=True,
    mlflow_tracking_uri="http://localhost:5000",
    mlflow_experiment_name="math-tutor-optimization",
)

print("Best prompt:", result.best_candidate["system_prompt"])

Logged Metrics

GEPA automatically logs the following metrics to MLflow during optimization:
  • Validation scores: Performance on the validation set
  • Training scores: Performance on training minibatches
  • Iteration count: Current optimization iteration
  • Metric calls: Number of evaluations performed
  • Best score: Highest validation score achieved

Combined Logging

You can use both MLflow and Weights & Biases simultaneously:
result = gepa.optimize(
    ...,
    use_mlflow=True,
    mlflow_experiment_name="my-experiment",
    use_wandb=True,
    wandb_init_kwargs={"project": "my-project"},
)

MLflow UI

View your experiments in the MLflow UI:
mlflow ui --port 5000
Then navigate to http://localhost:5000 to explore your optimization runs, compare experiments, and analyze metrics.

External Resources

MLflow Prompt Optimization Guide

Official MLflow documentation for prompt optimization with GEPA

MLflow Tracking Documentation

Learn more about MLflow experiment tracking
