Overview

GEPA provides flexible logging capabilities for tracking optimization runs, including:
  • File-based logging with the Logger class
  • Experiment tracking with W&B and MLflow via ExperimentTracker
  • Detailed metrics logging utilities

File Logging

Logger Class

The Logger class captures stdout and stderr to log files during optimization.
from gepa.logging.logger import Logger

with Logger("run_log.txt") as logger:
    logger.log("Starting optimization...")
    # Your optimization code here
Parameters:
  • filename (str, required): Path to the log file. A separate stderr log is created automatically.
  • mode (str, default "a"): File open mode ('a' for append, 'w' for write).

Logger Methods

log

Logs a message to the file and optionally to stdout.
logger.log("Iteration 1 complete", "Score:", 0.85)
The log method accepts the same arguments as Python’s print() function.

Context Manager

The Logger class works as a context manager, automatically redirecting stdout and stderr:
with Logger("optimization.log") as logger:
    print("This will be logged")  # Goes to both console and file
    logger.log("Explicit log message")
When used as a context manager:
  • stdout is redirected to the log file you passed in (e.g. run_log.txt)
  • stderr is redirected to a companion file (e.g. run_log_stderr.txt)
  • Both streams remain visible in the terminal via the Tee class
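Under the hood, this kind of redirection relies on a tee-style stream that forwards every write to two destinations. A minimal sketch of the idea (simplified, not GEPA's actual Tee class):

```python
import sys


class Tee:
    """Write-through stream: forwards every write to a file and a terminal stream."""

    def __init__(self, file_handle, terminal_stream):
        self.file = file_handle
        self.terminal = terminal_stream

    def write(self, text):
        self.file.write(text)
        self.terminal.write(text)

    def flush(self):
        self.file.flush()
        self.terminal.flush()


# Swap sys.stdout for a Tee so prints land in both places.
with open("run_log.txt", "a") as f:
    original_stdout = sys.stdout
    sys.stdout = Tee(f, original_stdout)
    try:
        print("visible in the terminal and in run_log.txt")
    finally:
        sys.stdout = original_stdout
```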

StdOutLogger

For simple console-only logging:
from gepa.logging.logger import StdOutLogger

logger = StdOutLogger()
logger.log("Message to stdout")
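Because both loggers expose a print()-style log method, any object with a compatible method can stand in for them, e.g. to route GEPA output into an existing logging setup. A hypothetical in-memory logger:

```python
class ListLogger:
    """Hypothetical logger: collects messages in memory via a print()-style log()."""

    def __init__(self):
        self.messages = []

    def log(self, *args):
        # Mirror print()'s behavior of joining arguments with spaces.
        self.messages.append(" ".join(str(a) for a in args))


logger = ListLogger()
logger.log("Iteration 1 complete", "Score:", 0.85)
# logger.messages == ["Iteration 1 complete Score: 0.85"]
```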

Experiment Tracking

ExperimentTracker Class

The ExperimentTracker provides unified experiment tracking supporting both W&B and MLflow.
from gepa.logging.experiment_tracker import ExperimentTracker

tracker = ExperimentTracker(
    use_wandb=True,
    wandb_api_key="your-api-key",
    wandb_init_kwargs={"project": "my-project", "name": "run-1"},
)

with tracker:
    tracker.log_metrics({"score": 0.95, "iteration": 10}, step=10)

Constructor Parameters

  • use_wandb (bool, default False): Enable Weights & Biases tracking.
  • wandb_api_key (str | None, default None): W&B API key; if not set, the key is read from the environment or W&B prompts for login.
  • wandb_init_kwargs (dict[str, Any] | None, default None): Additional arguments passed to wandb.init().
  • use_mlflow (bool, default False): Enable MLflow tracking.
  • mlflow_tracking_uri (str | None, default None): MLflow tracking server URI.
  • mlflow_experiment_name (str | None, default None): MLflow experiment name.

Methods

initialize

Initializes the logging backends.
tracker.initialize()
Called automatically when the tracker is used as a context manager.

start_run

Starts a new tracking run.
tracker.start_run()

log_metrics

Logs metrics to the active backends.
tracker.log_metrics(
    {"train_score": 0.85, "val_score": 0.82},
    step=5
)
  • metrics (dict[str, Any], required): Dictionary of metric names and values.
  • step (int | None, default None): Optional step number for the metrics.
For MLflow, only numeric values (int/float) are logged. Non-numeric values are filtered out automatically.
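That filtering behaves roughly like the snippet below (a sketch of the described behavior with a hypothetical helper name, not the library's actual code):

```python
def filter_numeric_metrics(metrics):
    """Keep only values MLflow can record as metrics (ints and floats)."""
    return {k: v for k, v in metrics.items() if isinstance(v, (int, float))}


filter_numeric_metrics({"score": 0.95, "iteration": 10, "status": "running"})
# -> {'score': 0.95, 'iteration': 10}
```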

end_run

Ends the current tracking run.
tracker.end_run()
Called automatically when the context manager exits.

is_active

Checks if any backend has an active run.
if tracker.is_active():
    tracker.log_metrics({"status": "running"})

Using with optimize()

GEPA’s optimize() function has built-in experiment tracking support:
from gepa import optimize

result = optimize(
    seed_candidate={"instructions": "..."},
    trainset=train_data,
    valset=val_data,
    # W&B tracking
    use_wandb=True,
    wandb_api_key="your-key",
    wandb_init_kwargs={
        "project": "gepa-optimization",
        "name": "experiment-1",
        "tags": ["math", "prompt-opt"],
    },
    # MLflow tracking
    use_mlflow=True,
    mlflow_tracking_uri="http://localhost:5000",
    mlflow_experiment_name="prompt-optimization",
)

create_experiment_tracker

Factory function for creating experiment trackers:
from gepa.logging.experiment_tracker import create_experiment_tracker

tracker = create_experiment_tracker(
    use_wandb=True,
    wandb_init_kwargs={"project": "my-project"},
    use_mlflow=True,
    mlflow_tracking_uri="sqlite:///mlflow.db",
)

Detailed Metrics Logging

log_detailed_metrics_after_discovering_new_program

Utility function for logging comprehensive metrics when a new candidate is discovered.
from gepa.logging.utils import log_detailed_metrics_after_discovering_new_program

log_detailed_metrics_after_discovering_new_program(
    logger=logger,
    gepa_state=state,
    new_program_idx=5,
    valset_evaluation=val_eval,
    objective_scores=scores,
    experiment_tracker=tracker,
    linear_pareto_front_program_idx=3,
    valset_size=100,
    val_evaluation_policy=eval_policy,
    log_individual_valset_scores_and_programs=True,
)
This function logs:
  • Validation set scores and coverage
  • Pareto front information
  • Best program metrics
  • Individual validation scores (if enabled)
  • Multi-objective scores (if available)
  • logger (LoggerProtocol, required): Logger instance for output.
  • gepa_state (GEPAState, required): Current optimization state.
  • new_program_idx (int, required): Index of the newly discovered program.
  • valset_evaluation (ValsetEvaluation, required): Validation set evaluation results.
  • objective_scores (dict, required): Objective scores for the program.
  • experiment_tracker (ExperimentTracker, required): Experiment tracker for metrics logging.
  • linear_pareto_front_program_idx (int, required): Index of the linear Pareto front program.
  • valset_size (int, required): Total validation set size.
  • val_evaluation_policy (EvaluationPolicy, required): Validation evaluation policy.
  • log_individual_valset_scores_and_programs (bool, default False): Whether to log individual scores per validation example.

Logged Metrics

The function logs the following metrics to the experiment tracker:
  • iteration: Current iteration number
  • new_program_idx: Index of new program
  • valset_pareto_front_agg: Pareto front aggregate score
  • valset_pareto_front_programs: Programs on Pareto front
  • best_valset_agg_score: Best aggregate validation score
  • linear_pareto_front_program_idx: Linear Pareto front index
  • best_program_as_per_agg_score_valset: Best program index
  • best_score_on_valset: Best validation score
  • val_evaluated_count_new_program: Validation examples evaluated
  • val_total_count: Total validation set size
  • val_program_average: Average validation score
  • total_metric_calls: Total metric evaluations
  • objective_scores_new_program: Multi-objective scores (if available)
  • objective_pareto_front_scores: Objective Pareto front (if available)

Example: Complete Logging Setup

from gepa import optimize
from gepa.logging.logger import Logger

with Logger("optimization_run.log") as logger:
    logger.log("Starting GEPA optimization")
    
    result = optimize(
        seed_candidate={"instructions": "Solve the problem step by step."},
        trainset=train_data,
        valset=val_data,
        logger=logger,
        run_dir="./runs/experiment-1",
        # W&B tracking
        use_wandb=True,
        wandb_init_kwargs={
            "project": "gepa-math",
            "name": "gsm8k-optimization",
            "config": {
                "dataset": "gsm8k",
                "train_size": 100,
                "val_size": 50,
            },
        },
        # MLflow tracking
        use_mlflow=True,
        mlflow_experiment_name="math-optimization",
    )
    
    logger.log(f"Best score: {result.best_score}")
    logger.log(f"Total iterations: {result.total_iterations}")

Multi-Backend Tracking

You can use both W&B and MLflow simultaneously:
tracker = ExperimentTracker(
    use_wandb=True,
    wandb_init_kwargs={"project": "my-project"},
    use_mlflow=True,
    mlflow_tracking_uri="http://localhost:5000",
    mlflow_experiment_name="my-experiment",
)

with tracker:
    # Metrics will be logged to both W&B and MLflow
    tracker.log_metrics({"score": 0.95}, step=10)

Source Reference

The logging system is implemented in the gepa.logging module: gepa.logging.logger (Logger, StdOutLogger), gepa.logging.experiment_tracker (ExperimentTracker, create_experiment_tracker), and gepa.logging.utils (log_detailed_metrics_after_discovering_new_program).
