GEPA (Genetic-Pareto) optimizes any system with textual parameters against any evaluation metric. From prompt optimization to agent architecture discovery, GEPA has demonstrated significant improvements across production systems at companies like Databricks, Shopify, Dropbox, and OpenAI.

Why GEPA?

If you can measure it, you can optimize it: prompts, code, agent architectures, scheduling policies, vector graphics, and more. Unlike RL or gradient-based methods that collapse execution traces into a single scalar reward, GEPA uses LLMs to read full execution traces — error messages, profiling data, reasoning logs — to diagnose why a candidate failed and propose targeted fixes.
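The reflective loop described above can be sketched in miniature. Everything here is illustrative, not the gepa library's API: `run_candidate` and `reflect` are stand-ins for executing the real system and calling a reflection LLM on its traces, and the toy "missing units" failure is invented for the example.

```python
# Toy sketch of reflective prompt evolution: read textual traces from failed
# runs, then propose a targeted prompt edit. A real run would execute the
# actual system and have an LLM do the reflection step.

def run_candidate(prompt: str, example: dict) -> dict:
    """Toy task: fails with a readable trace when units are required but missing."""
    passed = ("units" in prompt) or (not example["needs_units"])
    trace = "" if passed else "error: answer omitted measurement units"
    return {"score": 1.0 if passed else 0.0, "trace": trace}

def reflect(prompt: str, traces: list[str]) -> str:
    """Stand-in for an LLM that reads full traces and proposes a targeted fix."""
    if any("units" in t for t in traces):
        return prompt + " Always state measurement units."
    return prompt

trainset = [{"needs_units": True}, {"needs_units": False}]
prompt = "Answer the question."
for _ in range(3):  # reflective evolution loop
    results = [run_candidate(prompt, ex) for ex in trainset]
    failures = [r["trace"] for r in results if r["trace"]]
    if not failures:
        break
    prompt = reflect(prompt, failures)  # traces in, targeted prompt edit out

print(prompt)  # -> "Answer the question. Always state measurement units."
```

The point of the sketch: the error message itself (not a scalar reward) is what drives the edit, which is why one failed rollout can teach the candidate a durable lesson.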

Key Results at a Glance

  • 90x cheaper: open-source models + GEPA beat Claude Opus 4.1 at Databricks
  • 35x faster than RL: 100–500 evaluations vs. 5,000–25,000+ for GRPO
  • 32% → 89%: ARC-AGI agent accuracy via architecture discovery
  • 40.2% cost savings: cloud scheduling policy discovered by GEPA, beating expert heuristics
  • 55% → 82%: coding agent resolve rate on Jinja via auto-learned skills
  • 50+ production uses: across Shopify, Databricks, Dropbox, OpenAI, Pydantic, MLflow, Comet ML

Use Case Categories

  • Prompt Optimization: improve LLM accuracy through reflective prompt evolution (46.6% → 56.6% on AIME with GPT-4.1 Mini)
  • Code Optimization: generate and optimize code, including CUDA kernels, scheduling policies, and system algorithms
  • Agent Architecture: discover agent designs through evolutionary search (nearly triple the accuracy on ARC-AGI)
  • RAG Optimization: tune retrieval pipelines, reranking strategies, and query generation for better context

When GEPA Shines

  • Costly evaluations: scientific simulations, complex agents with tool calls, slow compilation. GEPA needs 100–500 evals vs. 10K+ for RL.
  • Tiny datasets: works with as few as 3 examples; no large training sets required.
  • No weights access needed: optimize GPT-5, Claude, or Gemini directly through their APIs.
  • Interpretable: human-readable optimization traces show why each prompt changed.
  • Complementary to RL: use GEPA for rapid initial optimization, then apply RL/fine-tuning for additional gains.

Real-World Impact

“Both DSPy and (especially) GEPA are currently severely under hyped in the AI context engineering world”
– Tobi Lutke, CEO, Shopify

Production Deployments

GEPA is used in production at:
  • Databricks: 90x cost reduction with enterprise agents
  • Shopify: Optimizing AI context engineering
  • Dropbox: Production AI systems
  • OpenAI: Featured in official cookbook for self-evolving agents
  • Pydantic: Contact extraction improved from 86% → 97%
  • MLflow: Integrated as mlflow.genai.optimize_prompts()
  • Comet ML: Core algorithm in Opik Agent Optimizer

Research Impact

  • 35x faster than RL: 100–500 evaluations vs. 5,000–25,000+ for GRPO (paper)
  • State-of-the-art results: MATH benchmark 67% → 93% with DSPy Full Program optimization
  • Sample efficiency: Works with minimal data where RL requires thousands of examples

Getting Started

Ready to optimize your AI system? Choose your use case:

  • Quick Start: get up and running with GEPA in minutes
  • Adapters Guide: learn how to integrate GEPA with your system
  • API Reference: complete API documentation
  • Community: join our Discord community

Optimization Modes

GEPA supports three optimization paradigms:
  1. Single-Task Search: Solve one hard problem (e.g., circle packing, blackbox optimization)
  2. Multi-Task Search: Solve a batch of related problems with cross-transfer (e.g., CUDA kernels)
  3. Generalization: Build a skill that transfers to unseen problems (e.g., prompt optimization, agent architecture)
The same API works across all modes — learn more in each use case section.
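The "Pareto" half of Genetic-Pareto can also be sketched in a few lines. This is a simplified illustration, assuming per-task scores have already been collected; the candidate names and numbers are made up, and the library's actual selection involves more machinery (dominance filtering, stochastic sampling).

```python
# Toy sketch of Pareto-based candidate selection: keep every candidate that is
# best on at least one task, rather than only the single best average scorer.
# This preserves specialists whose strengths would vanish under averaging.

def pareto_frontier(scores: dict[str, list[float]]) -> set[str]:
    """scores maps candidate name -> per-task scores, tasks in a fixed order."""
    n_tasks = len(next(iter(scores.values())))
    frontier: set[str] = set()
    for t in range(n_tasks):
        best = max(s[t] for s in scores.values())
        frontier |= {name for name, s in scores.items() if s[t] == best}
    return frontier

scores = {
    "cand_a": [0.9, 0.2, 0.4],  # wins task 0
    "cand_b": [0.5, 0.8, 0.4],  # wins task 1, ties task 2
    "cand_c": [0.4, 0.3, 0.3],  # best on no task
}
print(sorted(pareto_frontier(scores)))  # -> ['cand_a', 'cand_b']
```

Note that `cand_a` has the same average score as `cand_b` only by coincidence; what keeps both alive is that each is the best somewhere, which is what lets lessons learned on one task cross-transfer to others.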
