The Gradient Analogy
In numerical optimization, gradients tell us which direction to move:

```python
# Neural network training
for x, y in dataloader:
    pred = model(x)
    loss = criterion(pred, y)

    # Gradient: direction of steepest descent
    loss.backward()  # Computes ∂loss/∂weights

    # Update: move weights in direction of negative gradient
    optimizer.step()  # weights -= lr * gradient
```
The gradient is actionable: it directly specifies how to improve the parameters.
But text parameters have no gradient. Prompts, code, and instructions are discrete, symbolic objects. You can’t compute ∂performance/∂prompt.
ASI is the text-optimization analogue of the gradient. Instead of numerical derivatives, ASI provides diagnostic feedback that tells an LLM:
Why a candidate failed
What went wrong
How to fix it
Traditional optimizers receive only scalar feedback:

```python
# Traditional optimizer view
candidate = "Some text parameter"
score = evaluate(candidate)  # Returns: 0.45

# We know the score is bad, but...
# - Why did it fail?
# - Which examples were wrong?
# - What errors occurred?
# - How should we modify the text?
```
GEPA evaluators return rich diagnostic information:

```python
# GEPA evaluator with ASI
import gepa.optimize_anything as oa

def evaluate(candidate, example):
    result = run_system(candidate, example)
    score = compute_score(result)

    # ASI: structured diagnostic feedback
    side_info = {
        "Input": example["question"],
        "Output": result.answer,
        "Expected": example["correct_answer"],
        "Reasoning": result.reasoning_trace,
        "Error": result.error_message if result.failed else None,
    }
    return score, side_info
```
The LLM reads this ASI and proposes targeted improvements:
LLM sees:

```text
Input: "What is 2+2?"
Output: "The answer is 5"
Expected: "4"
Error: "Basic arithmetic failure"
```

LLM proposes:

```text
Add instruction: "Double-check arithmetic calculations
using the following format: 2 + 2 = 4"
```
ASI as the “Why” Signal
Traditional optimizers know that something failed. GEPA knows why:

| Traditional | GEPA (with ASI) |
| --- | --- |
| score = 0.0 | "Compilation error: undefined variable 'tmp'" |
| score = 0.3 | "Output format wrong: expected JSON, got plain text" |
| score = 0.6 | "Correct on simple cases, fails on negative inputs" |
| score = 0.9 | "Nearly perfect; minor edge case: empty list handling" |
This transforms optimization from random search to guided refinement.
ASI Structure
ASI is represented as a dictionary with two categories of information:
1. Scores (Optional)
Multi-objective metrics for Pareto tracking:
```python
side_info = {
    "scores": {
        "accuracy": 0.85,
        "latency_inv": 12.5,  # Higher is better (1/latency)
        "cost_inv": 8.3,
    },
    # ... contextual fields below
}
```
All scores must follow the “higher is better” convention. Invert metrics like latency or error rate.
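To keep a mixed set of metrics on that convention, a small helper can do the inversion (illustrative only; `invert_metric` is not a GEPA API):

```python
# Hypothetical helper: map a lower-is-better metric (latency, cost,
# error rate) onto the higher-is-better scale that Pareto tracking expects.
def invert_metric(value: float, eps: float = 1e-9) -> float:
    return 1.0 / (value + eps)  # eps guards against division by zero

scores = {
    "accuracy": 0.85,                    # already higher-is-better
    "latency_inv": invert_metric(0.08),  # 80 ms latency -> 12.5
    "cost_inv": invert_metric(0.12),     # $0.12 per call -> ~8.3
}
```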
2. Contextual Fields
Diagnostic feedback that explains performance:
Common conventions:
```python
side_info = {
    # What went in
    "Input": example["question"],
    "Context": example["background_knowledge"],

    # What came out
    "Output": result.generated_text,
    "Expected": example["correct_answer"],

    # Why it failed/succeeded
    "Feedback": qualitative_assessment(result),
    "Error": traceback if exception else None,

    # Intermediate states
    "Reasoning": result.chain_of_thought,
    "Tool Calls": result.function_invocations,
    "Profiling": result.execution_metrics,
}
```
Best practices:
Be specific: “Expected 42, got 17” beats “Wrong answer”
Include errors prominently: Tracebacks, compiler messages, validation failures
Show intermediate steps: Reasoning chains, tool outputs, state transitions
Add context: What was the input? What should have happened?
Use consistent field names: Pick a convention and stick to it
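A small helper can enforce these practices mechanically. The sketch below (with a hypothetical `build_side_info`; the field names are one possible convention) assembles ASI the same way for every example:

```python
def build_side_info(example, result, error=None):
    """Hypothetical helper applying the best practices above:
    consistent field names, specific feedback, prominent errors."""
    info = {
        "Input": example["input"],        # add context: what went in
        "Output": result["output"],
        "Expected": example["expected"],
        # Be specific: state both values, not just "wrong answer"
        "Feedback": f"Expected {example['expected']!r}, got {result['output']!r}",
    }
    if error is not None:
        info["Error"] = repr(error)       # include errors prominently
    return info
```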
Providing ASI: Three Methods
Method 1: Return Tuple from Evaluator
Explicitly return (score, side_info):
```python
def evaluate(candidate, example):
    result = run_code(candidate["code"], example["input"])
    score = 1.0 if result.output == example["expected"] else 0.0

    side_info = {
        "Input": example["input"],
        "Output": result.output,
        "Expected": example["expected"],
        "Execution Time": result.time_ms,
        "Memory Used": result.memory_mb,
    }
    if result.error:
        side_info["Error"] = str(result.error)

    return score, side_info
```
Method 2: Use oa.log()
Log diagnostics imperatively during evaluation:
```python
import gepa.optimize_anything as oa

def evaluate(candidate, example):
    oa.log(f"Input: {example['question']}")

    result = run_system(candidate, example)
    oa.log(f"Output: {result.answer}")
    oa.log(f"Expected: {example['correct_answer']}")

    if result.error:
        oa.log(f"ERROR: {result.error}")

    score = compute_score(result, example)
    return score  # side_info automatically captured under "log" key
```
Captured output:
```python
side_info = {
    "log": """
    Input: What is 2+2?
    Output: The answer is 5
    Expected: 4
    ERROR: Basic arithmetic failure
    """
}
```
oa.log() is thread-safe and works from child threads when properly propagated via oa.get_log_context() / oa.set_log_context().
Method 3: Automatic stdio Capture
Capture all print() statements automatically:
```python
config = GEPAConfig(
    engine=EngineConfig(
        capture_stdio=True  # Enable automatic capture
    )
)

def evaluate(candidate, example):
    print(f"Testing on: {example['input']}")  # Captured to side_info["stdout"]
    result = run_system(candidate, example)
    print(f"Result: {result.output}")  # Also captured
    return compute_score(result)
```
Captured output:
```python
side_info = {
    "stdout": "Testing on: input_1\nResult: output_xyz\n",
    "stderr": ""  # Empty if no stderr
}
```
capture_stdio=True captures Python-level output only (sys.stdout/sys.stderr). C extensions or subprocesses that write directly to file descriptors require manual capture via oa.log().
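If your evaluator shells out, one workaround is to relay the child's output at the Python level. The sketch below uses `print()` for illustration; inside a real evaluator you could call `oa.log()` instead. `run_and_relay` is a hypothetical helper, not a GEPA API:

```python
import subprocess
import sys

def run_and_relay(cmd):
    """Run a subprocess and re-emit its output at the Python level,
    where capture_stdio (or oa.log) can see it."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.stdout:
        print(proc.stdout, end="")                   # Python-level write -> captured
    if proc.stderr:
        print(proc.stderr, end="", file=sys.stderr)  # likewise for stderr
    return proc.returncode

# Example: the child writes directly to fd 1; the relay makes it visible
exit_code = run_and_relay([sys.executable, "-c", "print('from child')"])
```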
When optimizing multiple parameters, provide targeted feedback for each:
```python
def evaluate(candidate, example):
    # candidate = {"system_prompt": "...", "few_shot_examples": "..."}
    result = run_system(candidate, example)
    score = compute_score(result)

    side_info = {
        # Top-level feedback (applies to all parameters)
        "Input": example["question"],
        "Output": result.answer,
        "Expected": example["correct_answer"],

        # System prompt-specific feedback
        "system_prompt_specific_info": {
            "scores": {"instruction_following": 0.7},
            "Analysis": "Prompt too vague about output format",
            "Suggestion": "Add explicit formatting instructions"
        },

        # Few-shot example-specific feedback
        "few_shot_examples_specific_info": {
            "scores": {"coverage": 0.4},
            "Analysis": "No examples with negative numbers",
            "Suggestion": "Add edge case examples"
        }
    }
    return score, side_info
```
During reflection on parameter X, GEPA merges:
Top-level fields (generic feedback)
X_specific_info fields (targeted feedback)
This gives the reflection LM both general context and parameter-specific diagnostics.
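The merge can be pictured with a short sketch (illustrative only; GEPA's actual merge logic may differ in detail):

```python
def asi_for_parameter(side_info, param):
    """Sketch of reflection-time merging: keep generic top-level fields,
    drop other parameters' feedback, overlay this parameter's feedback."""
    merged = {
        k: v for k, v in side_info.items()
        if not k.endswith("_specific_info")          # drop all targeted blocks
    }
    merged.update(side_info.get(f"{param}_specific_info", {}))
    return merged

side_info = {
    "Input": "What is 2+2?",
    "system_prompt_specific_info": {"Analysis": "Prompt too vague"},
    "few_shot_examples_specific_info": {"Analysis": "No negative examples"},
}
merged = asi_for_parameter(side_info, "system_prompt")
# merged keeps "Input" plus only the system-prompt analysis
```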
Visual ASI with Images
For visual tasks (rendering, charts, UI), include images in ASI:
```python
from io import BytesIO

import matplotlib.pyplot as plt
import PIL.Image

from gepa import Image

def evaluate(candidate, example):
    # Generate SVG or render output
    output_svg = render_svg(candidate["svg_code"])
    expected_svg = example["target_svg"]

    # Create comparison visualization
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.imshow(parse_svg(output_svg))
    ax1.set_title("Generated")
    ax2.imshow(parse_svg(expected_svg))
    ax2.set_title("Expected")

    # Convert to PIL Image
    buf = BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    buf.seek(0)
    comparison_image = PIL.Image.open(buf)

    score = compute_visual_similarity(output_svg, expected_svg)
    return score, {
        "Input": example["description"],
        "Comparison": Image(comparison_image),  # VLM will see this
        "Feedback": "Colors match, but proportions are off"
    }
```
Image ASI requires a vision-language model (VLM) as the reflection LM. Set reflection_lm="openai/gpt-4o" or similar.
ASI vs Reward Shaping
In RL, reward shaping adds heuristic signals to guide learning:
```python
# RL reward shaping
reward = goal_achieved * 10.0     # Sparse signal
reward -= distance_to_goal * 0.1  # Dense guidance (closer is better)
reward -= collisions * 5.0        # Penalty
```
Problems with reward shaping:
Hard to design (requires domain expertise)
Brittle (small changes break learning)
Can introduce unintended incentives
Not interpretable (why was this action taken?)
ASI advantages:
Natural language diagnostics (easy to write)
Flexible (add/remove fields freely)
Interpretable (human-readable explanations)
Direct guidance (“fix this specific error”)
What Makes Good ASI?
ASI quality directly impacts optimization speed and quality:
✅ Good ASI
```python
side_info = {
    "Input": "Translate 'Hello world' to French",
    "Output": "Salut monde",
    "Expected": "Bonjour le monde",
    "Feedback": """
    Translation is too informal ('Salut' instead of 'Bonjour').
    Missing article 'le' before 'monde'.
    Context suggests formal register is appropriate.
    """,
    "Tone Score": 0.3,  # Quantify formality
}
```
Why it’s good:
Specific error identification
Explanation of why it’s wrong
Context for the correct choice
Quantified metric for tracking
❌ Bad ASI
```python
side_info = {
    "Output": "Salut monde",
    "Feedback": "Wrong translation"
}
```
Why it’s bad:
No explanation of why it’s wrong
Missing input/expected for context
No guidance on how to fix it
LLM has to guess the root cause
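A cheap guard against this failure mode is to lint ASI dicts for missing context fields before a run (a hypothetical check, not part of GEPA; the required set is one reasonable choice):

```python
# Fields the reflection LM needs so it never has to guess the root cause
REQUIRED_FIELDS = {"Input", "Output", "Expected"}

def missing_asi_fields(side_info):
    """Return the context fields absent from an ASI dict, sorted for stable output."""
    return sorted(REQUIRED_FIELDS - side_info.keys())

bad = {"Output": "Salut monde", "Feedback": "Wrong translation"}
print(missing_asi_fields(bad))  # ['Expected', 'Input']
```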
🎯 Excellent ASI
```python
side_info = {
    "Input": {"text": "Hello world", "context": "Business email"},
    "Output": "Salut monde",
    "Expected": "Bonjour le monde",
    "Error Analysis": """
    1. Register mismatch: 'Salut' is informal, context requires formal
    2. Grammatical error: Missing definite article 'le'
    3. Vocabulary: 'monde' alone is correct but incomplete
    """,
    "Correction Strategy": """
    - Use 'Bonjour' for formal contexts (vs 'Salut' for casual)
    - Always include articles: 'le monde' not just 'monde'
    - Check context field for register requirements
    """,
    "scores": {
        "grammatical_correctness": 0.5,
        "tone_appropriateness": 0.2,
        "fluency": 0.8
    },
    "Similar Failures": ["Example 3", "Example 7"],  # Pattern
}
```
Why it’s excellent:
Structured error breakdown
Explicit correction strategy
Multi-objective metrics
Pattern identification across examples
Common ASI Patterns
Coding Tasks
```python
side_info = {
    "Input": test_case,
    "Generated Code": candidate["code"],
    "Execution Result": result.output,
    "Expected Output": test_case["expected"],
    "Compilation Errors": result.compile_errors,
    "Runtime Errors": result.runtime_errors,
    "Test Status": "FAILED",
    "Profiling": {
        "time_ms": result.time_ms,
        "memory_mb": result.memory_mb,
    },
    "Coverage": result.line_coverage,
}
```
Agent Tasks
```python
side_info = {
    "User Query": example["query"],
    "Agent Trajectory": [
        {"action": "search", "input": "Paris weather", "output": "..."},
        {"action": "answer", "input": "...", "output": result.answer},
    ],
    "Final Answer": result.answer,
    "Expected Answer": example["correct_answer"],
    "Tool Errors": result.tool_errors,
    "Reasoning Quality": human_eval_score(result),
    "Efficiency": f"Used {len(result.actions)} actions (optimal: 2)",
}
```
Math/Reasoning Tasks
```python
side_info = {
    "Problem": example["question"],
    "Chain of Thought": result.reasoning,
    "Final Answer": result.answer,
    "Correct Answer": example["solution"],
    "Verification": verify_solution(result),
    "Error Type": classify_error(result, example),
    "Difficulty": example["difficulty_level"],
    "scores": {
        "correctness": 1.0 if result.answer == example["solution"] else 0.0,
        "reasoning_quality": rate_reasoning(result.reasoning),
    }
}
```
ASI in Practice: Example Flow
Iteration 1:
```python
# Candidate fails on 2/3 examples in minibatch
side_info = [
    {
        "Input": "2+2",
        "Output": "The answer is 5",
        "Expected": "4",
        "Error": "Basic arithmetic"
    },
    {
        "Input": "10*10",
        "Output": "100",
        "Expected": "100",
        "Feedback": "Correct!"
    },
    {
        "Input": "-3 + 5",
        "Output": "The answer is -8",
        "Expected": "2",
        "Error": "Wrong sign handling"
    }
]

# LLM reflects:
# "Two failures, both arithmetic. First: basic error (2+2=5).
# Second: negative number handling incorrect. Success on simple
# multiplication. Propose: Add step-by-step arithmetic verification."

# New candidate adds:
# "Before answering, verify: if addition, add left-to-right.
# For negative numbers, note the sign explicitly."
```
Iteration 2:
# New candidate tested on same minibatch
side_info = [
{ "Input" : "2+2" , "Output" : "4" , "Expected" : "4" , "Feedback" : "Fixed!" },
{ "Input" : "10*10" , "Output" : "100" , "Expected" : "100" , "Feedback" : "Still correct" },
{ "Input" : "-3 + 5" , "Output" : "2" , "Expected" : "2" , "Feedback" : "Fixed!" },
]
# Minibatch score: 0/3 → 3/3. Accepted!
# Full validation evaluation determines final score.
Next Steps
How GEPA Works: see how ASI fits into the full optimization loop
Reflective Evolution: understand how LLMs use ASI to propose improvements
Building Adapters: learn how to capture rich ASI in your adapter
Examples: see real ASI examples from GEPA use cases