Pareto frontier analysis helps you identify the set of non-dominated solutions - model variants where you cannot improve one metric (like accuracy) without degrading another (like latency or energy). This is essential for selecting the best deployment configuration for your edge device.

What is a Pareto Frontier?

A Pareto frontier is the set of non-dominated points: points for which no other point is at least as good in every dimension and strictly better in at least one. In edge AI optimization:
A model is Pareto optimal if there is no other model that:
  • Has equal or lower latency (or energy), AND
  • Has equal or better accuracy,
with strict improvement in at least one of the two.
For example, consider three models:
  • Model A: 10ms latency, 85% accuracy
  • Model B: 15ms latency, 90% accuracy
  • Model C: 16ms latency, 87% accuracy
Models A and B are Pareto optimal (A is fastest, B is most accurate). Model C is dominated by B, which is both faster and more accurate. Note that A and B do not dominate each other: each wins on one axis.
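The dominance test can be written as a short sketch; the three points here are illustrative and not tied to the models above:

```python
# Each point is (latency_ms, accuracy): lower latency and higher accuracy are better.
def dominates(p, q):
    """True if p is at least as good as q on both axes and strictly better on one."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    strictly_better = p[0] < q[0] or p[1] > q[1]
    return no_worse and strictly_better

points = [(10, 0.85), (15, 0.90), (11, 0.84)]

# A point is Pareto optimal if no other point dominates it.
frontier = [p for p in points if not any(dominates(q, p) for q in points if q != p)]
print(frontier)  # (11, 0.84) is dominated by (10, 0.85) and drops out
```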

The pareto_frontier Function

The pareto_frontier function is defined in src/edge_opt/experiments.py:91-99:
def pareto_frontier(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    ranked = df[df["accepted"]].sort_values([x_col, "accuracy"], ascending=[True, False]).reset_index(drop=True)
    frontier = []
    best_accuracy = -1.0
    for _, row in ranked.iterrows():
        if row["accuracy"] > best_accuracy:
            frontier.append(row)
            best_accuracy = row["accuracy"]
    return pd.DataFrame(frontier)

Function Signature

df: pd.DataFrame

The results DataFrame from run_sweep(), containing all model variants with their metrics.

x_col: str

The metric to minimize (x-axis). Common choices:
  • "latency_ms" for latency-accuracy frontiers
  • "energy_proxy_j" for energy-accuracy frontiers

Returns: pd.DataFrame

A filtered DataFrame containing only the Pareto optimal points, sorted by x_col.

Algorithm Walkthrough

The function uses a greedy sweep algorithm to identify Pareto optimal points:

Step 1: Filter to Accepted Variants (Line 92)

ranked = df[df["accepted"]].sort_values([x_col, "accuracy"], ascending=[True, False]).reset_index(drop=True)
The accepted column indicates whether a model variant satisfies the memory budget constraint (see memory-budgets.mdx). Only variants that fit on the target device are considered.
# From run_sweep in experiments.py:77
rejected = metrics.memory_mb > active_memory_budget_mb
row["accepted"] = not rejected

Step 2: Sort by x_col then Accuracy

sort_values([x_col, "accuracy"], ascending=[True, False])
This creates a sorted list where:
  1. Primary sort: ascending by x_col (lower latency/energy first)
  2. Secondary sort: descending by accuracy (higher accuracy first)
Example ordering for latency-accuracy:
| Index | latency_ms | accuracy | Reasoning |
|-------|------------|----------|-----------|
| 0 | 8.2 | 0.89 | Lowest latency, high accuracy |
| 1 | 8.2 | 0.85 | Same latency, lower accuracy |
| 2 | 10.5 | 0.91 | Higher latency, highest accuracy |
| 3 | 12.1 | 0.88 | Even higher latency, lower accuracy |
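This ordering can be reproduced directly (column names match the sweep output; the values are the illustrative ones from the table):

```python
import pandas as pd

# Illustrative sweep rows, deliberately out of order.
df = pd.DataFrame({
    "latency_ms": [10.5, 8.2, 12.1, 8.2],
    "accuracy":   [0.91, 0.85, 0.88, 0.89],
})

# Primary sort ascending on latency, secondary sort descending on accuracy,
# exactly as pareto_frontier does.
ranked = df.sort_values(["latency_ms", "accuracy"],
                        ascending=[True, False]).reset_index(drop=True)
print(ranked)
# Rows come out as (8.2, 0.89), (8.2, 0.85), (10.5, 0.91), (12.1, 0.88)
```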

Step 3: Greedy Frontier Selection (Lines 93-98)

frontier = []
best_accuracy = -1.0
for _, row in ranked.iterrows():
    if row["accuracy"] > best_accuracy:
        frontier.append(row)
        best_accuracy = row["accuracy"]
This loop walks through the sorted list and keeps only rows that improve accuracy:
  1. Initialize best_accuracy to -1.0: start with an impossibly low accuracy so the first row is always selected.
  2. Iterate through sorted rows: process each row in order of increasing x_col (latency/energy).
  3. Check for improvement: if row["accuracy"] > best_accuracy, this row is more accurate than every row with a lower (or equal) x_col value.
  4. Add to frontier: append the row and update best_accuracy.
  5. Skip dominated points: if accuracy is not better, the row is dominated by an earlier frontier point and excluded.

Visual Example

Using the table from Step 2:
| Index | latency_ms | accuracy | Selected? | Reason |
|-------|------------|----------|-----------|--------|
| 0 | 8.2 | 0.89 | ✅ Yes | First point, accuracy 0.89 > -1.0 |
| 1 | 8.2 | 0.85 | ❌ No | Accuracy 0.85 < 0.89 (dominated by index 0) |
| 2 | 10.5 | 0.91 | ✅ Yes | Accuracy 0.91 > 0.89 (new best) |
| 3 | 12.1 | 0.88 | ❌ No | Accuracy 0.88 < 0.91 (dominated by index 2) |
Resulting frontier: Rows 0 and 2
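Copying the pareto_frontier definition from above, the walkthrough can be verified end to end (the accepted column is set to True everywhere so no row is filtered out):

```python
import pandas as pd

def pareto_frontier(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    # Same implementation as in src/edge_opt/experiments.py.
    ranked = df[df["accepted"]].sort_values(
        [x_col, "accuracy"], ascending=[True, False]).reset_index(drop=True)
    frontier = []
    best_accuracy = -1.0
    for _, row in ranked.iterrows():
        if row["accuracy"] > best_accuracy:
            frontier.append(row)
            best_accuracy = row["accuracy"]
    return pd.DataFrame(frontier)

# The four rows from the walkthrough table.
df = pd.DataFrame({
    "latency_ms": [8.2, 8.2, 10.5, 12.1],
    "accuracy":   [0.89, 0.85, 0.91, 0.88],
    "accepted":   [True, True, True, True],
})

frontier = pareto_frontier(df, x_col="latency_ms")
print(frontier[["latency_ms", "accuracy"]])
# Two rows survive: (8.2, 0.89) and (10.5, 0.91)
```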

Computing Multiple Frontiers

In practice, you typically compute two frontiers from the same sweep results:
from edge_opt.experiments import run_sweep, pareto_frontier
import pandas as pd

# Run optimization sweep
results_df = run_sweep(
    base_model=model,
    val_loader=val_loader,
    calibration_loader=calib_loader,
    device=device,
    pruning_levels=[0.0, 0.2, 0.4, 0.6, 0.8],
    precisions=["fp32", "fp16", "int8"],
    power_watts=2.5,
    calibration_batches=10,
    memory_budgets_mb=[10.0, 20.0, 50.0],
    active_memory_budget_mb=15.0,
    latency_multiplier=1.0,
    benchmark_repeats=5,
)

# Compute latency-accuracy Pareto frontier
latency_frontier = pareto_frontier(results_df, x_col="latency_ms")
print(f"Latency frontier: {len(latency_frontier)} optimal points")

# Compute energy-accuracy Pareto frontier
energy_frontier = pareto_frontier(results_df, x_col="energy_proxy_j")
print(f"Energy frontier: {len(energy_frontier)} optimal points")
The latency and energy frontiers may contain different models. A model that is Pareto optimal for latency may not be optimal for energy, and vice versa.

Interpreting the Results

Each row in the returned frontier DataFrame represents a deployment option:
print(latency_frontier[["pruning_level", "precision", "latency_ms", "accuracy", "memory_mb"]])

Example Output

| pruning_level | precision | latency_ms | accuracy | memory_mb |
|---------------|-----------|------------|----------|-----------|
| 0.8 | int8 | 5.2 | 0.8245 | 4.1 |
| 0.6 | int8 | 8.7 | 0.8678 | 7.8 |
| 0.4 | fp16 | 12.3 | 0.8921 | 11.2 |
| 0.2 | fp32 | 18.9 | 0.9103 | 14.7 |
This table tells you:
Configuration: 80% pruning + int8 quantization
Tradeoffs:
  • Latency: 5.2ms (fastest)
  • Accuracy: 82.45% (lowest)
  • Memory: 4.1MB (smallest)
Best for: Real-time applications where speed is critical

Configuration: 40-60% pruning + mixed precision
Tradeoffs:
  • Latency: 8-12ms (moderate)
  • Accuracy: 86-89% (good)
  • Memory: 7-11MB (moderate)
Best for: General-purpose edge deployment

Configuration: 20% pruning + fp32
Tradeoffs:
  • Latency: 18.9ms (slowest)
  • Accuracy: 91.03% (highest)
  • Memory: 14.7MB (largest)
Best for: Applications where accuracy cannot be compromised

Visualization with save_plots

The save_plots function (defined in src/edge_opt/experiments.py:102-143) visualizes the Pareto frontiers:
from edge_opt.experiments import save_plots
from pathlib import Path

save_plots(
    df=results_df,
    latency_frontier=latency_frontier,
    energy_frontier=energy_frontier,
    output_dir=Path("./optimization_results")
)
This generates three plots:

1. Accuracy vs Latency (Lines 107-118)

plt.scatter(accepted["latency_ms"], accepted["accuracy"], c="tab:blue", alpha=0.8, label="Accepted")
if not rejected.empty:
    plt.scatter(rejected["latency_ms"], rejected["accuracy"], c="tab:gray", alpha=0.5, marker="x", label="Rejected")
plt.plot(latency_frontier["latency_ms"], latency_frontier["accuracy"], color="red", linewidth=2, label="Pareto")
  • Blue dots: Accepted variants (within memory budget)
  • Gray X’s: Rejected variants (exceed memory budget)
  • Red line: Pareto frontier connecting optimal points

2. Accuracy vs Energy (Lines 120-131)

plt.scatter(accepted["energy_proxy_j"], accepted["accuracy"], c="tab:green", alpha=0.8, label="Accepted")
plt.plot(energy_frontier["energy_proxy_j"], energy_frontier["accuracy"], color="red", linewidth=2, label="Pareto")
Same format as latency plot, but with energy on x-axis.

3. Accuracy vs Memory (Lines 133-143)

plt.scatter(accepted["memory_mb"], accepted["accuracy"], c="tab:purple", alpha=0.8, label="Accepted")
Shows memory-accuracy tradeoff (no Pareto line, since memory is used as a constraint).
Plots are saved as PNG files with 180 DPI resolution in the specified output directory.

Practical Selection Strategy

Here’s how to choose a model from the Pareto frontier:
1. Define your primary constraint

Determine your hard requirement:
  • Latency budget: “Must be < 20ms”
  • Energy budget: “Must be < 0.5J per inference”
  • Accuracy requirement: “Must be > 85%”

2. Filter the frontier

# Example: latency budget of 15ms
candidates = latency_frontier[latency_frontier["latency_ms"] <= 15.0]

3. Select based on secondary objective

# Choose highest accuracy within budget
best_model = candidates.loc[candidates["accuracy"].idxmax()]

print("Selected configuration:")
print(f"  Pruning: {best_model['pruning_level']}")
print(f"  Precision: {best_model['precision']}")
print(f"  Accuracy: {best_model['accuracy']:.2%}")
print(f"  Latency: {best_model['latency_ms']:.1f}ms")

4. Validate on real hardware

Deploy the selected configuration to your target device and verify that benchmarked metrics match real-world performance.

Advanced Use Cases

Compute frontiers for multiple metrics and find the intersection:
Note that pareto_frontier resets the DataFrame index, so the two frontiers' indices are positional and cannot be matched directly. Intersect on configuration identity instead (assuming each variant is uniquely identified by its pruning_level and precision):
latency_optimal = set(zip(latency_frontier["pruning_level"], latency_frontier["precision"]))
energy_optimal = set(zip(energy_frontier["pruning_level"], energy_frontier["precision"]))
both_optimal = latency_optimal & energy_optimal

print(f"Models optimal for both latency AND energy: {len(both_optimal)}")
Compute separate frontiers for different deployment scenarios:
devices = [
    {"name": "RPi4", "budget_mb": 20.0, "freq_scale": 1.0},
    {"name": "RPi0", "budget_mb": 8.0, "freq_scale": 0.5},
]

for device in devices:
    results = run_sweep(..., active_memory_budget_mb=device["budget_mb"], latency_multiplier=1.0/device["freq_scale"])
    frontier = pareto_frontier(results, "latency_ms")
    print(f"{device['name']}: {len(frontier)} optimal configs")
You can use pareto_frontier with any metric in your DataFrame:
# Optimize for P95 latency instead of mean
p95_frontier = pareto_frontier(results_df, x_col="latency_p95_ms")

# Optimize for throughput (need to negate since we want to maximize)
results_df["neg_throughput"] = -results_df["throughput_sps"]
throughput_frontier = pareto_frontier(results_df, x_col="neg_throughput")
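The negation trick works because pareto_frontier always minimizes x_col. A self-contained sketch with illustrative throughput numbers, inlining the same filter-sort-scan logic as the function above:

```python
import pandas as pd

df = pd.DataFrame({
    "throughput_sps": [150.0, 120.0, 100.0, 80.0],
    "accuracy":       [0.84, 0.88, 0.86, 0.91],
    "accepted":       [True, True, True, True],
})

# Minimizing the negated column is the same as maximizing throughput.
df["neg_throughput"] = -df["throughput_sps"]

ranked = df[df["accepted"]].sort_values(
    ["neg_throughput", "accuracy"], ascending=[True, False]).reset_index(drop=True)
frontier, best = [], -1.0
for _, row in ranked.iterrows():
    if row["accuracy"] > best:
        frontier.append(row)
        best = row["accuracy"]
frontier = pd.DataFrame(frontier)

print(frontier[["throughput_sps", "accuracy"]])
# The 100 sps / 0.86 row is dominated (slower AND less accurate than 120 sps / 0.88)
```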

Common Issues and Solutions

Empty Frontier

If pareto_frontier returns an empty DataFrame:
  1. Check that active_memory_budget_mb is not too restrictive (see line 92 filter)
  2. Verify that your sweep includes diverse configurations
  3. Ensure accuracy values are varying (not all identical)
print(f"Accepted variants: {results_df['accepted'].sum()} / {len(results_df)}")
if results_df["accepted"].sum() == 0:
    print("ERROR: No variants within memory budget!")
    print(f"Minimum memory: {results_df['memory_mb'].min():.1f}MB")
    print(f"Budget: {active_memory_budget_mb}MB")
Single-Point Frontier

If the frontier contains only one point, your sweep may not be diverse enough:
if len(latency_frontier) == 1:
    print("WARNING: Only one Pareto optimal point found")
    print("Consider expanding pruning_levels or precisions")
Try increasing the granularity of your sweep parameters.
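For example, a denser pruning grid can be generated programmatically; the ten-step grid below is illustrative:

```python
import numpy as np

# A finer pruning grid than [0.0, 0.2, 0.4, 0.6, 0.8]:
# ten evenly spaced levels from 0% to 90% sparsity.
pruning_levels = [round(p, 2) for p in np.linspace(0.0, 0.9, 10)]
print(pruning_levels)
# [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
```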

Related Functions

  • run_sweep() - Generates the input DataFrame (src/edge_opt/experiments.py:47)
  • save_plots() - Visualizes Pareto frontiers (src/edge_opt/experiments.py:102)
  • memory_violations() - Determines accepted/rejected status (src/edge_opt/metrics.py:66)
