
Command Syntax

The Debug Extension provides a debug command group with subcommands for debugging tasks:
metaflow debug task [OPTIONS] PATHSPEC

Basic Usage

To debug a specific task:
metaflow debug task FlowName/123/step_name/456 --metaflow-root-dir ~/notebooks/debug_task

Command Options

Required Arguments

PATHSPEC

The pathspec identifying which task to debug. Can be a full or partial pathspec:
  • Full pathspec: FlowName/RunId/StepName/TaskId
  • Step pathspec: FlowName/RunId/StepName (uses the step's single task)
  • Run pathspec: FlowName/RunId (uses the end step)
  • Flow name: FlowName (uses the end step of the latest successful run)
When using partial pathspecs, the extension will resolve them to a unique task. If multiple tasks exist (e.g., from a foreach), you must specify the full pathspec.
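The four accepted shapes can be summarized with a small helper. This is an illustrative sketch only, not the extension's code (the real resolution logic is quoted under Pathspec Resolution below):

```python
# Classify a (partial) pathspec by its number of components.
def classify_pathspec(pathspec: str) -> str:
    kinds = {1: "flow", 2: "run", 3: "step", 4: "task"}
    cleaned = pathspec.strip("/")
    if not cleaned:
        raise ValueError("Empty pathspec")
    n = len(cleaned.split("/"))
    if n not in kinds:
        raise ValueError(f"Invalid pathspec: {pathspec!r}")
    return kinds[n]

print(classify_pathspec("FlowName"))                    # flow
print(classify_pathspec("FlowName/123"))                # run
print(classify_pathspec("FlowName/123/step_name"))      # step
print(classify_pathspec("FlowName/123/step_name/456"))  # task
```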

--metaflow-root-dir

Required. The root directory where all generated files will be placed:
--metaflow-root-dir ~/notebooks/debug_task
This directory will contain:
  • The extracted code package
  • Debug scripts and notebooks
  • Escape trampolines
  • Stub generators

Optional Flags

--inspect

Enables inspection mode, allowing you to examine the state of a task after it has finished running:
metaflow debug task FlowName/123/step_name/456 \
  --metaflow-root-dir ~/debug \
  --inspect
In inspect mode, the generated notebook will include code to load and examine all artifacts produced by the task.
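Conceptually, the inspect-mode cells iterate the task and collect its artifacts. The sketch below illustrates the idea with a hypothetical helper and a stand-in task object, so it runs without a Metaflow deployment; in the real notebook, the task comes from the Metaflow client, where iterating a Task yields DataArtifact objects:

```python
def summarize_artifacts(task):
    """Return {artifact_name: value} for each artifact on the task.

    Assumes each element has .id (the artifact name) and .data (the
    deserialized value), as Metaflow client DataArtifacts do.
    """
    return {artifact.id: artifact.data for artifact in task}

# Hypothetical stand-in so the sketch is self-contained:
class _FakeArtifact:
    def __init__(self, id, data):
        self.id, self.data = id, data

fake_task = [_FakeArtifact("rmse", 0.42), _FakeArtifact("n_estimators", 100)]
print(summarize_artifacts(fake_task))  # {'rmse': 0.42, 'n_estimators': 100}
```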

--override-env

Use a different named environment instead of the task’s original environment:
metaflow debug task FlowName/123/step_name/456 \
  --metaflow-root-dir ~/debug \
  --override-env my_custom_env
The named environment must already exist. Create it using metaflow environment create --name my_custom_env.

--override-env-from-pathspec

Use the environment from a different task instead of the current task’s environment:
metaflow debug task FlowName/123/step_name/456 \
  --metaflow-root-dir ~/debug \
  --override-env-from-pathspec FlowName/100/other_step/789
This is useful when you want to debug a task with a different set of dependencies or package versions.
Only one of --override-env or --override-env-from-pathspec can be specified at a time.
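The mutual-exclusivity rule amounts to a simple check, sketched here for clarity (illustrative only, not the extension's actual validation code):

```python
# At most one environment override may be given.
def validate_overrides(override_env=None, override_env_from_pathspec=None):
    if override_env and override_env_from_pathspec:
        raise ValueError(
            "Specify only one of --override-env or --override-env-from-pathspec"
        )
    return override_env or override_env_from_pathspec  # may be None

print(validate_overrides(override_env="my_custom_env"))  # my_custom_env
```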

--quiet / --no-quiet

Suppress or show informational messages during execution:
metaflow debug --quiet task FlowName/123/step_name/456 --metaflow-root-dir ~/debug
Default: --no-quiet (messages are shown)

Pathspec Resolution

The extension resolves partial pathspecs to a single concrete task, so you do not have to spell out every component:

Resolution Logic

From debug_cmd.py:182-219:
path_components = pathspec.split("/")
if not path_components:
    raise CommandException("Provide either a flow, run, step or task to debug")

if len(path_components) < 2:
    # Flow name only - get latest successful run
    r = Flow(path_components[0]).latest_successful_run
    if r is None:
        raise CommandException(
            "Flow {} can only be specified if there is a successful run in the "
            "current namespace.".format(path_components[0])
        )
    path_components.append(r.id)

if len(path_components) < 3:
    # No step specified - use 'end' step
    path_components.append("end")

if len(path_components) < 4:
    # Enforce single task
    cur_task = None
    for t in r[path_components[2]]:
        if cur_task is not None:
            raise CommandException(
                "Step {} does not refer to a single task -- please specify the "
                "task unambiguously".format("/".join(path_components))
            )
        cur_task = t

Examples

# Explicitly specify all components
metaflow debug task HousePricePredictionFlow/1199/fit_gbrt_for_given_param/150671013 \
  --metaflow-root-dir ~/debug

Generated Files and Artifacts

When you run the debug command, several files are generated in the specified root directory:

Directory Structure

~/notebooks/debug_task/
├── code_package.tar.gz          # Downloaded code package
├── <flow_files>.py              # Extracted flow code
├── _escape_trampolines/         # Python path overrides
│   └── metaflow/
├── debug_stub_generator.py      # Artifact stub generator
├── current_stub_generator.py    # Current object stub generator
├── <pathspec>_debug.py          # Debug script
└── debug.ipynb                  # Debug notebook

Debug Script

The debug script (<pathspec>_debug.py) contains:
  • Imports from the task’s step code
  • Stubs for accessing artifacts via self
  • Helper functions for inspecting data
Example script structure:
# Generated debug script
import sys
import os

# Add escape trampolines to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '_escape_trampolines'))

# Import stub generators
from debug_stub_generator import generate_debug_stub
from current_stub_generator import generate_current_stub

# Create 'self' object with artifact access
self = generate_debug_stub('FlowName/123/step_name/456')

# Your step code can now be executed

Debug Notebook

The generated debug.ipynb notebook includes:
  1. Setup cells - Import statements and environment configuration
  2. Stub generation - Code to create the self object
  3. Step code - Your original step function code, ready to execute
  4. Artifact inspection - Helper cells for examining data
The notebook’s kernel is automatically configured to use the recreated Conda environment, ensuring all dependencies are available.

Escape Trampolines

The _escape_trampolines directory contains Python modules that override Metaflow internals to enable artifact access outside of a running task. This is generated using:
from metaflow.plugins.env_escape import generate_trampolines
generate_trampolines(trampoline_dir)
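The underlying mechanism is plain sys.path shadowing: a module placed earlier on sys.path takes precedence over an installed module of the same name. This self-contained demonstration (not the extension's code; the module name is made up) shows the trick:

```python
import os
import sys
import tempfile

# Write a "trampoline" module into a temporary directory.
trampoline_dir = tempfile.mkdtemp()
with open(os.path.join(trampoline_dir, "shadowed_module.py"), "w") as f:
    f.write("VALUE = 'from trampoline'\n")

# Putting the directory first on sys.path makes its modules win the
# import lookup -- the same trick the debug scripts use for Metaflow.
sys.path.insert(0, trampoline_dir)
import shadowed_module

print(shadowed_module.VALUE)  # from trampoline
```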

Using the Debug Notebook

Once the debug environment is set up, you can start debugging:

Launching Jupyter

cd ~/notebooks/debug_task
jupyter notebook debug.ipynb
The notebook will automatically use the correct kernel with the recreated environment.

Accessing Artifacts

In the notebook, artifacts are accessible through the self object:
# Access input artifacts
print(self.input['n_estimators'])
print(self.input['learning_rate'])

# Access data artifacts
print(self.features.shape)
print(self.labels.shape)

# Access any artifact from the task
print(self.index)

Re-executing Step Code

You can re-run your step code with modifications:
# Original step code
from sklearn import ensemble
from sklearn.model_selection import cross_val_score
import numpy as np

estimator = ensemble.GradientBoostingRegressor(
    n_estimators=self.input['n_estimators'],
    learning_rate=self.input['learning_rate'],
    max_depth=self.input['max_depth'],
    min_samples_split=2,
    loss='ls'
)

estimator.fit(self.features, self.labels)

mses = cross_val_score(
    estimator, self.features, self.labels,
    cv=5, scoring='neg_mean_squared_error'
)
rmse = np.sqrt(-mses).mean()

print(f"RMSE: {rmse}")

Experimenting with Parameters

Try different hyperparameters without re-running the entire flow:
# Experiment with different min_samples_split
for min_samples in [2, 3, 5, 10]:
    estimator = ensemble.GradientBoostingRegressor(
        n_estimators=self.input['n_estimators'],
        learning_rate=self.input['learning_rate'],
        max_depth=self.input['max_depth'],
        min_samples_split=min_samples,
        loss='ls'
    )
    estimator.fit(self.features, self.labels)
    mses = cross_val_score(
        estimator, self.features, self.labels,
        cv=5, scoring='neg_mean_squared_error'
    )
    rmse = np.sqrt(-mses).mean()
    print(f"min_samples_split={min_samples}: RMSE={rmse:.4f}")

Environment Configuration

The extension automatically configures the Python environment:

Environment Creation

From debug_utils.py:148-171:
conda_command = [
    "metaflow",
    "environment",
    "--quiet",
    "create",
    "--name",
    mf_env_name,
    "--install-notebook",
    "--force",
]
conda_command.extend(mf_env)

result = subprocess.run(
    conda_command,
    check=True,
    capture_output=True,
)

# Parse the python path from stderr
for line in result.stderr.decode().split("\n"):
    if "/bin/python" in line:
        mf_python_path = line
        break
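The stderr scan above can be isolated into a small helper, shown here as an illustrative sketch (the sample output below is made up):

```python
def find_python_path(stderr_text):
    """Return the first stderr line containing '/bin/python', or None."""
    for line in stderr_text.split("\n"):
        if "/bin/python" in line:
            return line
    return None

sample = "Creating environment...\n/tmp/envs/dbg/bin/python\nDone"
print(find_python_path(sample))  # /tmp/envs/dbg/bin/python
```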

Jupyter Kernel Configuration

The extension updates the Jupyter kernel’s configuration to include:
  • The escape trampolines in PYTHONPATH
  • Any environment variables needed by the packaged Metaflow version
From debug_cmd.py:315-349:
def _update_kernel_pythonpath(kernelspec_path, metaflow_root_dir):
    kernel_json_path = os.path.join(kernelspec_path, "kernel.json")
    with open(kernel_json_path, "r") as f:
        kernel_json = json.load(f)

    _ = kernel_json.setdefault("env", {})["PYTHONPATH"] = os.path.abspath(
        os.path.join(metaflow_root_dir, "_escape_trampolines")
    )

    for key, value in MetaflowCodeContent.get_env_vars_for_packaged_metaflow(
        metaflow_root_dir
    ).items():
        if key.endswith(":"):
            # Override existing value
            kernel_json["env"][key[:-1]] = value
        elif key not in kernel_json["env"]:
            kernel_json["env"][key] = value
        else:
            # Prepend to existing value
            kernel_json["env"][key] = f"{value}:{kernel_json['env'][key]}"
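The three merge rules can be seen in isolation in this restatement of the branches above (an illustrative sketch, not the extension's API; the MF_CONFIG key is a made-up example):

```python
def merge_env_var(env, key, value):
    if key.endswith(":"):
        env[key[:-1]] = value             # trailing ':' means override
    elif key not in env:
        env[key] = value                  # new key: set it
    else:
        env[key] = f"{value}:{env[key]}"  # existing key: prepend
    return env

env = {"PYTHONPATH": "/existing"}
merge_env_var(env, "PYTHONPATH", "/packaged")  # prepend to existing
merge_env_var(env, "MF_CONFIG:", "/override")  # forced override
print(env)  # {'PYTHONPATH': '/packaged:/existing', 'MF_CONFIG': '/override'}
```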

Limitations

The Debug Extension has the following limitations:
  • Only supports tasks executed remotely (not local runs)
  • Requires the new Conda V2 decorator (included in this package)
  • Only works with tasks that have a code package in metadata

Troubleshooting

Error: “Task does not have code-package”

This error occurs when trying to debug a task that was run locally. The debug extension requires a code package, which is only created for remote executions. Solution: Only debug tasks executed on remote compute (e.g., with @batch, @kubernetes, or other remote decorators).

Error: “Conda environment type not supported”

This error occurs when the task used an older version of the Conda decorator. Solution: Re-run your flow with the enhanced Conda V2 decorator included in this package.

Kernel Not Found

If Jupyter cannot find the kernel:
  1. Check that the environment was created: metaflow environment list
  2. Verify the kernel was installed: jupyter kernelspec list
  3. Manually specify the kernel in Jupyter if needed

Missing Dependencies

If dependencies are missing in the notebook:
  1. Verify the correct environment is active: Check the kernel name in Jupyter
  2. Check environment creation logs for errors
  3. Try recreating the environment with --force

Next Steps

See Examples

Explore complete working examples of debugging different scenarios
