Metaflow Netflix Extensions

This repository contains extensions for Metaflow that are in use at Netflix (or being tested at Netflix) and that are more cutting edge than what is included in the OSS Metaflow package. Netflix released Metaflow as OSS in 2019. Since then, internal development at Netflix has continued, primarily around extensions that better support Netflix’s infrastructure. This repository contains functionality that is not yet ready for inclusion in the community-supported Metaflow, either because community interest is unclear or because the community has not had time to properly integrate and test it.
This extension is currently tested on Python 3.7+. While we do our best to ensure functionality works, it does not have the same levels of support and backward compatibility guarantees that Metaflow does.

Key Features

  • Conda v2 - Advanced dependency management with improved performance and flexibility
  • Debug Extension - Interactive debugging with Jupyter notebook integration
  • Mixed Environments - Mix and match Conda and PyPI packages seamlessly
  • Environment Sharing - Named environments for easy team collaboration

What’s Included

Conda v2

The refactored and improved Conda decorator provides several enhancements over the standard Metaflow Conda decorator:
  • Mix Conda and PyPI packages - Seamlessly combine packages from both ecosystems
  • Enhanced PyPI support - Support for repositories, source tarballs, and more
  • Command-line tools - Retrieve and re-hydrate any environment from previous runs
  • Named environments - Easy environment sharing and saving across teams
  • Better performance - Parallel resolution and downloading of packages
  • Multiple resolvers - Support for conda, mamba, and micromamba
  • Cloud caching - Environment and package caching to S3/Azure/GS
Version 1.0.0 is considered stable. See the Conda v2 documentation for detailed information.

Debug Extension

The debug extension lets you debug previously executed steps in an isolated Jupyter notebook instance. It relies on the Conda extension to recreate the dependencies the step ran with. Key capabilities:
  • Download your code package from any previous run
  • Automatically set up the correct conda/pip packages for that step
  • Generate stubs to access relevant artifacts for that run
  • Create a ready-to-use debugging notebook
This extension currently works only with the Conda implementation included in this package.

Requirements

  • Metaflow: Version 2.16.0 or later
  • Python: 3.7.2 or higher
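
The two minimums above can be checked from a script before installing the extension. The following is a minimal sketch, not part of the package: the helper names `parse_version`, `meets_minimum`, and `check_requirements` are hypothetical, and the version parsing is deliberately simplified (it compares only the leading numeric components, so pre-release suffixes such as `rc1` are ignored). It assumes Metaflow is installed under the distribution name `metaflow`; on Python 3.7, `importlib.metadata` is unavailable and the `importlib_metadata` backport is needed.

```python
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # Python 3.7: requires the importlib_metadata backport
    import importlib_metadata as metadata


def parse_version(text):
    """Turn '2.16.0' into (2, 16, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum` (dotted strings)."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '2.16' compares equal to '2.16.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_requirements():
    """Collect readable problems; an empty list means every check passed."""
    problems = []
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    if not meets_minimum(python_version, "3.7.2"):
        problems.append(f"Python {python_version} is older than 3.7.2")
    try:
        metaflow_version = metadata.version("metaflow")
    except metadata.PackageNotFoundError:
        problems.append("Metaflow is not installed")
    else:
        if not meets_minimum(metaflow_version, "2.16.0"):
            problems.append(f"Metaflow {metaflow_version} is older than 2.16.0")
    return problems


if __name__ == "__main__":
    found = check_requirements()
    for problem in found:
        print(problem)
    if not found:
        print("All requirements met")
```

Returning a list of problems rather than raising on the first failure lets the script report a missing Metaflow install and an outdated Python interpreter in a single run.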

Getting Started

1. Install the package

Install this package alongside the metaflow package:
pip install metaflow-netflixext

2. Verify installation

Check that the extensions are properly installed:
python -c "import metaflow_extensions.netflix_ext; print('Extensions installed successfully')"

3. Try it out

Create a simple flow with the Conda v2 decorator:
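from metaflow import FlowSpec, step, conda


class HelloFlow(FlowSpec):
    # Resolve an environment containing pandas 1.5.0 for this step only.
    @conda(libraries={"pandas": "1.5.0"})
    @step
    def start(self):
        # Import inside the step so pandas is loaded from the step's environment.
        import pandas as pd
        print(f"Pandas version: {pd.__version__}")
        self.next(self.end)

    @step
    def end(self):
        print("Done!")


if __name__ == "__main__":
    # Assuming the file is saved as hello_flow.py (a hypothetical name), run:
    #   python hello_flow.py --environment=conda run
    HelloFlow()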

Support

Support for this extension is available on the Metaflow Slack. If you have questions, open an issue on GitHub or contact us on the usual Metaflow Slack channels.

Next Steps

  • Installation - Detailed installation and configuration guide
  • Quickstart - Get up and running in minutes
  • Conda v2 Overview - Learn about the improved Conda decorator
  • Debug Extension - Start debugging your Metaflow steps