Requirements
- Python: 3.11 or higher
- Dependencies:
  - metaflow>=2.10
  - dagster>=1.7 (for running jobs)
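The requirements above can be checked programmatically before installing. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `check_requirements` are hypothetical and not part of metaflow-dagster, and the version parsing is deliberately simple (it reads only the leading numeric components).

```python
import sys
from importlib import metadata

# Minimum versions taken from the Requirements list above.
MIN_PYTHON = (3, 11)
MIN_PACKAGES = {"metaflow": (2, 10), "dagster": (1, 7)}


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements():
    """Return a list of problems; an empty list means every requirement is met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(str(n) for n in sys.version_info[:2])
        problems.append(f"Python 3.11 or higher is required, found {found}")
    for name, minimum in MIN_PACKAGES.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty result means the interpreter and both packages satisfy the listed minimums. Since dagster is only needed for running jobs, a missing dagster may be acceptable if you only intend to generate definitions files.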
Install from PyPI
The simplest way to install metaflow-dagster is from PyPI.
The base package only includes metaflow as a dependency. You’ll need to install dagster separately to run jobs. The dagster-webserver package is optional but recommended for local development.
Install from Source
For development or to use the latest features, install directly from the GitHub repository. The test extra includes:
- dagster>=1.7
- dagster-webserver
- pytest>=7
- pytest-timeout
- pytest-cov
Verify Installation
Check that metaflow-dagster is installed correctly:
Check Metaflow Extension
Create a simple Metaflow flow in my_flow.py and verify that the dagster command is available.
Run the flow's help command. The output should list the available Dagster commands.
Generate a Dagster Definitions File
Compile your flow to a Dagster definitions file with the dagster create command. The command prints output when it completes.
Environment Setup
Metadata Service & Datastore
By default, metaflow-dagster uses whatever metadata and datastore backends are active in your Metaflow environment. The generated file bakes in those settings at creation time so every step subprocess uses the same backend.
To use a remote metadata service or object store, configure them before running dagster create:
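One way to configure remote backends is through Metaflow's standard environment variables. The sketch below builds such an environment in Python. The helper name `remote_backend_env`, the service URL, and the bucket path are hypothetical placeholders, and it assumes an S3 datastore. It also assumes metaflow-dagster picks up these settings the same way Metaflow itself does.

```python
import os


def remote_backend_env(service_url, s3_root, base_env=None):
    """Return a copy of the environment with Metaflow pointed at remote backends."""
    # Copy so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            # Use the remote metadata service instead of local metadata.
            "METAFLOW_DEFAULT_METADATA": "service",
            "METAFLOW_SERVICE_URL": service_url,
            # Store artifacts in S3 instead of the local datastore.
            "METAFLOW_DEFAULT_DATASTORE": "s3",
            "METAFLOW_DATASTORE_SYSROOT_S3": s3_root,
        }
    )
    return env


if __name__ == "__main__":
    # Placeholder values; substitute your own service URL and bucket.
    env = remote_backend_env(
        "https://metadata.example.com",
        "s3://example-bucket/metaflow",
    )
    for key in sorted(k for k in env if k.startswith("METAFLOW_")):
        print(f"{key}={env[key]}")
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when invoking dagster create, so the settings are in place at the moment the definitions file is generated.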
Dagster Home
For local testing, Dagster will automatically create a temporary SQLite-backed instance. For production, you’ll want to configure a persistent DAGSTER_HOME:
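As a rough illustration, the sketch below prepares a persistent directory and exports it as DAGSTER_HOME for the current process and anything it launches. The helper name `ensure_dagster_home` and the example path are hypothetical. Dagster expects DAGSTER_HOME to be an absolute path to an existing directory, which is why the path is resolved and created first.

```python
import os
from pathlib import Path


def ensure_dagster_home(path):
    """Create the directory if needed and export it as DAGSTER_HOME."""
    # Resolve to an absolute path, expanding "~" if present.
    home = Path(path).expanduser().resolve()
    home.mkdir(parents=True, exist_ok=True)
    # Visible to this process and to child processes started from it.
    os.environ["DAGSTER_HOME"] = str(home)
    return home


if __name__ == "__main__":
    # Example location only; choose a directory that survives restarts.
    print(ensure_dagster_home("~/dagster_home"))
```

Setting the variable this way affects only the current process tree. For a long-lived deployment you would more likely export it in the shell profile or service configuration, and optionally place a dagster.yaml file in the directory to configure storage.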
Troubleshooting
Next Steps
Quickstart Guide
Learn how to create and deploy your first flow to Dagster