Named environments in Metaflow act like Docker tags: they provide human-readable aliases for fully resolved environments, making it easy to share environments across teams, flows, and users.

Overview

Every resolved environment has a unique identifier built from two 40-character hex hashes, but the resulting names are unwieldy:
metaflow_d49465b2b45996e40aad1f3aaf00cba553a0f085_f671c941b27764ad6536f7fdadc6c57b18221c6e
Named environments let you reference this as:
mlp/metaflow/conda_example/numpy_test_env:v1
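The long name above carries the same two hashes as the colon-separated full ID that appears in the `metaflow environment get` and `metaflow environment show` examples on this page. The following is a minimal sketch of converting between the two, assuming the `metaflow_<hash>_<hash>` layout seen in the example; the helper `split_env_dirname` is hypothetical and not part of Metaflow.

```python
import re
from typing import Tuple

# Assumed layout, taken from the example on this page:
# "metaflow_" + 40 hex chars + "_" + 40 hex chars.
_ENV_NAME_RE = re.compile(r"^metaflow_([0-9a-f]{40})_([0-9a-f]{40})$")


def split_env_dirname(dirname: str) -> Tuple[str, str]:
    """Return the two 40-character hashes embedded in an environment name."""
    match = _ENV_NAME_RE.match(dirname)
    if match is None:
        raise ValueError(f"not a resolved-environment name: {dirname!r}")
    return match.group(1), match.group(2)


first, second = split_env_dirname(
    "metaflow_d49465b2b45996e40aad1f3aaf00cba553a0f085_"
    "f671c941b27764ad6536f7fdadc6c57b18221c6e"
)
# Colon-separated form, as used for the full ID elsewhere on this page.
full_id = f"{first}:{second}"
print(full_id)
```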

Creating Named Environments

From requirements.txt

Resolve an environment and create an alias:
metaflow environment resolve \
  --python ">=3.8,<3.9" \
  -r req_numpy.txt \
  --alias mlp/metaflow/conda_example/numpy_test_env
Because the alias was given without a tag, the environment is now available as mlp/metaflow/conda_example/numpy_test_env:latest.

From environment.yml

metaflow environment resolve \
  -f env_numpy.yml \
  --alias mlp/ml-team/numpy-env:v1

From a Flow Resolution

python myflow.py --environment=conda environment resolve \
  --alias mlp/myflow/prod-env:v2
This resolves all step environments in your flow and aliases them.

Alias Naming Convention

We recommend this format:
<team>/<project>/<environment-name>:<tag>
Examples:
mlp/metaflow/conda_example/numpy_test_env:v1
data-eng/pipeline/spark-env:latest
ml-platform/training/gpu-torch:stable
Use / separators for hierarchy. Extra levels are allowed, as the first example shows:
  • team - Your team name
  • project - Project or flow name
  • environment-name - Descriptive name
  • tag - Version or channel label, such as v1 or latest
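To make the convention concrete, here is a small sketch that splits an alias into its name and tag. It assumes that a missing tag means `latest`, mirroring the untagged `--alias` example earlier on this page; `split_alias` and `DEFAULT_TAG` are illustrative names, not Metaflow APIs.

```python
from typing import Tuple

# Assumption: an alias without ":<tag>" refers to the "latest" tag.
DEFAULT_TAG = "latest"


def split_alias(alias: str) -> Tuple[str, str]:
    """Split '<team>/<project>/<name>:<tag>' into (name, tag)."""
    name, sep, tag = alias.rpartition(":")
    if not sep:
        # No ':' present, so the whole string is the name.
        return alias, DEFAULT_TAG
    if not name or not tag:
        raise ValueError(f"malformed alias: {alias!r}")
    return name, tag


name, tag = split_alias("ml-platform/training/gpu-torch:stable")
levels = name.split("/")  # hierarchy levels: team, project, environment name
print(levels, tag)
```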

Mutable vs Immutable Aliases

Immutable (Default)

Most tags are immutable:
  • :v1, :v2, :prod
  • Cannot be reassigned
  • Ensures reproducibility
# First time: succeeds
metaflow environment resolve -r req.txt --alias my/env:v1

# Second time: fails because :v1 is immutable
metaflow environment resolve -r req2.txt --alias my/env:v1
# Error: Alias already exists

Mutable

Three special tags are mutable:
  • :latest
  • :stable
  • :candidate
# Overwrites existing :latest
metaflow environment resolve -r req.txt --alias my/env:latest

# Updates :latest to new resolution
metaflow environment resolve -r req2.txt --alias my/env:latest
Mutable tags support a “rolling latest” pattern where teams can update the default environment without changing flow code.
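The difference between the two kinds of tags can be sketched with a toy in-memory registry. This is only an illustration of the rule described above, not how Metaflow stores aliases; `AliasRegistry` and its methods are hypothetical, and the sketch assumes any reassignment of an immutable tag is rejected.

```python
# The three mutable tags listed above; every other tag is treated as immutable.
MUTABLE_TAGS = {"latest", "stable", "candidate"}


class AliasRegistry:
    """Toy in-memory stand-in for an alias store (hypothetical)."""

    def __init__(self):
        self._aliases = {}

    def assign(self, name: str, tag: str, env_id: str) -> None:
        key = (name, tag)
        if key in self._aliases and tag not in MUTABLE_TAGS:
            # Immutable tags cannot be pointed at another environment.
            raise ValueError(f"Alias already exists: {name}:{tag}")
        self._aliases[key] = env_id

    def lookup(self, name: str, tag: str) -> str:
        return self._aliases[(name, tag)]


registry = AliasRegistry()
registry.assign("my/env", "v1", "env-a")      # first assignment succeeds
registry.assign("my/env", "latest", "env-a")
registry.assign("my/env", "latest", "env-b")  # mutable tag is overwritten
print(registry.lookup("my/env", "latest"))
```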

Using Named Environments

In Decorators

The @named_env decorator references an aliased environment:
flow_with_named_env.py
from metaflow import FlowSpec, step, named_env, conda

class CondaTestFlow(FlowSpec):
    
    @named_env(name="mlp/metaflow/conda_example/numpy_test_env")
    @step
    def start(self):
        import numpy as np
        print("I have numpy version %s" % np.__version__)
        assert np.__version__ == "1.21.5"
        self.next(self.end)
        
    @conda(libraries={"pandas": "1.5.0"}, python=">=3.8,<3.9")
    @step
    def end(self):
        import pandas as pd
        assert pd.__version__ == "1.5.0"
        print("I am in end and Pandas version is %s" % pd.__version__)

if __name__ == "__main__":
    CondaTestFlow()
Run:
python flow_with_named_env.py --environment=conda run
Metaflow automatically fetches the environment from S3/Azure/GS if it’s not locally present.

Flow-Level Named Environments

Use @named_env_base for shared dependencies:
from metaflow import FlowSpec, step, named_env_base, pypi

@named_env_base(name="mlp/team/base-env:v3")
class MyFlow(FlowSpec):
    
    @step
    def start(self):
        # Uses base-env:v3
        import numpy
        self.next(self.train)
    
    @pypi(packages={"tensorflow": "2.12.0"})
    @step  
    def train(self):
        # Uses base-env:v3 PLUS tensorflow
        import numpy, tensorflow
        self.next(self.end)
    
    @step
    def end(self):
        pass

Using Pathspecs

Reference a previous step’s environment using a pathspec:
@named_env(pathspec="MyFlow/123/train")
@step
def debug(self):
    # Uses exact same environment as MyFlow/123/train
    import tensorflow
Pathspec format:
<FlowName>/<RunId>/<StepName>
Partial pathspecs work:
  • MyFlow - Latest run, end step
  • MyFlow/123 - Run 123, end step
  • MyFlow/123/train - Specific step
Pathspecs are automatically aliased when environments are resolved, so they’re cached just like named environments.
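The defaults for partial pathspecs can be written down as a short sketch. It only encodes the three forms listed above (missing run means the latest run, missing step means `end`); `StepRef` and `expand_pathspec` are hypothetical helpers, and a real pathspec resolver would also have to look up which run is the latest.

```python
from typing import NamedTuple, Optional


class StepRef(NamedTuple):
    flow: str
    run_id: Optional[str]  # None stands for "latest run"
    step: str


def expand_pathspec(pathspec: str) -> StepRef:
    """Fill in the defaults for <FlowName>[/<RunId>[/<StepName>]]."""
    parts = pathspec.split("/")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"malformed pathspec: {pathspec!r}")
    flow = parts[0]
    run_id = parts[1] if len(parts) >= 2 else None
    step = parts[2] if len(parts) == 3 else "end"
    return StepRef(flow, run_id, step)


for spec in ("MyFlow", "MyFlow/123", "MyFlow/123/train"):
    print(spec, "->", expand_pathspec(spec))
```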

Fetching Environments

Manual Fetch

Explicitly fetch a named environment:
metaflow environment get mlp/metaflow/conda_example/numpy_test_env
Output:
DEFAULT Environment of type conda-only full hash d49465b2b45996e40aad1f3aaf00cba553a0f085:f671c941b27764ad6536f7fdadc6c57b18221c6e
Aliases mlp/metaflow/conda_example/numpy_test_env:latest (mutable)
Arch linux-64
Available on linux-64

Resolved on 2023-09-06 07:16:01.794427
Resolved by romain

Locally present as /usr/local/libexec/metaflow-condav2-20230809/envs/metaflow_d49465b2b45996e40aad1f3aaf00cba553a0f085_f671c941b27764ad6536f7fdadc6c57b18221c6e

User-requested packages conda::boto3==>=1.14.0, conda::numpy==1.21.5, conda::pandas==>=0.24.0, ...

Automatic Fetch

Metaflow automatically fetches named environments when:
  • Running a flow with @named_env
  • Deploying to Argo/Airflow/Step Functions
  • Creating a local environment with metaflow environment create

Remote Resolution

By default, Metaflow only uses locally resolved environments. You can configure it to check remote storage for environments resolved by other users.

Configuration

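Set METAFLOW_CONDA_USE_REMOTE_LATEST in your config. The value shown here disables remote lookup:
METAFLOW_CONDA_USE_REMOTE_LATEST = ":none:"
Behavior: Never use remote environments unless explicitly aliased.
  • Forces re-resolution locally
  • Most reproducible (you control resolution)
  • Slower on first run
Other values used on this page are a comma-separated list of trusted users (for example "alice,bob") and ":any:".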
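As a rough illustration, the setting can be read as a trust policy over the user who resolved an environment. The sketch below covers only the three value forms that appear on this page (`:none:`, a comma-separated list of users, and `:any:`); Metaflow may accept other values, and `trust_policy` is a hypothetical helper rather than a Metaflow function.

```python
from typing import Callable


def trust_policy(setting: str) -> Callable[[str], bool]:
    """Turn a METAFLOW_CONDA_USE_REMOTE_LATEST value into a predicate."""
    value = setting.strip()
    if value == ":none:":
        return lambda user: False  # never trust remote resolutions
    if value == ":any:":
        return lambda user: True   # trust resolutions by any user
    # Otherwise treat the value as a comma-separated list of user names.
    users = {u.strip() for u in value.split(",") if u.strip()}
    return lambda user: user in users


is_trusted = trust_policy("alice,bob")
print(is_trusted("alice"), is_trusted("mallory"))
```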

How It Works

1. Check Local

Metaflow first checks if the environment is resolved locally.
2. Check Remote

If not local and remote lookup is enabled, check S3/Azure/GS for environments matching:
  • Same requirement ID (user packages + constraints)
  • Resolved by allowed users
3. Use Remote or Re-resolve

If found remotely, download and use it. Otherwise, re-resolve locally.
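The three steps above can be sketched as a single decision function. This is a simplified model under stated assumptions: environments are keyed by a requirement ID string, the remote store is a plain dictionary mapping requirement ID to the resolving user, and `choose_source` is a hypothetical name, not a Metaflow API.

```python
from typing import Callable, Dict, Set


def choose_source(
    req_id: str,
    local_ids: Set[str],
    remote_resolvers: Dict[str, str],
    is_trusted: Callable[[str], bool],
) -> str:
    """Return 'local', 'remote', or 're-resolve' for a requirement ID."""
    # Step 1: prefer an environment that is already resolved locally.
    if req_id in local_ids:
        return "local"
    # Step 2: look for a remote match resolved by an allowed user.
    resolver = remote_resolvers.get(req_id)
    if resolver is not None and is_trusted(resolver):
        return "remote"
    # Step 3: nothing usable was found, so resolve again locally.
    return "re-resolve"


remote = {"req-2": "alice", "req-3": "mallory"}
trusted = lambda user: user in {"alice", "bob"}
for req in ("req-1", "req-2", "req-3"):
    print(req, choose_source(req, {"req-1"}, remote, trusted))
```

Disabling remote lookup corresponds to passing a predicate that always returns False, in which case every non-local environment falls through to re-resolution.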

Team Collaboration Workflow

Here’s a recommended workflow for teams:

Setup: Designated Resolver

1. Create Base Environment

One team member (or CI/CD) resolves the base environment:
metaflow environment resolve \
  -f team_base_env.yml \
  --alias mlp/team/base:v5
2. Share with Team

Environment is automatically uploaded to S3/Azure/GS. Team members reference it:
@named_env_base(name="mlp/team/base:v5")
class MyFlow(FlowSpec):
    ...
3. Version Updates

When updating dependencies:
# Immutable versioning
metaflow environment resolve \
  -f team_base_env.yml \
  --alias mlp/team/base:v6
Or:
# Mutable :latest
metaflow environment resolve \
  -f team_base_env.yml \
  --alias mlp/team/base:latest

Configuration for Team Members

Team members list the trusted resolvers (here alice and bob) in their config:
~/.metaflowconfig/config.json
{
  "METAFLOW_CONDA_USE_REMOTE_LATEST": "alice,bob"
}
Now they can run:
python myflow.py --environment=conda run
Metaflow will use mlp/team/base:v5 from remote storage without re-resolving.

Extending Named Environments

Add packages to an existing named environment:
env_extra.yml
dependencies:
  - itsdangerous=2.1.2
metaflow environment resolve \
  --using mlp/metaflow/conda_example/numpy_test_env \
  --alias mlp/metaflow/conda_example/numpydanger_test_env \
  -f env_extra.yml
All locked packages from the base environment are preserved. You can only add compatible packages.
Pathspecs also work:
metaflow environment resolve \
  --using-pathspec MyFlow/123/train \
  --alias mlp/debug/train-plus-tools \
  -r extra_tools.txt

Inspecting Environments

Show Command

View details about any environment:
# By alias
metaflow environment show mlp/metaflow/conda_example/numpy_test_env

# By pathspec  
metaflow environment show --pathspec CondaTestFlow/118/start

# By full ID
metaflow environment show d49465b2b45996e40aad1f3aaf00cba553a0f085:f671c941b27764ad6536f7fdadc6c57b18221c6e
Output includes:
  • All installed packages (Conda + PyPI)
  • Resolution date and user
  • Available aliases
  • Local/remote locations

Listing Environments

To find the subcommands available for working with locally cached environments, consult the built-in help:
metaflow environment --help

Best Practices

Use Semantic Versioning

Tag environments with explicit versions:
  • Use base:v1, base:v2 for immutable, reproducible references
  • Use :latest only in development

Document Dependencies

Check in requirements.txt or environment.yml to version control alongside the alias name.

Centralize Resolution

Use CI/CD or designated machines to resolve production environments consistently.

Trust Explicitly

Configure METAFLOW_CONDA_USE_REMOTE_LATEST to trust specific users, not :any:.

1. Start with Named Environments

Even if you’re a solo developer, use named environments. They make it easier to reproduce environments later.
2. Graduate to Shared

Once stable, share with the team by documenting the alias in your project README.
3. Version Production

Use immutable tags (:v1, :v2) for production flows. Never use :latest in production.
