Environment resolution is the process of converting your high-level dependency specifications (like pandas>=1.0) into a concrete list of pinned package versions.

When Resolution Happens

Metaflow resolves environments in two scenarios:

Automatic Resolution

Environments are automatically resolved:
1. Before Flow Execution
   When you run a flow with --environment=conda:
   python myflow.py --environment=conda run
2. Before Deployment
   When you deploy a flow to a scheduler:
   python myflow.py --environment=conda argo-workflows create
   python myflow.py --environment=conda step-functions create
3. On Local Machine
   Important: Resolution always happens on your local machine, never on remote compute nodes.

Manual Resolution

You can manually resolve environments using CLI commands:
Resolve all environments in a flow:
python myflow.py --environment=conda environment resolve

Understanding Environment IDs

Each environment has two identifiers:

Requirement ID (req_id)

A hash of your user-specified requirements:
  • Package names and version constraints
  • Python version constraint
  • Channels or sources
# These produce the same req_id:
@conda(libraries={"pandas": ">=1.0"}, python="3.8")
@conda(libraries={"pandas": ">=1.0"}, python="3.8")

# These produce different req_ids:
@conda(libraries={"pandas": ">=1.0"}, python="3.8")
@conda(libraries={"pandas": ">=1.1"}, python="3.8")

Full ID (full_id)

A hash of all resolved packages with exact versions:
  • Every conda package installed
  • Every pypi package installed
  • Exact version numbers
Multiple full_ids can exist for the same req_id if you resolve at different times (as new package versions become available).
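The relationship between the two identifiers can be sketched in a few lines of Python. This is an illustration only: the helper names req_id and full_id, the JSON normalization, and the choice of SHA-1 are assumptions made for this sketch, not Metaflow's actual hashing scheme.

```python
import hashlib
import json


def _digest(payload):
    """Hash a JSON-serializable payload deterministically (illustrative only)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def req_id(libraries, python, channels=()):
    """Hypothetical requirement ID: depends only on what the user asked for."""
    return _digest({
        "libraries": libraries,
        "python": python,
        "channels": sorted(channels),
    })


def full_id(resolved_packages):
    """Hypothetical full ID: depends on every resolved package and its exact version."""
    return _digest(sorted(resolved_packages))


# Identical requirements hash to the same value; a changed constraint does not.
same_a = req_id({"pandas": ">=1.0"}, "3.8")
same_b = req_id({"pandas": ">=1.0"}, "3.8")
different = req_id({"pandas": ">=1.1"}, "3.8")

# Two resolutions of the same requirements made at different times
# (the version numbers here are made up for the example).
resolved_earlier = full_id([("pandas", "1.4.0"), ("numpy", "1.22.0")])
resolved_later = full_id([("pandas", "1.5.3"), ("numpy", "1.24.0")])
```

In this sketch, same_a equals same_b, while resolved_earlier and resolved_later differ even though both would belong to the same req_id.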

Default Environments

For each req_id, Metaflow maintains a default environment:
The default environment is the latest locally resolved environment for a given req_id. When you run a flow, Metaflow uses the default environment unless you force re-resolution.
Default environments prevent unnecessary re-resolution. If you run the same flow twice, the second run uses the same environment as the first, ensuring reproducibility.
A new default is created when:
  • You force re-resolution with --force
  • You resolve on a new machine that doesn’t have the environment locally
  • The requirement ID changes (different dependencies)
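The default-environment rule above can be modeled as a small lookup table. The class and method names below (EnvironmentRegistry, get_or_resolve) are hypothetical and exist only for this sketch; Metaflow's own bookkeeping is more involved.

```python
from collections import defaultdict


class EnvironmentRegistry:
    """Hypothetical local registry: req_id -> full_ids in resolution order."""

    def __init__(self):
        self._resolved = defaultdict(list)

    def record(self, req_id, full_id):
        """Record a resolution; the most recent one becomes the default."""
        self._resolved[req_id].append(full_id)

    def default(self, req_id):
        """Return the latest locally resolved full_id, or None if unknown."""
        known = self._resolved.get(req_id)
        return known[-1] if known else None

    def get_or_resolve(self, req_id, resolve, force=False):
        """Reuse the default unless forced or nothing is known locally."""
        current = self.default(req_id)
        if current is not None and not force:
            return current
        new_full_id = resolve()
        self.record(req_id, new_full_id)
        return new_full_id


registry = EnvironmentRegistry()
calls = []


def fake_resolve():
    """Stand-in for a real resolver; returns a new fake full_id on each call."""
    calls.append(1)
    return f"full-{len(calls)}"


first = registry.get_or_resolve("req-a", fake_resolve)               # resolves
second = registry.get_or_resolve("req-a", fake_resolve)              # reuses the default
forced = registry.get_or_resolve("req-a", fake_resolve, force=True)  # re-resolves
```

The second call returns the same value as the first without invoking the resolver, which mirrors the reproducibility guarantee described above; the forced call creates a new default.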

Forcing Re-Resolution

By default, Metaflow reuses previously resolved environments. To force re-resolution:

Using Flow Command

python myflow.py --environment=conda environment resolve --force
Output:
Metaflow 2.9.12+netflix-ext(1.0.0) executing CondaTestFlow for user:romain
    Resolving 2 environments ... done in 27 seconds.
### Environment for step start ###
DEFAULT Environment of type conda-only full hash 42a4ed94b63f12e1fe9dd29de21bf9ec6e271b1c:a3b104c4ce2215351a2b94076ef7827de3ad890a
Arch linux-64
Available on linux-64

Resolved on 2023-09-06 07:11:05.629324
Resolved by romain

Locally present as metaflow_42a4ed94b63f12e1fe9dd29de21bf9ec6e271b1c_a3b104c4ce2215351a2b94076ef7827de3ad890a

User-requested packages conda::boto3==>=1.14.0, conda::pandas==1.4.0,>=0.24.0, ...

Using Metaflow Command

metaflow environment resolve --python ">=3.8,<3.9" -r requirements.txt --force
Forcing re-resolution can result in different package versions being selected, potentially breaking reproducibility. Only use --force when you explicitly want to pick up newer package versions.

Resolution Process

1. Parse Requirements
   Metaflow parses your decorators or requirements files to extract:
   • Package names and versions
   • Python version
   • Channels or sources
2. Compute Requirement ID
   A hash is computed from the normalized requirements.
3. Check for Existing Environment
   Metaflow checks if a default environment exists for this req_id:
   • If yes and not forced: Use existing environment
   • If no or forced: Proceed to resolution
4. Resolve Dependencies
   Metaflow uses the appropriate resolver:
   • Pure Conda: mamba, micromamba, or conda
   • Pure PyPI: pip or uv
   • Mixed: conda-lock with poetry
5. Generate Full ID
   All resolved packages are hashed to create the full_id.
6. Cache Resolution
   The resolved environment description is saved:
   • Locally: In conda_v2.cnd file
   • Remotely: In S3/Azure/GCS
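The resolver choice in the Resolve Dependencies step above can be written as a small lookup. The function names and the table are assumptions for illustration; the environment type labels (conda-only, pypi-only, mixed) are the ones Metaflow prints, but the treatment of an empty package list as conda-only is a guess made for this sketch.

```python
# Candidate resolvers per environment type, as listed in the steps above.
RESOLVERS = {
    "conda-only": ("mamba", "micromamba", "conda"),
    "pypi-only": ("pip", "uv"),
    "mixed": ("conda-lock",),  # used together with poetry
}


def classify_environment(conda_packages, pypi_packages):
    """Classify an environment by which package sources it requests."""
    if conda_packages and pypi_packages:
        return "mixed"
    if pypi_packages:
        return "pypi-only"
    # Assumption: with no PyPI packages, treat the environment as conda-only.
    return "conda-only"


def candidate_resolvers(conda_packages, pypi_packages):
    """Return the resolvers that could handle the given requirements."""
    return RESOLVERS[classify_environment(conda_packages, pypi_packages)]


print(candidate_resolvers(["pandas"], []))
print(candidate_resolvers([], ["requests"]))
print(candidate_resolvers(["pandas"], ["requests"]))
```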

Resolution Across Architectures

You can resolve environments for multiple architectures simultaneously:
metaflow environment resolve \
  --arch linux-64 \
  --arch osx-arm64 \
  --python ">=3.8,<3.9" \
  -r requirements.txt

This allows you to:
  • Develop on Mac (osx-arm64)
  • Deploy to Linux servers (linux-64)
  • Ensure environment consistency across platforms
Supported architectures: linux-64, osx-64, osx-arm64
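If you drive resolution from a script, the command above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of Metaflow; it only uses the flags shown in this guide (--arch, --python, -r, --force) and rejects architectures outside the supported list.

```python
import shlex

SUPPORTED_ARCHS = ("linux-64", "osx-64", "osx-arm64")


def build_resolve_command(archs, python, requirements_file, force=False):
    """Build the argv list for `metaflow environment resolve` (hypothetical helper)."""
    unknown = [arch for arch in archs if arch not in SUPPORTED_ARCHS]
    if unknown:
        raise ValueError(f"Unsupported architectures: {', '.join(unknown)}")
    command = ["metaflow", "environment", "resolve"]
    for arch in archs:
        command += ["--arch", arch]
    command += ["--python", python, "-r", requirements_file]
    if force:
        command.append("--force")
    return command


command = build_resolve_command(
    ["linux-64", "osx-arm64"], ">=3.8,<3.9", "requirements.txt"
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

The returned list can be passed directly to subprocess.run without going through a shell, which avoids quoting problems with the Python version constraint.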

Using Remote Resolved Environments

By default, Metaflow only uses locally resolved environments. You can configure it to check for remotely resolved environments:

Configuration Options

Set METAFLOW_CONDA_USE_REMOTE_LATEST to control this behavior. The default value, ":none:", never uses remote environments and always re-resolves when an environment is not known locally:
export METAFLOW_CONDA_USE_REMOTE_LATEST=":none:"
    

Resolution Restrictions

Be aware of these limitations when resolving environments:

PyPI Packages

Cross-architecture limitations: If a PyPI package is not available as a wheel, you cannot resolve across architectures. For example, resolving on Mac (osx-arm64) for Linux (linux-64) will fail if packages need to be built from source.
Packages from:
  • Git repositories
  • Local directories
  • Source-only distributions
require building on the target architecture.

Mixed Mode

No source builds: In mixed Conda+PyPI mode, non-wheel packages are not supported. This includes:
  • Git repositories
  • Local directories
  • Packages requiring compilation

Environment Markers

Environment markers in requirements files are not supported:
# NOT SUPPORTED:
pandas>=1.0; python_version >= "3.8"
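A quick pre-flight scan of a requirements file can catch some of these restrictions before resolution starts. The checker below is a rough heuristic written for this guide, not a Metaflow feature: the function names are hypothetical, it recognizes only Git URLs, obvious local paths, and environment markers, and it cannot tell whether a package is source-only.

```python
import re


def check_requirement_line(line):
    """Return a reason if the line hits a restriction described above, else None."""
    # Drop full-line comments and trailing comments preceded by whitespace.
    spec = re.sub(r"(^|\s+)#.*$", "", line).strip()
    if not spec:
        return None
    if "git+" in spec:
        return "git repository"
    if spec.startswith(("./", "../", "/", "file:")):
        return "local directory"
    if ";" in spec:
        return "environment marker"
    return None


def check_requirements(text):
    """Return (line_number, line, reason) for every problematic line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        reason = check_requirement_line(line)
        if reason:
            findings.append((number, line.strip(), reason))
    return findings


# The URL and path below are placeholders for the example.
sample = """\
pandas>=1.0
# a comment
pandas>=1.0; python_version >= "3.8"
git+https://example.com/org/repo.git#egg=mypkg
./vendor/mypkg
"""

findings = check_requirements(sample)
for number, line, reason in findings:
    print(f"line {number}: {reason}: {line}")
```

Lines 3, 4, and 5 of the sample are flagged; the plain version constraint and the comment pass.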
    

Viewing Resolved Environments

After resolution, inspect what was resolved:

For a Flow

python myflow.py --environment=conda environment resolve

For a Specific Step

metaflow environment show --pathspec MyFlow/123/train

For a Named Environment

metaflow environment show my_team/my_env:v1

Output includes:
  • Environment type (conda-only, pypi-only, mixed)
  • Requirement and full IDs
  • Resolution date and user
  • All installed packages with versions
  • Local presence status

Caching After Resolution

After resolving, Metaflow caches:
1. Environment Description
   The JSON representation of the resolved environment is stored in S3/Azure/GCS.
2. Package Archives
   All .conda or .tar.bz2 files are uploaded to cloud storage.
3. If packages were transmuted (e.g., .tar.bz2 to .conda), links between formats are stored.
4. Aliases
   Any aliases (named environments) are registered and cached.
Caching enables faster environment creation on remote nodes and sharing environments across team members.

Next Steps

  • Named Environments: Create and share named environments
  • CLI Reference: Explore all environment commands
  • Configuration: Configure resolution behavior
  • Troubleshooting: Debug resolution issues
