Mixed environments allow you to combine packages from both Conda and PyPI repositories, giving you access to the full ecosystem of Python packages while leveraging Conda’s ability to manage non-Python dependencies.
## Overview

Mixed mode uses conda-lock (via Poetry) to resolve dependencies from both ecosystems. This is useful when:

- You need packages only available in one ecosystem
- You want Conda's superior handling of complex dependencies (e.g., TensorFlow with CUDA)
- You need non-Python system libraries
- You want reproducible environments across both package managers

Mixed mode is slower than pure PyPI or pure Conda resolution due to the complexity of cross-ecosystem dependency resolution.
## Using Decorators

Combine `@conda` and `@pypi` decorators on the same step:

```python
from metaflow import FlowSpec, step, conda, pypi

class MixedEnvFlow(FlowSpec):

    @conda(libraries={"numpy": "1.21.5"}, python=">=3.8,<3.9")
    @pypi(packages={"tensorflow": "2.7.4"})
    @step
    def start(self):
        import numpy as np
        import tensorflow as tf
        print(f"NumPy {np.__version__} from Conda")
        print(f"TensorFlow {tf.__version__} from PyPI")
        self.next(self.end)

    @step
    def end(self):
        print("Done")

if __name__ == "__main__":
    MixedEnvFlow()
```
## Flow-Level Base Environments

Use `@conda_base` and `@pypi_base` for dependencies shared across all steps:

```python
from metaflow import FlowSpec, step, conda, pypi, conda_base, pypi_base

@conda_base(libraries={"numpy": "1.21.5"}, python=">=3.8,<3.9")
@pypi_base(packages={"requests": "2.28.0"})
class MixedEnvFlow(FlowSpec):

    @pypi(packages={"pandas": "1.5.0"})
    @step
    def start(self):
        # Has numpy (from conda_base), requests (from pypi_base),
        # and pandas (from step decorator)
        import numpy, requests, pandas
        self.next(self.end)

    @conda(libraries={"scipy": "1.9.0"})
    @step
    def end(self):
        # Has numpy, scipy, and requests
        # Step-level overrides flow-level
        import numpy, scipy, requests

if __name__ == "__main__":
    MixedEnvFlow()
```

Step-level decorators override flow-level decorators. For example, specifying `python="3.9"` at the step level overrides the flow-level Python version.
## Using environment.yml

The `environment.yml` format provides a more structured way to specify mixed dependencies:

```yaml
channels:
  - conda-forge
  - defaults
dependencies:
  - pandas>=1.0.0
  - numpy=1.21.5
  - python>=3.8,<3.9
  - pip:
      - tensorflow==2.7.4
      - apache-airflow[aiobotocore]
```

Resolve with:

```shell
metaflow environment resolve -f env_mixed.yml
```
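To see how the two dependency kinds separate, here is an illustrative sketch of the parsed structure (plain dicts stand in for the YAML above; this is not Metaflow's internal representation):

```python
# Parsed form of the environment.yml above, as a plain dict
# (the shape a YAML parser would produce; parsing itself is omitted)
env = {
    "channels": ["conda-forge", "defaults"],
    "dependencies": [
        "pandas>=1.0.0",
        "numpy=1.21.5",
        "python>=3.8,<3.9",
        {"pip": ["tensorflow==2.7.4", "apache-airflow[aiobotocore]"]},
    ],
}

# String entries are Conda specs; the nested dict holds PyPI requirements
conda_deps = [d for d in env["dependencies"] if isinstance(d, str)]
pypi_deps = [p for d in env["dependencies"]
             if isinstance(d, dict) for p in d.get("pip", [])]

print(conda_deps)  # ['pandas>=1.0.0', 'numpy=1.21.5', 'python>=3.8,<3.9']
print(pypi_deps)   # ['tensorflow==2.7.4', 'apache-airflow[aiobotocore]']
```

The Conda specs go to the Conda solver, while everything under `pip:` is handed to the PyPI side of the resolution.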
## Syntax Details

### Channels
Specify Conda channels to search:

```yaml
channels:
  - conda-forge
  - defaults
  - my-company-channel
```

Channels are searched in order, so list them by priority:

```yaml
channels:
  - conda-forge # Highest priority
  - defaults
```
### Conda Dependencies

List Conda packages with version constraints:

```yaml
dependencies:
  - python=3.8
  - numpy=1.21.5
  - pandas>=1.0.0,<2.0.0
  - scikit-learn         # Any version
  - conda-forge::pytorch # From specific channel
```

Version operators:

- `=` or `==` - Exact version
- `>=`, `<=`, `>`, `<` - Comparisons
- `>=1.0,<2.0` - Combined constraints
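These operators behave like ordinary componentwise version comparisons, with commas acting as AND. A minimal illustrative matcher (not conda's real one, which also handles wildcards, build strings, and fuzzy `=` matching):

```python
def _parse(v):
    # Numeric dotted versions only (e.g. "1.21.5"); real conda specs
    # also allow wildcards and build strings, which are ignored here
    return tuple(int(x) for x in v.split("."))

def satisfies(version, constraint):
    # A combined constraint like ">=1.0,<2.0" is a comma-separated AND
    for clause in constraint.split(","):
        for op in ("==", ">=", "<=", "=", ">", "<"):
            if clause.startswith(op):
                bound = _parse(clause[len(op):])
                v = _parse(version)
                ok = {"==": v == bound, "=": v == bound,
                      ">=": v >= bound, "<=": v <= bound,
                      ">": v > bound, "<": v < bound}[op]
                if not ok:
                    return False
                break
    return True

print(satisfies("1.21.5", ">=1.0,<2.0"))  # True
print(satisfies("2.1.0", ">=1.0,<2.0"))   # False
```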
### PyPI Dependencies

Nest PyPI packages under `pip:`:

```yaml
dependencies:
  - python=3.8
  - numpy=1.21.5
  - pip:
      - tensorflow==2.7.4
      - torch==1.12.0
      - apache-airflow[aiobotocore]
```

In mixed mode, PyPI packages cannot be from Git repositories or local directories. Only wheels and source tarballs from PyPI indices are supported.
### PyPI Indices

Add custom PyPI indices:

```yaml
channels:
  - conda-forge
pypi-indices:
  - https://my-company.com/pypi/simple
  - https://my-private-pypi.org/simple
dependencies:
  - python=3.8
  - pip:
      - my-private-package==1.0.0
```
## Using conda-lock

Under the hood, mixed environments use conda-lock with Poetry. The resolution process:

1. **Generate TOML Configuration** - Metaflow converts your requirements into a `pyproject.toml` file that conda-lock understands.
2. **Resolve with conda-lock** - conda-lock calls both Conda and Poetry to resolve the full dependency tree, ensuring compatibility between both ecosystems.
3. **Generate Explicit Specification** - The result is an explicit list of all packages with exact versions and download URLs.
4. **Build Non-Wheel Packages** - Any PyPI source packages are built into wheels and cached for reuse.
## Requirements

To use mixed environments, your environment needs:

```shell
conda install "conda-lock>=2.1.0"
```

conda-lock is not required on remote execution nodes—only on the machine where you resolve environments.
## Restrictions and Limitations

Mixed mode has several important restrictions due to conda-lock limitations:

### Package Source Restrictions

Not supported:

- Git repositories
- Local directories
- Editable packages
- Non-wheel builds

Supported:

- PyPI wheels
- Source tarballs from PyPI
- Packages with extras
- All Conda packages
### Example of Unsupported Packages

```yaml
dependencies:
  - python=3.8
  - numpy=1.21.5
  - pip:
      # ❌ This will FAIL - Git repos not supported in mixed mode
      - my-package @ git+https://github.com/user/repo.git@main
      # ❌ This will FAIL - Local directories not supported
      - local-pkg @ file:///path/to/local/package
      # ✅ This works - regular PyPI package
      - tensorflow==2.7.4
```
If you need Git repositories or local packages, use pure PyPI mode instead (see PyPI Packages).
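A quick pre-check for the unsupported forms above could look like the sketch below (illustrative only; Metaflow's actual validation is more thorough and also rejects editable installs):

```python
def unsupported_in_mixed_mode(requirement: str) -> bool:
    # Direct references to Git repos or local paths are rejected in
    # mixed mode; plain named PyPI requirements pass. This is a sketch,
    # not Metaflow's real validation logic.
    req = requirement.lower()
    return "git+" in req or "file://" in req

reqs = [
    "my-package @ git+https://github.com/user/repo.git@main",
    "local-pkg @ file:///path/to/local/package",
    "tensorflow==2.7.4",
]
for r in reqs:
    status = "unsupported" if unsupported_in_mixed_mode(r) else "ok"
    print(f"{r}: {status}")
```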
### Channel Priority

When using channels with `::` notation or extra channels, Metaflow sets flexible channel priority:

```yaml
channels:
  - conda-forge
  - pytorch
  - defaults
dependencies:
  - python=3.8
  - numpy=1.21.5
  - pytorch::pytorch=1.12.0 # Forces pytorch channel
  - cudatoolkit>=11.0
```

The `::` notation (e.g., `pytorch::pytorch`) forces a package to come from a specific channel, overriding channel priority.
### Virtual Packages

Metaflow automatically includes system virtual packages for Linux:

```yaml
subdirs:
  linux-64:
    packages:
      __glibc: "2.27"
      __cuda: "11.2"
```

You can override these using `--sys-pkg` in `requirements.txt` or the `sys:` section in `environment.yml`.
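For example, an override might look like the following sketch. The exact schema of the `sys:` section is an assumption here; treat the layout as hypothetical and check your Metaflow version's documentation:

```yaml
# Hypothetical override: pin different virtual package versions
sys:
  __glibc: "2.31"
  __cuda: "12.1"
```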
### Resolution Time

Mixed environments are slower to resolve:

- Pure PyPI: ~10-30 seconds
- Pure Conda: ~30-60 seconds
- Mixed: ~60-120 seconds

This is due to conda-lock coordinating between two ecosystems.

### When to Use Mixed

Use mixed mode only when you need it:

- Packages only available in one ecosystem
- Complex system dependencies (CUDA, MKL, etc.)
- Specific compiler-optimized versions

Otherwise, prefer pure PyPI or pure Conda for speed.

### Optimization Tips

- Pin versions exactly when possible
- Minimize the number of packages
- Use named environments to cache resolutions
- Prefer the conda-forge channel for consistency
## Complete Example

Here's a real-world mixed environment for ML workloads:

```yaml
channels:
  - conda-forge
  - defaults
pypi-indices:
  - https://download.pytorch.org/whl/cu116
dependencies:
  - python=3.9
  # Scientific computing from Conda (optimized builds)
  - numpy=1.23.0
  - scipy=1.9.0
  - pandas=1.4.0
  # CUDA from Conda
  - cudatoolkit=11.6
  # Deep learning from PyPI
  - pip:
      - torch==1.12.0+cu116
      - torchvision==0.13.0+cu116
      - transformers==4.20.0
      - wandb==0.13.0
```

Resolve and create an alias:

```shell
metaflow environment resolve \
  -f env_ml_mixed.yml \
  --alias mlp/ml-team/torch-cuda:v1
```
Then use it in your flow:

```python
from metaflow import FlowSpec, step, named_env

class MLFlow(FlowSpec):

    @named_env(name="mlp/ml-team/torch-cuda:v1")
    @step
    def train(self):
        import torch
        import transformers
        print(f"CUDA available: {torch.cuda.is_available()}")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    MLFlow()
```