Overview
Metaflow provides several ways to manage Python dependencies:
- Conda environments for reproducible package sets
- PyPI packages for individual package installation
- Docker images for complete environment control
- Custom images for specialized requirements
Conda Environments
Using @conda
The @conda decorator creates isolated Conda environments for steps:
Conda Base Image
Specify a base Conda environment:
Conda Channels
PyPI Packages
Using @pypi
The @pypi decorator installs packages from PyPI for the step it decorates.
PyPI with Index URLs
Installing from Git
Docker Images
Using @conda_base
Use @conda_base to set flow-wide Conda defaults. The Docker image itself is selected with the image argument of @batch or @kubernetes.
Custom Docker Images
Build a custom Docker image:
UV Package Manager
Metaflow supports the modern UV package manager for faster dependency resolution:
Using @pypi with UV
Benefits of UV
- 10-100x faster than pip for dependency resolution
- Better conflict resolution
- Reproducible installs with lock files
- Compatible with pip workflows
Environment Configuration
requirements.txt
Use a requirements file:
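How the file is wired into a flow is not shown on this page. One possible approach, sketched here under that caveat, is to parse exact pins into the dict that the packages argument of @pypi takes. The helper parse_pinned_requirements is hypothetical and is not part of Metaflow.

```python
def parse_pinned_requirements(text):
    """Turn 'name==version' lines into a {name: version} dict.

    Hypothetical helper: it handles simple exact pins only and
    rejects anything else (ranges, URLs, extras are out of scope).
    """
    packages = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"not an exact pin: {line!r}")
        name, version = line.split("==", 1)
        packages[name.strip()] = version.strip()
    return packages


sample = """
# data stack
pandas==2.1.1
numpy==1.26.4  # pinned for reproducibility
"""
packages = parse_pinned_requirements(sample)
print(packages)
```

The resulting dict could then be passed along as @pypi(packages=packages), assuming the file is read when the flow module is loaded.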
environment.yml
Use a Conda environment file:
Multi-Step Dependencies
Different Dependencies per Step
Cloud Execution
AWS Batch with Dependencies
Kubernetes with Dependencies
Best Practices
Pin package versions
Always specify exact versions for reproducibility:
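As a rough illustration, the hypothetical helper below flags entries in a packages dict that are not exact versions. It is not part of Metaflow, and the regular expression is a simplification that covers common version strings only.

```python
import re

# Matches one exact version such as "2.1.1" or "1.0rc1".
# Ranges like ">=2.0" and empty strings do not match.
EXACT_VERSION = re.compile(r"^\d+(\.\d+)*([a-z]+\d*)?$")


def unpinned_packages(packages):
    """Return the sorted names whose spec is not a single exact version."""
    return sorted(
        name for name, spec in packages.items()
        if not EXACT_VERSION.match(spec)
    )


pinned = {"pandas": "2.1.1", "numpy": "1.26.4"}
loose = {"pandas": ">=2.0", "numpy": "1.26.4", "requests": ""}

print(unpinned_packages(pinned))  # []
print(unpinned_packages(loose))   # ['pandas', 'requests']
```

A check like this could run in continuous integration against the dicts passed to @conda or @pypi, so that a loose specifier is caught before it changes a production environment.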
Use Conda for scientific packages
For scientific packages, Conda often handles complex dependency trees (compiled extensions, native libraries) better than pip:
Test locally first
Test dependency installation locally before cloud execution:
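One way to make a local run more informative, sketched here as an assumption rather than a Metaflow feature, is to compare the versions actually installed in the current interpreter with the pins you expect. The helper check_installed is hypothetical and uses only the standard library.

```python
from importlib import metadata


def check_installed(packages):
    """Return {name: (wanted, found)} for every package that differs.

    `found` is None when the package is not installed at all.
    An empty result means the environment matches the pins.
    """
    report = {}
    for name, wanted in packages.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            report[name] = (wanted, found)
    return report


# Hypothetical package name, used only to show the "missing" case.
print(check_installed({"surely-not-a-real-package-xyz": "1.0"}))
```

Calling such a helper at the top of a step during a local run surfaces mismatches early, before the same flow is sent to AWS Batch or Kubernetes.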
Use UV for faster installs
Enable UV for faster package installation:
Cache Docker images
Use a container registry to cache images:
Common Patterns
ML Training Environment
Data Science Notebook
Bioinformatics Pipeline
Troubleshooting
Package conflicts
Use Conda instead of pip for conflicting packages:
Long installation times
Enable UV or use pre-built Docker images:
Binary compatibility issues
Use Conda for packages with complex binary dependencies:
Related Topics
- Environment Decorator: Setting environment variables
- AWS Batch: Running with custom images on Batch
- Kubernetes: Using custom images on Kubernetes
- Docker Images: Building and using Docker images
