Why Scale?
Data science workflows often need more resources than a laptop can provide:
- Large datasets that don’t fit in local memory
- Compute-intensive operations like training ML models
- Parallel processing across multiple machines
- GPU acceleration for deep learning
- Long-running jobs that need dedicated infrastructure
Scaling Approaches
Metaflow provides multiple ways to scale your workflows:

Remote Execution
Run individual steps on cloud compute while keeping control local
AWS Batch
Execute steps on AWS Batch for scalable, managed compute
Kubernetes
Run steps on Kubernetes clusters for container-based orchestration
Distributed Computing
Coordinate multi-node jobs for parallel and distributed workloads
Key Concepts
Decorators for Compute
Metaflow uses Python decorators to specify compute requirements.

Portable Resource Specs
The @resources decorator lets you specify requirements independently of the compute platform:
Hybrid Execution
You can mix local and remote execution in the same flow.

Platform Support
| Feature | AWS Batch | Kubernetes | Local |
|---|---|---|---|
| CPU control | ✓ | ✓ | ✗ |
| Memory control | ✓ | ✓ | ✗ |
| GPU support | ✓ | ✓ | ✗ |
| Disk size | Limited | ✓ | ✗ |
| Multi-node | ✓ | ✓ | ✓ |
| Auto-scaling | ✓ | ✓ | ✗ |
Getting Started
Configure your environment
Set up AWS credentials or Kubernetes access; see the platform-specific guides under Next Steps.
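For instance, with the Metaflow CLI (the exact prompts and settings depend on your deployment):

```shell
# Interactive wizard that writes AWS settings (Batch queue, S3 datastore, etc.)
metaflow configure aws

# Show which datastore and compute settings are currently active
metaflow status
```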
Best Practices
Start small and scale up
Develop and test locally first. Add compute decorators only to steps that need them. This keeps development fast and costs low.
Use @resources for portability
Specify requirements with @resources instead of platform-specific parameters. This makes it easy to switch between AWS Batch and Kubernetes.

Right-size your resources
Monitor actual usage and adjust CPU, memory, and GPU allocations. Over-provisioning wastes money; under-provisioning causes failures.
Leverage data locality
Keep data close to compute: use S3 with AWS Batch and appropriate object storage with Kubernetes. Metaflow handles artifact data movement between steps automatically.
Next Steps
Remote Execution
Learn about running steps remotely
Resources Decorator
Deep dive into @resources options
AWS Batch
Set up AWS Batch integration
Kubernetes
Configure Kubernetes execution
