Understanding Orchestrators
Orchestrators manage the execution flow of your pipelines. They handle:
- Submitting pipeline runs to execution backends
- Managing step dependencies and execution order
- Providing unique run identifiers for tracking
- Supporting both static and dynamic pipeline execution
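To make "managing step dependencies and execution order" concrete, here is a minimal sketch of how a static DAG can be turned into a valid execution order. It uses Python's standard-library graphlib module, not any ZenML API, and the step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dag = {
    "load_data": set(),
    "preprocess": {"load_data"},
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}


def execution_order(dependencies):
    """Return step names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(dag)
print(order)
```

If the dependencies contain a cycle, static_order() raises graphlib.CycleError, which is one way to reject an invalid pipeline before anything is submitted to the backend.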
BaseOrchestrator Interface
All orchestrators inherit from BaseOrchestrator and must implement the abstract method get_orchestrator_run_id().
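The effect of an abstract method can be illustrated without ZenML installed. The sketch below uses a stand-in base class built on Python's abc module; StandInBaseOrchestrator, IncompleteOrchestrator, and MinimalOrchestrator are hypothetical names for illustration and are not ZenML's actual classes.

```python
from abc import ABC, abstractmethod


class StandInBaseOrchestrator(ABC):
    """Illustrative stand-in; NOT ZenML's actual BaseOrchestrator."""

    @abstractmethod
    def get_orchestrator_run_id(self) -> str:
        """Return an ID shared by every step of one pipeline run."""


class IncompleteOrchestrator(StandInBaseOrchestrator):
    """Forgets to implement the abstract method."""


class MinimalOrchestrator(StandInBaseOrchestrator):
    """Implements the abstract method by returning a backend-provided ID."""

    def __init__(self, backend_job_id: str) -> None:
        self._backend_job_id = backend_job_id

    def get_orchestrator_run_id(self) -> str:
        return self._backend_job_id


# A subclass that skips the abstract method cannot be instantiated.
try:
    IncompleteOrchestrator()
except TypeError as error:
    print(f"Cannot instantiate: {error}")

print(MinimalOrchestrator("job-1234").get_orchestrator_run_id())
```

The point of the pattern is that a missing implementation fails early, at instantiation time, rather than in the middle of a pipeline run.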
Static vs Dynamic Pipelines
ZenML supports two execution modes:
- Static pipelines: The complete DAG (directed acyclic graph) is known before execution begins. Steps are submitted individually to the backend.
- Dynamic pipelines: The DAG can change during execution. An orchestration container runs first to determine which steps to execute.
Implementing Your Orchestrator
Step 1: Create the Configuration Class
Define configuration options for your orchestrator:
Step 2: Implement the Orchestrator Class
For containerized orchestrators (most cloud platforms), inherit from ContainerizedOrchestrator:
Step 3: Create the Flavor Class
The Tricky Part: get_orchestrator_run_id()
This method is where most implementers struggle. Let’s understand why it’s needed and how to implement it correctly.
Why It’s Needed
In static pipelines, steps start executing immediately without a central coordinator. The first step to execute needs to:
- Create a pipeline run in ZenML’s database
- Share the run ID with all other steps
Key Requirements
Same for All Steps
Every step in a run MUST get the exact same ID. Don’t generate a new ID per step.
Unique Per Run
Different pipeline executions MUST get different IDs. Don’t return a fixed string.
From Backend
Use an ID provided by your orchestration backend (job ID, execution ARN, etc.).
Size Limit
Must be under ~250 characters due to database column constraints.
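The four requirements above can be sketched with an environment variable that the launcher sets once per run and every step reads back. This is a minimal illustration, not ZenML's implementation: the variable name MY_ORCHESTRATOR_RUN_ID and the helper build_step_environment are hypothetical, and in a real orchestrator get_orchestrator_run_id() would be a method on your orchestrator class rather than a free function.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable name; pick one that is specific to your orchestrator.
ENV_RUN_ID = "MY_ORCHESTRATOR_RUN_ID"
MAX_RUN_ID_LENGTH = 250  # approximate limit stated above


def build_step_environment(backend_job_id: str) -> Dict[str, str]:
    """Launcher side: environment to inject into every step container."""
    if not backend_job_id:
        raise ValueError("The backend did not provide a job ID.")
    if len(backend_job_id) >= MAX_RUN_ID_LENGTH:
        raise ValueError(
            f"Run ID must be shorter than {MAX_RUN_ID_LENGTH} characters."
        )
    return {ENV_RUN_ID: backend_job_id}


def get_orchestrator_run_id(
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Step side: read the shared ID instead of generating a new one."""
    environ = os.environ if environ is None else environ
    try:
        return environ[ENV_RUN_ID]
    except KeyError:
        raise RuntimeError(
            f"{ENV_RUN_ID} is not set; the launcher must pass it to every step."
        ) from None


# Every step of one run receives the same environment, hence the same ID.
step_env = build_step_environment("job-42")
print(get_orchestrator_run_id(step_env))
```

Because the ID comes from the backend's job identifier and is only read, never generated, inside a step, all steps of a run agree on it and separate runs get distinct values.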
Implementation Patterns
For Kubernetes-based orchestrators:
Handling Resource Settings
Orchestrators should respect resource requirements from step configurations:
Testing Your Orchestrator
Create a simple test pipeline:
Common Pitfalls
Non-Unique IDs
Returning a fixed string or timestamp that could collide across runs
Different IDs Per Step
Generating a new ID for each step instead of sharing one ID across all steps
Missing Environment Variables
Not setting the orchestrator run ID in the environment when launching step containers
Ignoring Resource Settings
Not reading cpu_count, memory, and gpu_count from step configurations
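As an illustration of the last pitfall, the sketch below translates cpu_count, memory, and gpu_count into a Kubernetes-style resources dictionary. It rests on several assumptions: StepResources is a hypothetical stand-in for a step's resource settings, memory is assumed to be a string such as "4GB" or "512MiB", and GPUs are assumed to be exposed under the nvidia.com/gpu resource name. Other backends will need a different mapping.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StepResources:
    """Hypothetical stand-in for the resource settings of one step."""

    cpu_count: Optional[float] = None
    memory: Optional[str] = None  # assumed format: "<integer><unit>", e.g. "4GB"
    gpu_count: Optional[int] = None


# Assumed input units mapped to Kubernetes quantity suffixes.
_MEMORY_SUFFIXES = {
    "KB": "k", "MB": "M", "GB": "G", "TB": "T", "PB": "P",
    "KiB": "Ki", "MiB": "Mi", "GiB": "Gi", "TiB": "Ti", "PiB": "Pi",
}
_MEMORY_PATTERN = re.compile(r"(\d+)([KMGTP]i?B)")


def to_kubernetes_resources(
    resources: StepResources,
) -> Dict[str, Dict[str, str]]:
    """Build the requests/limits mapping for a container spec."""
    requests: Dict[str, str] = {}
    limits: Dict[str, str] = {}

    if resources.cpu_count is not None:
        # "g" formatting renders 2.0 as "2" and 0.5 as "0.5".
        requests["cpu"] = format(resources.cpu_count, "g")

    if resources.memory is not None:
        match = _MEMORY_PATTERN.fullmatch(resources.memory)
        if match is None:
            raise ValueError(f"Unsupported memory value: {resources.memory!r}")
        amount, unit = match.groups()
        requests["memory"] = f"{amount}{_MEMORY_SUFFIXES[unit]}"

    if resources.gpu_count:
        # Extended resources such as GPUs are specified under limits.
        limits["nvidia.com/gpu"] = str(resources.gpu_count)

    result: Dict[str, Dict[str, str]] = {}
    if requests:
        result["requests"] = requests
    if limits:
        result["limits"] = limits
    return result


print(to_kubernetes_resources(StepResources(cpu_count=2, memory="4GB", gpu_count=1)))
```

Returning an empty dictionary when nothing is configured lets the backend fall back to its own defaults instead of receiving zero-valued requests.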
Reference Implementations
Study these real orchestrators for guidance:
- Kubernetes Orchestrator: src/zenml/integrations/kubernetes/orchestrators/kubernetes_orchestrator.py
- AWS SageMaker Orchestrator: src/zenml/integrations/aws/orchestrators/sagemaker_orchestrator.py
- GCP Vertex Orchestrator: src/zenml/integrations/gcp/orchestrators/vertex_orchestrator.py
Next Steps
Resource Configuration
Learn about CPU, memory, and GPU settings
Containerization
Understand Docker image building and management
Dynamic Pipelines
Support pipelines with runtime-determined DAGs
Custom Materializers
Handle custom data types in your pipelines
