Basic Pipeline Structure
A ZenML pipeline is created using the @pipeline decorator.
Creating Your First Pipeline
```python
from typing import Annotated

from zenml import pipeline, step


@step
def load_data() -> Annotated[dict, "dataset"]:
    """Load data from a source."""
    return {"data": [1, 2, 3, 4, 5]}


@step
def process_data(dataset: dict) -> Annotated[list, "processed"]:
    """Process the loaded data."""
    return [x * 2 for x in dataset["data"]]


@step
def save_results(processed: list) -> None:
    """Save the processed results."""
    print(f"Results: {processed}")


@pipeline
def data_processing_pipeline():
    """Pipeline that loads, processes, and saves data."""
    dataset = load_data()
    processed = process_data(dataset)
    save_results(processed)
```
Pipeline Parameters
You can make pipelines configurable by adding parameters.
Pipeline Return Values
Pipelines can return artifacts that are tracked by ZenML.
Advanced Pipeline Patterns
Conditional Execution
Use Python conditionals to control pipeline flow.
Multiple Outputs
Steps can return multiple artifacts.
Nested Pipelines
Compose complex workflows by calling pipelines within pipelines.
Pipeline Configuration
Configure pipeline behavior with settings.
Best Practices
Keep Pipelines Focused
Each pipeline should have a clear, single purpose (training, inference, data processing, etc.)
Use Type Hints
Always use type hints and Annotated for step inputs/outputs to enable proper artifact tracking
Make Pipelines Parameterizable
Use pipeline parameters instead of hardcoding values for flexibility
Document Your Pipelines
Add clear docstrings explaining what the pipeline does and what parameters it accepts
Running Pipelines
There are multiple ways to run a pipeline.
Next Steps
Deploying Pipelines
Learn how to deploy pipelines to production environments
Scheduling Pipelines
Set up automated pipeline runs on a schedule
Writing Steps
Learn how to write effective pipeline steps
Artifact Management
Understand how ZenML tracks and manages artifacts
