Basic Step Structure
A step is created using the @step decorator:
Creating Your First Step
from typing import Annotated

from zenml import step


@step
def load_data(data_path: str) -> Annotated[dict, "dataset"]:
    """Load data from a file.

    Args:
        data_path: Path to the data file

    Returns:
        Loaded dataset as a dictionary
    """
    # Your data loading logic
    data = {"values": [1, 2, 3, 4, 5]}
    return data


@step
def process_data(
    dataset: dict,
    multiplier: int = 2
) -> Annotated[list, "processed_data"]:
    """Process the dataset.

    Args:
        dataset: Input dataset dictionary
        multiplier: Value to multiply each item by

    Returns:
        Processed list of values
    """
    values = dataset["values"]
    processed = [x * multiplier for x in values]
    return processed
Step Inputs and Outputs
Type Annotations
Use Annotated to give artifacts meaningful names:
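A minimal sketch of a named output artifact. The step name compute_mean and the artifact name "mean_value" are illustrative, and the try/except fallback is only a stand-in so the snippet runs without ZenML installed.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def compute_mean(values: list) -> Annotated[float, "mean_value"]:
    """Return the arithmetic mean; the output artifact is named "mean_value"."""
    return sum(values) / len(values)
```

The string inside Annotated becomes the artifact name, so choosing it well makes the output easier to find later.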
Multiple Outputs
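A step can return several named artifacts by annotating a tuple return type, one Annotated entry per artifact. The sketch below assumes that convention; split_data, train_set and test_set are hypothetical names, and the fallback decorator only exists so the snippet runs without ZenML installed.

```python
from typing import Annotated, Tuple

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def split_data(values: list, train_fraction: float = 0.8) -> Tuple[
    Annotated[list, "train_set"],
    Annotated[list, "test_set"],
]:
    """Split values into a leading training part and a trailing test part."""
    cutoff = int(len(values) * train_fraction)
    return values[:cutoff], values[cutoff:]
```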
Return multiple artifacts from a step.
Optional Outputs
Use Optional for conditional outputs:
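One possible shape for a conditional output, assuming ZenML accepts an Optional type wrapped in Annotated. The step name summarize and the artifact name "summary" are illustrative; the fallback decorator is only a stand-in for environments without ZenML.

```python
from typing import Annotated, Optional

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def summarize(values: list) -> Annotated[Optional[dict], "summary"]:
    """Return summary statistics, or None when there is nothing to summarize."""
    if not values:
        return None
    return {"count": len(values), "min": min(values), "max": max(values)}
```

Downstream steps that consume this artifact need to handle the None case explicitly.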
Step Parameters
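Parameters are ordinary function arguments with defaults, which callers can override when they invoke the step. A minimal sketch with hypothetical names (filter_values, threshold, keep_equal); the fallback decorator is only there so the snippet runs without ZenML installed.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def filter_values(
    values: list,
    threshold: float = 0.0,
    keep_equal: bool = True,
) -> Annotated[list, "filtered_values"]:
    """Keep values above the threshold, optionally including equal values."""
    if keep_equal:
        return [v for v in values if v >= threshold]
    return [v for v in values if v > threshold]
```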
Make steps configurable with parameters.
Step Configuration
Resource Settings
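Resource requests can be attached through the settings argument of the decorator. The sketch assumes ResourceSettings from zenml.config with cpu_count and memory fields; the values are arbitrary, and whether they are honored depends on the orchestrator in the active stack. The fallback definitions are stand-ins for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
    from zenml.config import ResourceSettings
except ImportError:
    # Stand-ins so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)

    ResourceSettings = dict


@step(settings={"resources": ResourceSettings(cpu_count=2, memory="4GB")})
def sum_of_squares(values: list) -> Annotated[int, "sum_of_squares"]:
    """Placeholder for a computation that benefits from extra resources."""
    return sum(x * x for x in values)
```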
Specify compute resources for a step.
Disabling Cache
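Caching is typically switched off for steps whose result changes between runs even when the inputs do not. A minimal sketch, assuming the enable_cache argument of the decorator; current_timestamp is a hypothetical step and the fallback decorator is a stand-in for environments without ZenML.

```python
import time
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step(enable_cache=False)
def current_timestamp() -> Annotated[float, "timestamp"]:
    """Return the current time; a cached result would be stale."""
    return time.time()
```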
Disable caching for specific steps.
Step Operators
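A step operator is selected by name in the decorator. In this sketch "my_step_operator" is a hypothetical name; it assumes a step operator registered under that name is part of the active stack. The fallback decorator is only a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


# "my_step_operator" is a hypothetical name; use one registered in your stack.
@step(step_operator="my_step_operator")
def mean_on_remote(values: list) -> Annotated[float, "mean_value"]:
    """The body is unchanged; only the execution environment differs."""
    return sum(values) / len(values)
```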
Run steps on different infrastructure.
Best Practices for Steps
Keep Steps Focused
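As an illustration, cleaning and scaling are kept as two small steps here rather than one combined step, so each can be cached, tested and reused separately. The step and artifact names are hypothetical, and the fallback decorator is a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def drop_missing(values: list) -> Annotated[list, "clean_values"]:
    """Remove None entries and nothing else."""
    return [v for v in values if v is not None]


@step
def scale_to_unit(values: list) -> Annotated[list, "scaled_values"]:
    """Divide by the largest absolute value so results lie in [-1, 1]."""
    peak = max((abs(v) for v in values), default=0)
    return [v / peak for v in values] if peak else list(values)
```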
Each step should have a single, clear purpose.
Use Meaningful Names
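A small contrast between a vague and a descriptive name. The input layout (a dictionary with a "records" list) and all names are assumptions made for the example; the fallback decorator is a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


# Vague: a name like step1 with an artifact called "output" says nothing
# about what the step produces.
# Descriptive: both names state what is extracted.
@step
def extract_customer_ages(customers: dict) -> Annotated[list, "customer_ages"]:
    """Collect the age field from every customer record."""
    return [record["age"] for record in customers["records"]]
```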
Choose descriptive names for steps and artifacts.
Add Comprehensive Docstrings
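A sketch of a docstring in the Args/Returns layout used earlier on this page, extended with a Raises section. The step clip_values is hypothetical, and the fallback decorator is a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def clip_values(
    values: list, lower: float, upper: float
) -> Annotated[list, "clipped_values"]:
    """Clip each value to the closed interval [lower, upper].

    Args:
        values: Numbers to clip
        lower: Smallest value allowed in the result
        upper: Largest value allowed in the result

    Returns:
        A new list with every value inside [lower, upper]

    Raises:
        ValueError: If lower is greater than upper
    """
    if lower > upper:
        raise ValueError("lower must not be greater than upper")
    return [min(max(v, lower), upper) for v in values]
```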
Handle Errors Gracefully
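One way to approach this is to catch only the failures the step can explain, log or re-raise them with context, and let everything else propagate. The sketch below assumes a JSON file holding a list of records; load_json_records is a hypothetical step and the fallback decorator is a stand-in for environments without ZenML.

```python
import json
import logging
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)

logger = logging.getLogger(__name__)


@step
def load_json_records(data_path: str) -> Annotated[list, "records"]:
    """Load a JSON list of records, failing with a clear message."""
    try:
        with open(data_path, encoding="utf-8") as handle:
            records = json.load(handle)
    except FileNotFoundError:
        # Log the path for debugging, then let the step fail.
        logger.error("Data file not found: %s", data_path)
        raise
    except json.JSONDecodeError as error:
        raise ValueError(f"{data_path} is not valid JSON") from error
    if not isinstance(records, list):
        raise ValueError(f"{data_path} must contain a JSON list")
    return records
```

Re-raising keeps the failure visible in the run instead of passing a silent empty result downstream.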
Common Step Patterns
Data Loading Step
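A possible data loading step, assuming the data lives in a CSV file with a header row. It returns a dictionary of columns and leaves values as strings; load_csv and the "dataset" artifact name are illustrative, and the fallback decorator is a stand-in for environments without ZenML.

```python
import csv
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def load_csv(data_path: str) -> Annotated[dict, "dataset"]:
    """Read a CSV file into a dictionary mapping column name to values."""
    with open(data_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError(f"{data_path} contains no data rows")
    # Values are kept as strings; type conversion belongs in a later step.
    return {name: [row[name] for row in rows] for name in rows[0]}
```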
Model Training Step
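A training step takes prepared data and returns a model artifact. To stay dependency-free, this sketch fits a one-variable least-squares line in plain Python and returns it as a dictionary; a real step would more likely return a model object from an ML library. All names are hypothetical, and the fallback decorator is a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def train_linear_model(xs: list, ys: list) -> Annotated[dict, "model"]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("xs and ys need equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    if variance == 0:
        raise ValueError("xs must contain at least two distinct values")
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}
```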
Model Evaluation Step
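An evaluation step takes a model plus held-out data and returns a metric. The sketch assumes the model is a dictionary with "slope" and "intercept" keys (a made-up format for this example) and computes mean squared error; the fallback decorator is a stand-in for environments without ZenML.

```python
from typing import Annotated

try:
    from zenml import step
except ImportError:
    # Stand-in so the sketch runs without ZenML installed.
    def step(func=None, **_options):
        return func if func is not None else (lambda f: f)


@step
def evaluate_model(
    model: dict, xs: list, ys: list
) -> Annotated[float, "mean_squared_error"]:
    """Compute mean squared error of a linear model on held-out data."""
    if not ys or len(xs) != len(ys):
        raise ValueError("xs and ys need equal, non-zero length")
    predictions = [model["slope"] * x + model["intercept"] for x in xs]
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)
```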
Next Steps
Step Context
Access runtime information and metadata within steps
Artifact Management
Learn how ZenML tracks and manages step artifacts
Creating Pipelines
Connect steps together in pipelines
