
Practice Exercises

Complete these hands-on exercises to build end-to-end training and inference pipelines using all three orchestration frameworks.

Prerequisites

Before starting, ensure you have:

Kubernetes Cluster

kind create cluster --name ml-in-production

Environment Variables

export WANDB_PROJECT=your-project
export WANDB_API_KEY=your-api-key
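Missing W&B credentials tend to surface as confusing failures deep inside a pipeline run. A small sanity check run before triggering anything (a hypothetical helper, not part of the course repo) fails fast instead:

```python
import os

def check_env(required=("WANDB_PROJECT", "WANDB_API_KEY")):
    """Return the names of required environment variables that are unset."""
    return [name for name in required if not os.environ.get(name)]

missing = check_env()
print("missing env vars:", missing)
```

Run this at the top of a pipeline launch script and abort if the returned list is non-empty.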

Homework 7: Kubeflow + Airflow Pipelines

Learning Objectives

  • Deploy Kubeflow Pipelines on Kubernetes
  • Write training and inference DAGs for Kubeflow
  • Deploy Airflow with KubernetesPodOperator
  • Implement parallel training/inference pipelines

Reading List

  • Kubeflow Deployment: standalone deployment guide
  • KFP SDK Reference: API documentation
  • KubernetesPodOperator: Airflow Kubernetes integration
  • Pipeline Design Pattern: DoorDash’s modular approach

Task Requirements

Both training and inference pipelines must include at minimum:
Required Steps:
  1. Load Training Data
  2. Train Model
  3. Save Trained Models
Optional Steps:
  • Data preprocessing/augmentation
  • Hyperparameter tuning
  • Model evaluation on validation set
  • Upload metrics to experiment tracker
Deliverables:
  • Trained model artifacts
  • Training metrics logged to W&B
  • Pipeline execution completes successfully
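Whichever framework you use, the three required steps reduce to functions with explicit inputs and outputs. A framework-agnostic sketch with toy stand-ins (the "model" here is just a midpoint threshold, and the file name is illustrative):

```python
import json
import statistics
from pathlib import Path

def load_training_data():
    # Stand-in for reading a real dataset from disk or object storage.
    return [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

def train_model(data):
    # Toy "model": threshold halfway between the two class means.
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return {"threshold": (statistics.mean(zeros) + statistics.mean(ones)) / 2}

def save_model(model, path="model.json"):
    Path(path).write_text(json.dumps(model))
    return path

data = load_training_data()
model = train_model(data)
artifact = save_model(model)
```

Getting this plain-Python version working first makes it much easier to port the same steps into Kubeflow components, Airflow tasks, or Dagster assets.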

Assignments

1. PR1: Kubeflow Deployment README

Write a README with instructions on:
  • Installing Kubeflow Pipelines on Kind cluster
  • Accessing the UI via port-forward
  • Configuring the Python SDK
  • Verifying installation
Acceptance Criteria:
  • README is clear and reproducible
  • Includes troubleshooting common issues
  • Tested on a fresh cluster
2. PR2: Kubeflow Training Pipeline

Implement a Kubeflow training pipeline:
  • Use @dsl.component decorator
  • Define typed Input/Output artifacts
  • Upload model to W&B registry
Acceptance Criteria:
  • Pipeline compiles without errors
  • Runs successfully in Kubeflow UI
  • Produces trained model artifact
  • Training metrics logged
3. PR3: Kubeflow Inference Pipeline

Implement a Kubeflow inference pipeline:
  • Load model from W&B registry
  • Run predictions on test data
  • Save results as Dataset artifact
Acceptance Criteria:
  • Pipeline depends on training pipeline outputs
  • Artifact lineage visible in UI
  • Predictions saved correctly
4. PR4: Airflow Deployment README

Write a README covering:
  • Installing Airflow with Kubernetes provider
  • Creating PersistentVolumes for data sharing
  • Launching Airflow standalone
  • Accessing the web UI
Acceptance Criteria:
  • Instructions work on macOS and Linux
  • Explains AIRFLOW_HOME setup
  • Documents common errors
5. PR5: Airflow Training DAG

Implement an Airflow training DAG:
  • Use KubernetesPodOperator for tasks
  • Mount PersistentVolumes for data sharing
  • Clean up storage before/after runs
Acceptance Criteria:
  • DAG appears in Airflow UI
  • Triggers successfully via CLI or UI
  • Model uploaded to registry
  • Tasks run in correct sequence
6. PR6: Airflow Inference DAG

Implement an Airflow inference DAG:
  • Load data and model in parallel
  • Run inference after both complete
  • Schedule daily at 9 AM UTC
Acceptance Criteria:
  • Parallel task execution works
  • Schedule triggers automatically
  • Predictions saved to storage
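The fan-in shape and the schedule can be sketched as follows, assuming Airflow 2.x (`PythonOperator` with trivial callables stands in here for the `KubernetesPodOperator` tasks the assignment asks for):

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="inference_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="0 9 * * *",  # daily at 9 AM UTC
    catchup=False,
) as dag:
    load_data = PythonOperator(task_id="load_data", python_callable=lambda: "data")
    load_model = PythonOperator(task_id="load_model", python_callable=lambda: "model")
    run_inference = PythonOperator(
        task_id="run_inference", python_callable=lambda: "predictions"
    )

    # Both loads run in parallel; inference starts only after both finish.
    [load_data, load_model] >> run_inference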

Success Criteria

  • 6 PRs merged with passing reviews
  • All pipelines run end-to-end without errors
  • Model training completes and uploads to registry
  • Inference generates predictions using trained models

Homework 8: Dagster

Learning Objectives

  • Implement asset-centric pipelines in Dagster
  • Add data quality checks with asset checks
  • Compare orchestration frameworks
  • Document tradeoffs in design decisions

Reading List

  • Dagster ML Pipelines: orchestrating ML workflows
  • Fine-tuning LLMs: ML pipelines for LLM training
  • Metaflow: alternative framework overview
  • Flyte: another orchestration option

Task Requirements

Required Assets:
  1. load_training_data - Load and preprocess data
  2. trained_model - Train model, return model artifact
  3. model_metrics - Evaluate model on validation set
Required Checks:
  • Data is not empty
  • Model accuracy/metrics exceed threshold
  • Training completed without errors
Deliverables:
  • All assets materialize successfully
  • Asset checks pass (or fail with explanations)
  • Metadata visible in Dagster UI

Assignments

1. Update Design Document

Add a Pipeline Orchestration section to your Google Doc comparing the three frameworks.
For Each Framework:
  • Why did you choose this framework?
  • What are the advantages for your use case?
  • What are the limitations?
  • How does it handle failures?
  • What’s the learning curve?
Comparison Table: Create a table comparing Airflow, Kubeflow, and Dagster on:
  • Ease of use
  • Kubernetes integration
  • Artifact tracking
  • Data quality checks
  • Community support
  • Production readiness
Recommendation: Which framework would you choose for production and why?
2. PR1: Dagster Training Pipeline

Implement Dagster assets for training:
  • Use @asset decorator
  • Add @asset_check for validation
  • Attach metadata with context.add_output_metadata()
  • Optionally use Modal for GPU execution
Acceptance Criteria:
  • Assets materialize in Dagster UI
  • Asset checks run and report status
  • Metadata includes samples, metrics, counts
  • Model uploaded to registry
3. PR2: Dagster Inference Pipeline

Implement Dagster assets for inference:
  • Depend on trained model asset
  • Load model from registry
  • Run batch predictions
  • Add checks for prediction quality
Acceptance Criteria:
  • Asset lineage shows training → inference flow
  • Predictions saved successfully
  • Asset checks validate output
  • Inference metrics logged

Success Criteria

  • 2 PRs merged with passing reviews
  • Pipeline section in design document
  • All assets materialize without errors
  • Asset checks provide useful validation
  • Clear recommendation for production use

Bonus Challenges

Extend pipelines to train multiple models in parallel:
  • Train 3+ models with different hyperparameters
  • Compare metrics in W&B
  • Select best model for inference
  • Implement A/B testing in inference pipeline
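The parallel-training bonus is worth prototyping locally before wiring it into an orchestrator. A sketch with a thread pool and a toy objective standing in for a real training run (the peak at lr = 0.1 is invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def train_with_lr(lr):
    # Toy objective: pretend validation accuracy peaks near lr = 0.1.
    # A real version would launch a training run and log to W&B.
    accuracy = 1.0 - abs(lr - 0.1)
    return {"lr": lr, "accuracy": accuracy}

learning_rates = [0.01, 0.1, 0.5]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(train_with_lr, learning_rates))

# Select the best model for inference.
best = max(results, key=lambda r: r["accuracy"])
```

In the orchestrators this becomes fan-out/fan-in: one task per hyperparameter set, then a selection task that depends on all of them.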
Implement complex scheduling logic:
  • Retrain model weekly
  • Run inference hourly
  • Trigger retraining if inference drift detected
  • Send Slack notifications on failures
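The drift trigger can start as a simple rule evaluated after each inference run. A deliberately crude sketch (real drift detection would use a statistical test such as KS or PSI rather than a mean-shift threshold):

```python
def should_retrain(baseline_mean, recent_predictions, threshold=0.2):
    """Trigger retraining when the mean prediction drifts past a threshold.

    baseline_mean: mean prediction recorded at training time.
    recent_predictions: predictions from the latest inference run.
    """
    recent_mean = sum(recent_predictions) / len(recent_predictions)
    return abs(recent_mean - baseline_mean) > threshold
```

An hourly inference DAG could call this at its final step and conditionally trigger the weekly retraining DAG early.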
Reduce training costs:
  • Use spot instances for training
  • Implement early stopping
  • Cache intermediate results
  • Compare costs across orchestrators
  • Document savings (aim for 50%+ reduction)
Reference: How we Reduced ML Training Costs by 78%
Add data versioning and lineage:
  • Use DVC or Pachyderm for data versioning
  • Track which data version trained each model
  • Enable rollback to previous data/model versions
  • Implement drift detection on training data
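The core idea behind data versioning is a stable content-addressed identifier per dataset, which you record alongside each trained model. A minimal sketch of that idea (DVC and Pachyderm do this at scale, with storage and lineage on top):

```python
import hashlib
from pathlib import Path

def data_version(path):
    """Short content hash of a data file; identical content always
    yields the same ID, so you can record which data trained a model."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest[:12]
```

Logging this ID as run metadata (e.g. in W&B) gives you the model-to-data lineage needed for rollback.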

Additional Resources

  • Why Data Scientists Shouldn't Know K8s: Chip Huyen’s perspective
  • MLOps Orchestration: Made With ML course
  • Awesome Workflow Engines: comprehensive comparison

Tips for Success

1. Start Simple: Begin with minimal pipelines (3-4 steps) before adding complexity.
2. Test Locally First: Run components locally before deploying to Kubernetes.
3. Use Version Control: Commit pipeline code frequently with clear messages.
4. Document Everything: Write READMEs as you go, not at the end.
5. Compare Thoughtfully: In your design doc, provide specific examples rather than generic statements.

Getting Help

If you’re stuck:
  1. Check the framework’s documentation
  2. Search GitHub issues for similar problems
  3. Review example DAGs/pipelines in this module
  4. Ask in course discussion forums
  5. Consult with your peers
Remember: The goal is learning, not perfection. It’s okay if your first pipelines are messy—refactor as you learn!

Submission Checklist

Homework 7:
  • PR1: Kubeflow deployment README
  • PR2: Kubeflow training pipeline code
  • PR3: Kubeflow inference pipeline code
  • PR4: Airflow deployment README
  • PR5: Airflow training DAG
  • PR6: Airflow inference DAG
  • All PRs have passing CI checks
  • All pipelines run successfully
  • Models uploaded to W&B registry
  • Screenshots of UI showing successful runs
Homework 8:
  • PR1: Dagster training pipeline
  • PR2: Dagster inference pipeline
  • Design doc pipeline section completed
  • Comparison table filled out
  • Recommendation documented
  • All asset checks implemented
  • Metadata attached to assets
  • Asset lineage visible in UI

What’s Next?

After completing these exercises, you’ll be ready to:
  • Deploy production ML pipelines
  • Choose appropriate orchestration tools for projects
  • Implement data quality checks and monitoring
  • Scale ML workflows on Kubernetes
  • Compare and evaluate orchestration frameworks

Continue to Module 5

