Overview
This practice module combines all concepts from Module 1 into real-world tasks. You’ll draft an ML system design, containerize applications, build CI/CD pipelines, and deploy to Kubernetes.
These exercises mirror real production scenarios. Take your time, experiment, and don’t hesitate to reference the documentation pages.
Two-Part Structure
Module 1 practice is divided into two major components:
- H1: Initial Design Draft - Plan your ML system architecture
- H2: Infrastructure - Implement containerization and deployment
H1: Initial Design Draft
Before writing any code, design your ML system. This is critical for production systems, where poor architecture leads to technical debt.
Reading List
Core Reading
- Ml-design-docs - Templates and examples
- How to Write Design Docs for Machine Learning Systems - Comprehensive guide
- Design Docs at Google - Industry practices
Technical Debt & Testing
- The ML Test Score: A Rubric for ML Production Readiness
- datascience-fails - Common pitfalls
Task: Write Your Design Document
Create a comprehensive design document for an ML system using the MLOps template. You can use a real system from your work or create a fictional but realistic one.
Your design doc must cover:
- Architecture
- Operations
- Planning
- Assessment
Models in Production
- What models are deployed?
- How do they interact?
- Data flow and dependencies
- Architecture strengths
- Known limitations
- Trade-offs made
Reference Design Example
See this example design document for inspiration.
Acceptance Criteria
- ✅ Approve - Document thoroughly addresses all required sections
- ❌ No approval - Missing critical sections or insufficient detail
H2: Infrastructure
Now implement the infrastructure for your ML system using Docker, Kubernetes, and CI/CD.
Reading List
Docker Fundamentals
- 0 to production-ready: Docker packaging best practices - Video
- Docker and Python: Data Science and ML - Video
- Docker introduction - Tutorial
- Overview of Docker Hub - Registry guide
CI/CD
- Introduction to GitHub Actions
- Course: CI/CD for Machine Learning (GitOps) - Free W&B course
Kubernetes
- Learn Kubernetes Basics - Official tutorial
- Hello Minikube - Quick start
- Kind Quick Start - Local clusters
- Book: Kubernetes in Action - Comprehensive guide
Task Breakdown
Complete these three pull requests to your repository:
PR1: Dockerfile and Registry
Create a Dockerfile for a simple ML application and push to a container registry.
Requirements:
- Write a Dockerfile with a basic web server or ML script
- Push image to GitHub Container Registry (ghcr.io) or Docker Hub
- Tag image appropriately (e.g., v1.0.0, latest)
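For the "ML script" option, the application itself can be very small. The following is a minimal sketch, not a prescribed solution: a standard-library-only script that fits a line by least squares and prints the result as JSON. The file name train.py, the function fit_line, and the toy data are all hypothetical; any script that runs to completion would serve equally well as a container entrypoint.

```python
"""train.py: a toy 'ML script' to package in a container (hypothetical example)."""
import json
import sys


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("all x values are identical; slope is undefined")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def main():
    # Toy data standing in for a real training set (assumption).
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    # Emit a machine-readable result so the container's output is easy to check.
    json.dump({"slope": slope, "intercept": intercept}, sys.stdout)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

Because the script has no third-party dependencies, the image only needs a Python base image and a copy of this one file, which keeps the first Dockerfile short.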
PR2: GitHub Actions CI/CD
Create a GitHub Actions workflow that builds and pushes your Docker image on every PR.
Requirements:
- Workflow triggers on pull requests
- Builds Docker image
- Runs basic tests (if applicable)
- Pushes to container registry
- Displays green checkmark on success
Verification:
- Check the “Actions” tab in GitHub
- Ensure workflow completes successfully
- Verify image appears in GitHub Packages
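One decision the build-and-push step has to encode is which tags an image receives for a given trigger. GitHub Actions exposes the triggering ref and commit through the GITHUB_REF and GITHUB_SHA environment variables; the sketch below maps them to a list of tags. The helper name image_tags and the tagging policy itself (release tags also receive latest, pull requests receive pr-<number>) are assumptions made for illustration, not something this module prescribes.

```python
"""Derive container image tags from the Git ref of a CI run (hypothetical policy)."""
import os
import re

# Release tags such as v1.0.0 (this pattern is an assumption; adjust as needed).
_RELEASE = re.compile(r"^refs/tags/(v\d+\.\d+\.\d+)$")
# For pull_request events, GITHUB_REF has the form refs/pull/<number>/merge.
_PULL = re.compile(r"^refs/pull/(\d+)/merge$")


def image_tags(ref, sha):
    """Return the list of tags to push for a given ref and commit SHA."""
    short_sha = sha[:7]
    release = _RELEASE.match(ref)
    if release:
        # A release build gets its version plus the moving 'latest' tag.
        return [release.group(1), "latest"]
    pull = _PULL.match(ref)
    if pull:
        # PR builds get a tag that cannot be mistaken for a release.
        return [f"pr-{pull.group(1)}", f"sha-{short_sha}"]
    # Any other ref (for example a branch push) is tagged by commit only.
    return [f"sha-{short_sha}"]


if __name__ == "__main__":
    # Fall back to placeholder values when run outside GitHub Actions.
    ref = os.environ.get("GITHUB_REF", "refs/heads/main")
    sha = os.environ.get("GITHUB_SHA", "0" * 40)
    print("\n".join(image_tags(ref, sha)))
```

Keeping pull request images away from the latest tag means a PR build cannot overwrite the image that other environments may be pulling.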
PR3: Kubernetes Manifests
Write Kubernetes YAML definitions and test on a local cluster.
Requirements:
Create manifests for:
- Pod: Single container instance
- Deployment: Replicated application
- Service: Network access to deployment
- Job: One-time batch task
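kubectl accepts JSON as well as YAML, so the shape of these manifests can be sketched with plain dictionaries. The example below builds a Deployment, a Service, and a Job and prints them as a single v1 List. The name ml-app, the image reference, and port 8000 are placeholders, and the Job's command is hypothetical. The detail most worth noticing is that the Deployment's selector, its Pod template labels, and the Service's selector must all agree; otherwise the Service will not find the Pods.

```python
"""Sketch of Deployment, Service, and Job manifests as JSON (names are placeholders)."""
import json

APP = "ml-app"                             # hypothetical application name
IMAGE = "ghcr.io/your-user/ml-app:v1.0.0"  # hypothetical image reference
PORT = 8000                                # assumed container port
LABELS = {"app": APP}


def container(command=None):
    """Container spec shared by the Deployment and the Job."""
    spec = {"name": APP, "image": IMAGE}
    if command:
        spec["command"] = command
    return spec


def deployment(replicas=2):
    app_container = container()
    app_container["ports"] = [{"containerPort": PORT}]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": APP},
        "spec": {
            "replicas": replicas,
            # The selector must match the Pod template labels below.
            "selector": {"matchLabels": dict(LABELS)},
            "template": {
                "metadata": {"labels": dict(LABELS)},
                "spec": {"containers": [app_container]},
            },
        },
    }


def service():
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": APP},
        "spec": {
            # Routes traffic to Pods carrying the same labels as the Deployment.
            "selector": dict(LABELS),
            "ports": [{"port": 80, "targetPort": PORT}],
        },
    }


def job():
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"{APP}-train"},
        "spec": {
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "containers": [container(["python", "train.py"])],
                    # Jobs require restartPolicy Never or OnFailure.
                    "restartPolicy": "Never",
                }
            },
        },
    }


def manifest_list():
    return {"apiVersion": "v1", "kind": "List",
            "items": [deployment(), service(), job()]}


if __name__ == "__main__":
    print(json.dumps(manifest_list(), indent=2))
```

Assuming the script is saved as manifests.py, piping its output into kubectl apply -f - would submit all three resources to the current cluster context. The same structure translates line for line into the YAML files this PR asks for.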
Bonus: Install k9s
Install k9s for a better Kubernetes management experience. Useful commands and keys:
- :pods - View pods
- :deploy - View deployments
- :svc - View services
- :jobs - View jobs
- l - View logs
- d - Describe resource
- ctrl-d - Delete resource
k9s can make Kubernetes debugging considerably faster, because pods, logs, and resource descriptions are reachable with a few keystrokes. Many experienced developers install it early.
Acceptance Criteria
✅ Pass - All three PRs meet requirements:
PR1:
- Dockerfile builds successfully
- Image pushed to registry
- Image runs correctly when pulled
PR2:
- GitHub Actions workflow exists
- Workflow runs on PRs
- All jobs complete successfully (green checkmark)
PR3:
- All four resource types defined (Pod, Deployment, Service, Job)
- Resources deploy to kind/minikube successfully
- Application accessible via Service
Tips and Common Issues
Docker Troubleshooting
- Build Failures
- Registry Auth
- Image Size
Kubernetes Troubleshooting
- Pod Won't Start
- Image Pull Errors
- Service Not Accessible
GitHub Actions Troubleshooting
- Workflow Won't Run
- Permission Denied
- Secrets Not Available
If the workflow won't run:
- Check trigger conditions (branch names, paths)
- Verify YAML syntax (use VS Code extension)
- Look for typos in on: triggers
Submission Checklist
Before marking this module complete, ensure:
- Design document covers all required sections
- Design includes ML Test Score assessment
- Design identifies potential failure modes
- Design connects to business metrics
- PR1: Dockerfile committed and image in registry
- PR2: GitHub Actions workflow runs successfully
- PR3: All Kubernetes manifests deploy successfully
- All three PRs merged to main branch
- CI/CD pipeline shows green status
- k9s tool installed (optional but recommended)
Additional Practice Ideas
Want to go deeper? Try these extensions:
- Multi-environment setup: Create separate namespaces for dev/staging/prod
- Secrets management: Use Kubernetes Secrets for API keys
- Monitoring: Add Prometheus/Grafana for metrics
- GitOps: Implement ArgoCD for declarative deployments
- Helm charts: Package your application as a Helm chart
- Integration tests: Add end-to-end tests to CI/CD
- Canary deployments: Implement gradual rollouts
Resources
All reading materials mentioned above, plus:
- Module 1 Overview - Refresh core concepts
- Docker Documentation - Container deep dive
- Kubernetes Documentation - Orchestration details
- CI/CD Documentation - Pipeline patterns
- Serverless Alternatives - Optional simpler approaches
Getting Help
If you’re stuck:
- Check logs: Most issues reveal themselves in logs
- Search GitHub Issues: Others have likely hit the same problem
- Use k9s: Visual debugging is often faster
- Start simple: Get a basic version working, then add complexity
- Ask for help: Share your error messages and what you’ve tried
Production infrastructure is complex. It’s normal to encounter issues. Each problem you solve teaches you something valuable.
Next Steps
Congratulations on completing Module 1! You now understand:
- How to containerize ML applications
- Kubernetes orchestration fundamentals
- CI/CD automation with GitHub Actions
- Serverless alternatives for simpler deployments
- How to design production ML systems