Module 3: Training Workflows

Overview
This module focuses on building production-ready training workflows with proper project structure, experiment tracking, and configuration management. You’ll learn to implement both classic BERT-based models and modern LLM training pipelines.

What You’ll Learn
- Project Structure: Python packaging, ML project templates, and code organization
- Experiment Tracking: track and manage ML experiments with W&B and other tools
- Model Cards: document models with standardized model cards and reporting
- Classic Training: BERT-based text classification with HuggingFace Transformers
- LLM Training: fine-tune Phi-3 and other generative models with LoRA
- Practice: hands-on exercises and homework assignments
Key Topics
Training Infrastructure
- Python project structure and packaging
- Configuration management with Hydra and JSON configs
- Experiment tracking with Weights & Biases
- Model registry and versioning
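As a sketch of the configuration-management bullet: a training run can be driven by a plain JSON config that is validated on load. The schema and field names below are illustrative, not ones prescribed by the module:

```python
import json

# Hypothetical training config; section and field names are illustrative.
CONFIG = """
{
  "model": {"name": "bert-base-uncased", "num_labels": 2},
  "training": {"lr": 2e-5, "batch_size": 32, "epochs": 3}
}
"""

def load_config(raw: str) -> dict:
    """Parse a JSON config and check that required sections exist."""
    cfg = json.loads(raw)
    for section in ("model", "training"):
        if section not in cfg:
            raise KeyError(f"missing config section: {section}")
    return cfg

cfg = load_config(CONFIG)
print(cfg["training"]["lr"])  # prints 2e-05
```

Hydra builds on the same idea but adds config composition and command-line overrides; the validation step here is the part worth keeping regardless of the tool.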
Classic ML Training
- BERT-based sequence classification
- HuggingFace Transformers integration
- Training metrics and evaluation
- Model card generation
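To make the metrics bullet concrete: HuggingFace’s `Trainer` accepts a `compute_metrics` callback that receives evaluation predictions as a (logits, labels) pair. The version below implements accuracy with the standard library only; treating the input as plain nested lists is a simplification (the real callback receives NumPy arrays):

```python
def compute_metrics(eval_pred):
    """Accuracy from raw logits and integer labels, stdlib only."""
    logits, labels = eval_pred
    # argmax over the class dimension for each example
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(p == y for p, y in zip(preds, labels))
    return {"accuracy": correct / len(labels)}

# Toy batch: two examples, two classes, both predicted correctly
print(compute_metrics(([[0.1, 0.9], [2.0, -1.0]], [1, 0])))  # {'accuracy': 1.0}
```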
Modern LLM Training
- Fine-tuning Phi-3 with LoRA/QLoRA
- Parameter-efficient training techniques
- Instruction-following datasets
- GenAI-specific workflows
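To see why LoRA is "parameter-efficient": instead of updating a full d_out × d_in weight matrix, LoRA trains two low-rank factors of shapes d_out × r and r × d_in, so each adapted matrix contributes r · (d_out + d_in) trainable parameters rather than d_out · d_in. A back-of-the-envelope helper (the layer shapes below are illustrative, not Phi-3’s actual dimensions):

```python
def lora_trainable_params(shapes, r):
    """Trainable params when LoRA of rank r adapts each (d_out, d_in) matrix."""
    return sum(r * (d_out + d_in) for d_out, d_in in shapes)

def full_params(shapes):
    """Params trained if the same matrices were fully fine-tuned."""
    return sum(d_out * d_in for d_out, d_in in shapes)

# e.g. adapting four hypothetical 4096x4096 attention projections with rank 8
shapes = [(4096, 4096)] * 4
lora, full = lora_trainable_params(shapes, r=8), full_params(shapes)
print(f"{lora:,} vs {full:,} ({lora / full:.2%})")  # 262,144 vs 67,108,864 (0.39%)
```

QLoRA keeps the same adapter math but additionally quantizes the frozen base weights, shrinking memory rather than the trainable-parameter count.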
Testing & Quality
- Code testing with pytest
- Data validation with Great Expectations
- Model behavioral testing
- CI/CD integration
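The behavioral-testing bullet can be sketched in pytest style. The predictor here is a stand-in stub (a real suite would load the trained model); the two checks, invariance under cosmetic edits and a directional expectation on clearly opposite inputs, are representative of the perturbation tests meant above:

```python
def predict_label(text: str) -> str:
    """Stub sentiment predictor standing in for a trained model."""
    return "positive" if "good" in text.lower() else "negative"

def test_whitespace_invariance():
    # Invariance test: cosmetic edits must not change the prediction
    base = "This movie was good"
    perturbed = "  This movie was  good  "
    assert predict_label(base) == predict_label(perturbed)

def test_directional_expectation():
    # Directional test: clearly opposite inputs should get different labels
    assert predict_label("good film") != predict_label("terrible film")
```

Run with `pytest` as usual; in CI these tests gate a model the same way unit tests gate code.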
Reference Implementations
This module includes two complete reference implementations: a classic BERT classification pipeline and a modern LLM fine-tuning pipeline.

Resources
- Lightning-Hydra Template: ML project template built on PyTorch Lightning and Hydra
- Experiment Tracking Tools: comparison of 15+ ML experiment tracking platforms
- Model Cards: the original research paper, “Model Cards for Model Reporting”
- Distributed Training: scaling training to multiple GPUs and nodes