
Module 3: Training Workflows


Overview

This module focuses on building production-ready training workflows with proper project structure, experiment tracking, and configuration management. You’ll learn to implement both classic BERT-based models and modern LLM training pipelines.

What You’ll Learn

Project Structure

Python packaging, ML project templates, and code organization

Experiment Tracking

Track and manage ML experiments with W&B and other tools

Model Cards

Document models with standardized model cards and reporting

Classic Training

BERT-based text classification with HuggingFace Transformers

LLM Training

Fine-tune Phi-3 and other generative models with LoRA

Practice

Hands-on exercises and homework assignments

Key Topics

Training Infrastructure

  • Python project structure and packaging
  • Configuration management with Hydra and JSON configs
  • Experiment tracking with Weights & Biases
  • Model registry and versioning

Classic ML Training

  • BERT-based sequence classification
  • HuggingFace Transformers integration
  • Training metrics and evaluation
  • Model card generation
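To make "training metrics and evaluation" concrete, the sketch below computes two metrics commonly logged for sequence classification: accuracy and macro-averaged F1. It is a dependency-free illustration with our own function names, not the module's evaluation code; in practice you would typically reach for `evaluate` or `scikit-learn`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for label in set(y_true) | set(y_pred):
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)


# Toy 3-class evaluation: one mistake out of five predictions.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
```

Macro-F1 matters here because class imbalance can make plain accuracy misleading.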

Modern LLM Training

  • Fine-tuning Phi-3 with LoRA/QLoRA
  • Parameter-efficient training techniques
  • Instruction-following datasets
  • GenAI-specific workflows
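The point of LoRA's parameter efficiency can be shown with simple arithmetic: instead of updating a full d×k weight matrix W, LoRA freezes W and trains a low-rank update ΔW = A·B with A of shape (d, r) and B of shape (r, k), r ≪ min(d, k). The sketch below just counts parameters; the layer shape is illustrative, not taken from the actual Phi-3 config.

```python
def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full fine-tune params, LoRA params) for one d x k weight.

    Full fine-tuning updates all d * k entries of W; LoRA trains only
    A (d x r) and B (r x k) and leaves W frozen.
    """
    full = d * k
    lora = d * r + r * k
    return full, lora


# One square projection matrix with an assumed hidden size of 3072:
full, lora = lora_param_counts(d=3072, k=3072, r=8)
print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {lora / full:.4%}")
```

At rank 8 this trains roughly half a percent of the layer's parameters, which is what makes single-GPU fine-tuning of models like Phi-3 practical; QLoRA adds 4-bit quantization of the frozen weights on top.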

Testing & Quality

  • Code testing with pytest
  • Data validation with Great Expectations
  • Model behavioral testing
  • CI/CD integration
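Model behavioral testing goes beyond aggregate metrics by checking specific behaviors, e.g. minimum-functionality and invariance checks. Below is a pytest-style sketch; `predict_sentiment` is a toy stand-in defined inline so the example is self-contained, whereas a real suite would import the trained model's predict function.

```python
def predict_sentiment(text: str) -> str:
    """Hypothetical stand-in for a trained classifier's predict call."""
    return "positive" if "great" in text.lower() else "negative"


def test_minimum_functionality():
    # MFT: unambiguous inputs must be classified correctly.
    assert predict_sentiment("This movie is great") == "positive"
    assert predict_sentiment("Terrible, boring plot") == "negative"


def test_invariance_to_casing():
    # INV: a trivial perturbation should not flip the prediction.
    assert predict_sentiment("this movie is GREAT") == predict_sentiment(
        "This movie is great"
    )
```

Because these are plain `test_*` functions, pytest collects them automatically, so the same suite runs locally and in CI.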

Reference Implementations

This module includes two complete reference implementations:
# BERT-based text classification
cd module-3/classic-example
make build
make test
python classic_example/cli.py train ./conf/example.json

Resources

Lightning-Hydra Template

ML project template with PyTorch Lightning and Hydra

Experiment Tracking Tools

Compare 15+ ML experiment tracking platforms

Model Cards

Original research paper on Model Cards for Model Reporting

Distributed Training

Scale training to multiple GPUs and nodes

Next Steps

1. Understand Project Structure: Learn how to organize ML projects and Python packages

2. Set Up Experiment Tracking: Configure W&B or Neptune for experiment logging

3. Train Classic Models: Run BERT-based training with the classic example

4. Fine-Tune LLMs: Train Phi-3 or similar models with LoRA

5. Complete Practice Tasks: Implement training workflows for your project
