Python Environment
Requirements
All modules in this course require Python 3.10 or higher; some modules may need Python 3.12+ for specific features. Verify your Python version before starting.
Virtual Environment Setup
Each module is self-contained and should use its own virtual environment to avoid dependency conflicts.
- Using venv (Standard)
- Using uv (Recommended)
- Using conda
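With the standard venv module, per-module setup looks like this (uv and conda offer equivalent commands):

```shell
# Create and activate a per-module environment with the standard venv module
python3 -m venv .venv
source .venv/bin/activate
python -m pip --version   # pip now resolves inside .venv

# With uv instead (assumes uv is installed): uv venv && source .venv/bin/activate
```

Deactivate with `deactivate` before moving to another module's environment.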
Core Dependencies
Module 3: Training Workflows
The classic example module requires transformer libraries and experiment tracking tools, listed in requirements.txt.
What each package does
- transformers: Hugging Face library for BERT, GPT, and other transformer models
- datasets: Easy loading and processing of ML datasets
- accelerate: Simplifies distributed training across GPUs
- typer: Build CLI applications with type hints
- wandb: Experiment tracking and model registry
- ruff: Fast Python linter and formatter (replaces flake8, black, isort)
- great-expectations: Data validation and testing
- pytest-cov: Code coverage reporting for tests
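Together these correspond to a requirements.txt along these lines (unpinned here; the course may pin exact versions you should prefer):

```text
transformers
datasets
accelerate
typer
wandb
ruff
great-expectations
pytest-cov
```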
Module 5: Model Serving
For deploying models as APIs:
- fastapi: Modern web framework for building APIs
- uvicorn: ASGI server for running FastAPI apps
- pydantic: Data validation using Python type hints
- streamlit: Build interactive web UIs for models
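As a sketch of how these fit together (a hypothetical app.py, not the course's actual serving code):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # Placeholder for real model inference
    return {"label": "positive", "input_length": len(req.text)}
```

Run it with uvicorn app:app --reload; a Streamlit UI can then call the /predict endpoint.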
Development Tools
Code Formatting and Linting
This repository uses Ruff for code quality; it is 10-100x faster than the traditional flake8/black/isort stack. Lint with "ruff check ." and format with "ruff format .".
Testing
Run tests with pytest; the pytest-cov plugin from requirements.txt adds coverage reporting (pytest --cov).
Experiment Tracking
Weights & Biases Setup
Most training examples integrate with W&B for tracking experiments, metrics, and model artifacts.
Create W&B account
Sign up for free at wandb.ai
Get your API key
Visit wandb.ai/authorize to get your API key
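Then authenticate the CLI from the environment where wandb is installed:

```shell
pip install wandb
wandb login    # paste the API key when prompted
```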
W&B is optional for local development but highly recommended for tracking experiments across multiple runs.
Container Tools (Module 1)
Docker Installation
Required for the containerization and Kubernetes modules.
- macOS
- Linux
- Windows
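After installing on any platform, a quick smoke test (requires the Docker daemon to be running):

```shell
docker --version
docker run --rm hello-world
```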
Kubernetes Local Setup
For running Kubernetes examples locally, install kind (Kubernetes in Docker), e.g. with "go install sigs.k8s.io/kind@latest" or a release binary, then create a local cluster with "kind create cluster".
Serverless Platforms (Optional)
Modal Setup
Modal provides serverless GPU compute for training and inference. It offers generous free-tier credits for experimentation, making it well suited to GPU-intensive training jobs without local hardware.
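Getting started is two commands (modal setup opens a browser window to authenticate your account):

```shell
pip install modal
modal setup
```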
GPU Setup (Optional)
CUDA for Local GPU Training
If you have an NVIDIA GPU and want to train locally, install PyTorch built with CUDA support.
Visit pytorch.org for the correct command, or use:
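An example for CUDA 12.1 wheels; match the cuXXX tag to the CUDA version your driver supports:

```shell
pip install torch --index-url https://download.pytorch.org/whl/cu121
python -c "import torch; print(torch.cuda.is_available())"   # True if the GPU is visible
```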
Editor Setup
VS Code (Recommended)
Install these extensions for the best experience:
- Python (ms-python.python) - IntelliSense, debugging, linting
- Ruff (charliermarsh.ruff) - Fast linting and formatting
- Jupyter (ms-toolsai.jupyter) - Notebook support
- Docker (ms-azuretools.vscode-docker) - Container management
- YAML (redhat.vscode-yaml) - Kubernetes YAML validation
settings.json
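A minimal workspace settings.json wiring Ruff in as the Python formatter (a sketch; adjust the interpreter path to your venv location):

```json
{
  "python.defaultInterpreterPath": ".venv/bin/python",
  "[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.formatOnSave": true
  }
}
```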
Verify Your Setup
Run this checklist to ensure everything is configured correctly.
Common Issues
ImportError: No module named 'transformers'
Make sure your virtual environment is activated:
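Typically fixed by reactivating the module's environment and reinstalling its dependencies:

```shell
source .venv/bin/activate
pip install -r requirements.txt
```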
CUDA out of memory
Reduce batch size or maximum sequence length in your training config:
conf/example.json
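For instance, lowering values like these (field names are illustrative; use your config's actual keys):

```json
{
  "batch_size": 8,
  "max_seq_length": 256
}
```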
Docker permission denied
Add your user to the docker group:
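The standard Linux fix:

```shell
sudo usermod -aG docker $USER
newgrp docker    # or log out and back in for the group change to apply
```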
Port 8080 already in use
Find and kill the process using the port:
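Assuming lsof is available:

```shell
# Identify the process bound to port 8080 and stop it
kill "$(lsof -ti :8080)" 2>/dev/null || echo "nothing to kill on port 8080"
# Alternatively, start your server on a different port, e.g. uvicorn app:app --port 8081
```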
Next Steps
Quickstart Guide
Train and deploy your first model in 10 minutes
Module 1: Infrastructure
Start with containerization and Kubernetes basics
Module 3: Training
Learn training workflows and experiment tracking
Browse All Modules
Explore all 8 course modules