This guide covers everything you need to set up a robust development environment for machine learning in production.

Python Environment

Requirements

All modules in this course require Python 3.10 or higher. Some modules may require Python 3.12+ for specific features.
1. Verify your Python version

python3 --version
You should see Python 3.10.0 or higher. If not, download the latest Python.
2. Install uv (recommended)

For modern Python package management, install uv - a fast Python package installer:
curl -LsSf https://astral.sh/uv/install.sh | sh
Alternatively, use pip:
pip install uv
uv is significantly faster than pip and handles dependency resolution more reliably. Several modules (like module-1) use uv by default.

Virtual Environment Setup

Each module is self-contained and should use its own virtual environment to avoid dependency conflicts.
# Create virtual environment
python3 -m venv venv

# Activate on Linux/macOS
source venv/bin/activate

# Activate on Windows
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
Always activate your virtual environment before installing packages or running code. This prevents system-wide package conflicts.
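If you are unsure whether a virtual environment is active, the interpreter can tell you: inside a venv, sys.prefix differs from sys.base_prefix. A quick stdlib check:

```python
import sys

def in_virtualenv() -> bool:
    """A venv swaps sys.prefix, while sys.base_prefix keeps pointing at the base install."""
    return sys.prefix != sys.base_prefix

print(f"Virtual environment active: {in_virtualenv()}")
```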

Core Dependencies

Module 3: Training Workflows

The classic example module requires transformer libraries and experiment tracking tools:
requirements.txt
transformers==4.42.3
datasets==2.15.0
accelerate==0.32.1
typer==0.6.1
wandb==0.17.4
ruff==0.5.0
great-expectations==0.15.25
pytest-cov==3.0.0
Install with:
cd module-3/classic-example
pip install -r requirements.txt
  • transformers: Hugging Face library for BERT, GPT, and other transformer models
  • datasets: Easy loading and processing of ML datasets
  • accelerate: Simplifies distributed training across GPUs
  • typer: Build CLI applications with type hints
  • wandb: Experiment tracking and model registry
  • ruff: Fast Python linter and formatter (replaces flake8, black, isort)
  • great-expectations: Data validation and testing
  • pytest-cov: Code coverage reporting for tests
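Pinned name==version lines like those above are easy to inspect programmatically, for example when auditing what a module expects. A small stdlib sketch:

```python
def parse_requirements(text: str) -> dict[str, str]:
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything not strictly pinned.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name] = version
    return pins

reqs = """\
transformers==4.42.3
datasets==2.15.0
accelerate==0.32.1
"""
print(parse_requirements(reqs))
# {'transformers': '4.42.3', 'datasets': '2.15.0', 'accelerate': '0.32.1'}
```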

Module 5: Model Serving

For deploying models as APIs:
cd module-5
pip install -r requirements.txt
Key dependencies include:
  • fastapi - Modern web framework for building APIs
  • uvicorn - ASGI server for running FastAPI apps
  • pydantic - Data validation using Python type hints
  • streamlit - Build interactive web UIs for models
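The core pattern here is pydantic generating request validation from type hints. As a rough stdlib illustration of the idea only (this is not pydantic's actual API; real pydantic also handles coercion, nested models, and much more):

```python
from dataclasses import dataclass
import typing

def validate(obj) -> None:
    """Check each field's runtime type against its annotation.
    Very rough sketch of what pydantic automates."""
    for name, hint in typing.get_type_hints(type(obj)).items():
        value = getattr(obj, name)
        if not isinstance(value, hint):
            raise TypeError(f"{name}: expected {hint.__name__}, got {type(value).__name__}")

@dataclass
class PredictionRequest:
    """Hypothetical request model for a model-serving endpoint."""
    text: str
    max_tokens: int

validate(PredictionRequest(text="hello", max_tokens=32))  # passes silently
```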

Development Tools

Code Formatting and Linting

This repository uses Ruff for code quality. It’s 10-100x faster than traditional tools.
ruff check .   # lint
ruff format .  # format
Configure your editor to run ruff format on save for automatic formatting.

Testing

Run tests using pytest:
pytest
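pytest discovers files named test_*.py and functions named test_*. A minimal example (normalize is a hypothetical stand-in for code under test):

```python
# test_example.py -- pytest collects this file and runs each test_* function.
def normalize(text: str) -> str:
    """Toy function under test: trim surrounding whitespace and lowercase."""
    return text.strip().lower()

def test_normalize_strips_and_lowercases():
    assert normalize("  Hello World  ") == "hello world"

def test_normalize_empty_string():
    assert normalize("") == ""
```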

Experiment Tracking

Weights & Biases Setup

Most training examples integrate with W&B for tracking experiments, metrics, and model artifacts.
1. Create a W&B account

Sign up for free at wandb.ai
2. Get your API key

Visit wandb.ai/authorize to get your API key
3. Set environment variables

export WANDB_PROJECT=ml-in-production-practice
export WANDB_API_KEY=your_api_key_here
Or create a .env file:
.env
WANDB_PROJECT=ml-in-production-practice
WANDB_API_KEY=your_api_key_here
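Note that the shell does not read .env files by itself; tools like python-dotenv do, or you can load one manually. A minimal stdlib sketch (python-dotenv additionally handles quoting, export prefixes, and interpolation):

```python
import os

def load_dotenv(path: str = ".env") -> None:
    """Parse KEY=VALUE lines into os.environ, skipping blanks and comments.
    Existing environment variables are not overwritten."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Usage: load_dotenv()  # afterwards os.environ has WANDB_PROJECT, WANDB_API_KEY, ...
```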
4. Verify the connection

python -c "import wandb; wandb.login()"
W&B is optional for local development but highly recommended for tracking experiments across multiple runs.

Container Tools (Module 1)

Docker Installation

Required for containerization and Kubernetes modules.
Download and install Docker Desktop for Mac (or the build for your platform), then verify the installation:
docker --version
docker run hello-world

Kubernetes Local Setup

For running Kubernetes examples locally, install kind (Kubernetes in Docker):
brew install kind
brew install kubectl
Create a local cluster:
kind create cluster --name ml-in-production
kubectl config get-contexts
Install k9s for a better Kubernetes CLI experience:
brew install derailed/k9s/k9s  # macOS
# or download from https://k9scli.io/
Run with: k9s -A

Serverless Platforms (Optional)

Modal provides serverless GPU compute for training and inference.
1. Install Modal

pip install modal
# or with uv:
uv pip install modal
2. Authenticate

modal token new
This opens a browser window for authentication.
3. Test your setup

cd module-1/modal-examples
modal run modal_hello_world.py
Modal offers generous free tier credits for experimentation. Perfect for running GPU-intensive training jobs without local hardware.

GPU Setup (Optional)

CUDA for Local GPU Training

If you have an NVIDIA GPU and want to train locally:
1. Check GPU availability

nvidia-smi
2. Install PyTorch with CUDA

Visit pytorch.org for the correct command, or use:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
3. Verify GPU access

import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")

Editor Setup

For VS Code, install these extensions for the best experience:
  • Python (ms-python.python) - IntelliSense, debugging, linting
  • Ruff (charliermarsh.ruff) - Fast linting and formatting
  • Jupyter (ms-toolsai.jupyter) - Notebook support
  • Docker (ms-azuretools.vscode-docker) - Container management
  • YAML (redhat.vscode-yaml) - Kubernetes YAML validation
Recommended settings:
settings.json
{
  "[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll": true,
      "source.organizeImports": true
    }
  },
  "python.testing.pytestEnabled": true
}

Verify Your Setup

Run this checklist to ensure everything is configured correctly:
# Python version
python3 --version  # Should be 3.10+

# Virtual environment
which python  # Should point to venv/bin/python

# Core packages
python -c "import transformers, datasets, torch; print('✓ Core packages installed')"

# Docker
docker --version
docker run hello-world

# Kubernetes
kubectl version --client
kind get clusters

# Code quality
ruff --version
pytest --version

# Optional: Modal
modal --version

# Optional: GPU
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
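The same checklist can be scripted so everything is reported in one pass. A minimal sketch using only the standard library (the package list mirrors the modules above):

```python
import importlib.util
import shutil
import sys

def preflight() -> dict[str, bool]:
    """Report which parts of the setup are present, without raising on failures."""
    checks = {
        "python>=3.10": sys.version_info >= (3, 10),
        "venv active": sys.prefix != sys.base_prefix,
        "docker on PATH": shutil.which("docker") is not None,
        "kubectl on PATH": shutil.which("kubectl") is not None,
    }
    for pkg in ("transformers", "datasets", "torch", "wandb"):
        # find_spec checks importability without actually importing heavy packages.
        checks[pkg] = importlib.util.find_spec(pkg) is not None
    return checks

for name, ok in preflight().items():
    print(f"{'✓' if ok else '✗'} {name}")
```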

Common Issues

Import errors or missing packages? Make sure your virtual environment is activated:
which python  # Should show venv path
source venv/bin/activate
pip install -r requirements.txt
Out-of-memory errors during training? Reduce the batch size or maximum sequence length in your training config, for example per_device_train_batch_size from 16 to 8 and max_seq_length from 512 to 128:
conf/example.json
{
  "per_device_train_batch_size": 8,
  "max_seq_length": 128
}
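A smaller per-device batch shrinks the effective batch size; gradient accumulation (the gradient_accumulation_steps setting in transformers' TrainingArguments) can restore it. The arithmetic, as a quick sketch:

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Global batch size seen by the optimizer per update step."""
    return per_device * num_devices * grad_accum_steps

# Halving the per-device batch while doubling accumulation keeps the effective size:
print(effective_batch_size(16, 1, 1))  # 16
print(effective_batch_size(8, 1, 2))   # 16
```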
Permission denied when running docker on Linux? Add your user to the docker group:
sudo usermod -aG docker $USER
newgrp docker  # Or log out and back in
Port already in use? Find and kill the process using the port:
# macOS/Linux
lsof -ti:8080 | xargs kill -9

# Windows
netstat -ano | findstr :8080
taskkill /PID <PID> /F
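Before killing anything, you can confirm that the port is actually occupied. A small stdlib check (8080 here is just the example port from above):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is listening on host:port (attempts a TCP connect)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0

print(port_in_use(8080))
```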

Next Steps

Quickstart Guide

Train and deploy your first model in 10 minutes

Module 1: Infrastructure

Start with containerization and Kubernetes basics

Module 3: Training

Learn training workflows and experiment tracking

Browse All Modules

Explore all 8 course modules
