
ML in Production Practice

Welcome to the ML in Production Practice documentation. This repository contains hands-on exercises and reference implementations for the ML in Production course.

What You’ll Learn

This course covers the complete lifecycle of taking machine learning models from development to production, including:

Infrastructure & Deployment

Containerization with Docker, Kubernetes orchestration, and CI/CD pipelines

Data Management

Storage systems, data formats, streaming pipelines, and vector databases

Training Workflows

Experiment tracking, configuration management, and distributed training

Model Serving

FastAPI endpoints, inference servers, and LLM deployment with vLLM

Pipeline Orchestration

Airflow, Kubeflow, and Dagster for ML workflows

Optimization

Load testing, autoscaling, and model quantization

Monitoring

Observability, metrics, and data drift detection

Production Patterns

Cloud platforms, buy vs build decisions, and best practices

Course Structure

The course is organized into 8 self-contained modules. You can work through them sequentially or dive into specific topics based on your needs.

Module 1: Infrastructure

Learn containerization with Docker, Kubernetes deployment, CI/CD workflows, and serverless platforms like Modal and Railway.

Module 2: Data Management

Set up MinIO storage, work with streaming datasets, implement vector databases, and configure data labeling with Argilla.

Module 3: Training Workflows

Structure ML projects, track experiments, create model cards, and train both classic BERT models and fine-tune LLMs.

Module 4: Pipeline Orchestration

Build ML pipelines with Airflow, Kubeflow, and Dagster for production workflows.

Module 5: Model Serving

Deploy models with FastAPI, create Streamlit interfaces, use Triton and KServe inference servers, and serve LLMs with vLLM.

Module 6: Optimization

Conduct load testing, implement autoscaling, optimize models with quantization, and set up async inference.

Module 7: Monitoring

Configure observability with SigNoz, set up Grafana dashboards, and monitor data drift with Evidently and Seldon.

Module 8: Cloud Platforms

Evaluate buy vs build decisions and deploy multi-model endpoints on AWS SageMaker.

Prerequisites

  • Python 3.10 or higher
  • Basic understanding of machine learning concepts
  • Familiarity with command line operations
  • Docker installed (for containerization modules)
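Before diving in, you can confirm your interpreter meets the version requirement with a quick stdlib-only check (a minimal sketch; the `check_python` helper is illustrative and not part of the course code):

```python
import sys

# The course assumes Python 3.10 or higher.
REQUIRED = (3, 10)

def check_python(required=REQUIRED):
    """Return True if the running interpreter meets the (major, minor) requirement."""
    return sys.version_info[:2] >= required

if __name__ == "__main__":
    status = "OK" if check_python() else "upgrade needed"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```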

Quick Start

Ready to get started? Head over to the Quickstart guide to set up your environment and run your first example.

Quickstart Guide

Get up and running in minutes

Setup Instructions

Detailed environment setup

Browse Modules

Explore the course modules

GitHub Repository

View the source code

Community & Support

Course Page

Access the full course curriculum

Discord Community

Ask questions and connect with others

Blog

Read additional articles and insights

Report Issues

Found a problem? Let us know

Versioning

The main branch contains the most up-to-date materials. A protected 2024-version branch preserves the 2024 and early-2025 edition of the course for reference.
