Introduction
Docker allows you to package your ML applications with all dependencies into portable containers. This ensures consistency across development, testing, and production environments.
Containers are lightweight, isolated environments that share the host OS kernel, so they typically start faster and use fewer resources than virtual machines.
Getting Started
Hello World
Test your Docker installation by running the official hello-world image with the command docker run hello-world.
Building ML Containers
Module 1 includes two sample applications that demonstrate different containerization patterns.
ML Training Application
The app-ml container simulates a machine learning training job.
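A minimal sketch of building and running such a container follows. The image tag `app-ml` comes from the container name above; the build context path `./app-ml` and the function names `build_ml` and `run_ml` are assumptions for illustration, not names from the module.

```shell
# Assumed names: adjust these to match your project layout.
ML_IMAGE="app-ml"        # image tag, reused as the container name
ML_CONTEXT="./app-ml"    # hypothetical directory containing the Dockerfile

# Build the image from the Dockerfile in the assumed context directory.
build_ml() {
  docker build -t "$ML_IMAGE" "$ML_CONTEXT"
}

# Run the training job in the foreground and remove the container on exit.
run_ml() {
  docker run -it --rm --name "$ML_IMAGE" "$ML_IMAGE"
}
```

Call `build_ml` once after changing the code, then `run_ml` for each training run.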
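When you start the container with docker run, the -it flags provide an interactive terminal (-i keeps STDIN open and -t allocates a pseudo-terminal), --rm automatically removes the container after it exits, and --name assigns a friendly identifier.
Web Application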
The app-web container runs a simple HTTP server for serving models.
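As a rough sketch, a web container like this is usually started in the background with a published port. The port number 8080 and the function names below are assumptions; use whichever port the server in your image actually listens on.

```shell
# Assumed names and port: adjust to match your image.
WEB_IMAGE="app-web"
WEB_PORT=8080    # hypothetical port the HTTP server listens on

# Start the server detached and publish the port on the host.
run_web() {
  docker run --rm -d --name "$WEB_IMAGE" -p "${WEB_PORT}:${WEB_PORT}" "$WEB_IMAGE"
}

# Stop the server; --rm above means the container is removed once it stops.
stop_web() {
  docker stop "$WEB_IMAGE"
}
```

After `run_web`, the server should be reachable on the published host port until `stop_web` is called.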
Multi-Stage Builds
For projects with multiple applications, use multi-stage Dockerfiles to share common dependencies, with one build target per application:
- Web App
- ML App
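One possible layout for such a Dockerfile is sketched below: a shared base stage plus one named target per application. The scratch directory `multistage-demo` and the file names `requirements.txt`, `server.py`, and `train.py` are hypothetical placeholders, not files from the module.

```shell
# Write a demo multi-stage Dockerfile into a scratch directory (assumed layout).
mkdir -p multistage-demo
cat > multistage-demo/Dockerfile <<'EOF'
# Shared base stage: common dependencies are installed once.
FROM python:3.12-slim AS base
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Web application target.
FROM base AS web
COPY app-web/ .
CMD ["python", "server.py"]

# ML training target.
FROM base AS ml
COPY app-ml/ .
CMD ["python", "train.py"]
EOF

# Build one image per target from the same Dockerfile.
build_targets() {
  docker build --target web -t app-web multistage-demo
  docker build --target ml -t app-ml multistage-demo
}
```

The `--target` flag stops the build at the named stage, so each image contains the base layers plus only its own application files.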
Benefits of Multi-Stage Builds
- Shared dependencies: Install common packages once in the base layer
- Smaller images: Only include what’s needed for each target
- Single source: Maintain one Dockerfile for multiple applications
- Consistent environments: Ensure all apps use identical base configurations
Sharing Images
To deploy containers to production, push them to a container registry.
GitHub Container Registry
GitHub Container Registry (ghcr.io) provides free storage for public images and is tightly integrated with GitHub Actions for automated CI/CD workflows.
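A push to ghcr.io generally takes three steps: log in, tag the image with the registry path, and push. The sketch below assumes a personal access token with package write permission stored in the `CR_PAT` environment variable; the owner `your-github-username` and the function names are placeholders.

```shell
# Placeholders (assumptions): replace with your GitHub account and image.
GH_OWNER="your-github-username"
GH_IMAGE="ghcr.io/${GH_OWNER}/app-web:latest"

# Log in with a token read from the CR_PAT environment variable,
# passed on stdin so it does not appear in the shell history.
ghcr_login() {
  echo "$CR_PAT" | docker login ghcr.io -u "$GH_OWNER" --password-stdin
}

# Tag the local image with the registry path, then push it.
ghcr_push() {
  docker tag app-web "$GH_IMAGE"
  docker push "$GH_IMAGE"
}
```

Run `ghcr_login` once per machine or CI job, then `ghcr_push` after each build.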
Container Registry Options
Choose a registry that fits your infrastructure and requirements:

| Registry | Best For | Notes |
|---|---|---|
| GitHub Container Registry | GitHub users, CI/CD integration | Free for public images |
| Docker Hub | Public images, community sharing | Most popular registry |
| Amazon ECR | AWS deployments | Integrates with ECS/EKS |
| Google Container Registry | GCP deployments | Integrates with GKE; Google now directs new projects to Artifact Registry |
Best Practices
Image Optimization
- Use specific base images: python:3.12-slim instead of python:3.12 for smaller images
- Leverage build cache: Order Dockerfile commands from least to most frequently changing
- Multi-stage builds: Separate build and runtime dependencies
- Minimize layers: Combine RUN commands with &&
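The slim base image, cache-friendly ordering, and combined RUN commands from the list above can be seen together in one small Dockerfile. This is a sketch under assumptions: the directory `optimized-demo`, the `curl` package, and the files `requirements.txt` and `main.py` are illustrative only.

```shell
# Write an example Dockerfile into a scratch directory (assumed layout).
mkdir -p optimized-demo
cat > optimized-demo/Dockerfile <<'EOF'
# Slim base image keeps the final image small.
FROM python:3.12-slim
WORKDIR /app

# One RUN layer: install a system package and clean the apt cache in the same step.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Dependencies change rarely, so copy and install them before the source.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Source code changes most often, so it comes last.
COPY . .
CMD ["python", "main.py"]
EOF
```

With this ordering, editing application code invalidates only the final COPY layer, so the dependency installation is served from the build cache.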
Security
- Scan images for vulnerabilities with docker scout cves (the older docker scan command has been retired in favor of Docker Scout)
- Use official base images from trusted sources
- Run containers as non-root users when possible
- Keep base images updated
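Running as a non-root user and keeping the base image current might look like the sketch below. The user name `appuser`, the image tag `app-secure`, the directory `nonroot-demo`, and the file `main.py` are arbitrary choices made for this example.

```shell
# Write an example Dockerfile into a scratch directory (assumed layout).
mkdir -p nonroot-demo
cat > nonroot-demo/Dockerfile <<'EOF'
FROM python:3.12-slim
# Create an unprivileged user (the name "appuser" is an arbitrary choice).
RUN useradd --create-home --shell /usr/sbin/nologin appuser
WORKDIR /home/appuser/app
COPY --chown=appuser:appuser . .
# Later instructions and the running container use this user instead of root.
USER appuser
CMD ["python", "main.py"]
EOF

# --pull re-checks the registry for a newer base image on every build.
build_fresh() {
  docker build --pull -t app-secure nonroot-demo
}
```

Rebuilding periodically with `build_fresh` picks up patched versions of the base image rather than reusing a stale local copy.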
Development Workflow
Debugging Containers
Inspecting Running Containers
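A few standard Docker CLI commands cover most inspection tasks. They are wrapped in small helper functions here; the container name `app-web` and the function names are assumptions for illustration.

```shell
CONTAINER="app-web"    # assumed container name

# List running containers with their names, ports, and status.
list_containers() {
  docker ps
}

# Show the last 50 log lines and keep following new output.
show_logs() {
  docker logs --tail 50 -f "$CONTAINER"
}

# Open an interactive shell inside the running container.
open_shell() {
  docker exec -it "$CONTAINER" sh
}

# Print the full JSON description (config, mounts, network settings).
inspect_container() {
  docker inspect "$CONTAINER"
}
```

`open_shell` uses `sh` because slim images do not always include `bash`.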
Common Issues
- Port Already in Use
- Image Won't Build
- Container Exits Immediately
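The original troubleshooting steps for these issues are not shown here, so the following is only a sketch of common first checks using standard Docker CLI commands. The function names and their arguments are hypothetical.

```shell
# Port already in use: list containers that publish the given port.
who_uses_port() {
  docker ps --filter "publish=$1"
}

# Image won't build: rebuild without cache and with full plain-text output.
# Arguments: image tag, build context directory.
rebuild_verbose() {
  docker build --no-cache --progress=plain -t "$1" "$2"
}

# Container exits immediately: read the logs and exit code of the stopped
# container (only possible if it was started without --rm).
why_exited() {
  docker logs "$1"
  docker inspect --format '{{.State.ExitCode}}' "$1"
}
```

For example, `who_uses_port 8080` shows which container holds port 8080, and `why_exited app-ml` prints the last output and exit code of a stopped app-ml container.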
Resources
Essential Reading
- Docker Curriculum - Comprehensive Docker tutorial
- Docker and Python for Data Science - Best practices video
- 0 to Production-Ready Docker - Production patterns