
Docker Fundamentals

Docker is a platform for developing, shipping, and running applications in containers. Containers package an application with all its dependencies, ensuring consistency across different environments.
How Docker Works

Docker Architecture

Docker has three main components that work together:

Docker Client

The command-line interface that users interact with. The Docker client sends commands to the Docker daemon (dockerd) through the Docker API.

Docker Host

The machine that runs the Docker daemon (dockerd). The daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes.

Docker Registry

A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use, and Docker looks for images there by default.

How Docker Run Works

When you execute docker run, the following sequence occurs:

1. Image Pull

Docker pulls the image from the registry (if it is not already available locally).

2. Container Creation

Docker creates a new container from the image.

3. Filesystem Allocation

Docker allocates a read-write filesystem to the container as its final layer.

4. Network Setup

Docker creates a network interface to connect the container to the default network.

5. Container Start

Docker starts the container and executes the specified command.
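
The sequence above can be modeled in a few lines of Python. This is a toy sketch, not Docker's implementation: `ToyDockerHost` and its `run` method are hypothetical names, and each step is recorded as a string rather than actually performed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDockerHost:
    """Hypothetical stand-in for a Docker host; not a real Docker API."""
    local_images: set = field(default_factory=set)

    def run(self, image: str, command: list) -> list:
        """Return the ordered steps this toy host performs for `docker run`."""
        steps = []
        # Step 1: pull only when the image is not already available locally.
        if image not in self.local_images:
            steps.append(f"pull {image}")
            self.local_images.add(image)
        # Steps 2-5 happen on every run.
        steps.append(f"create container from {image}")
        steps.append("allocate read-write filesystem")
        steps.append("attach network interface on default network")
        steps.append("start container: " + " ".join(command))
        return steps


host = ToyDockerHost()
first = host.run("node:18-alpine", ["npm", "start"])
second = host.run("node:18-alpine", ["npm", "start"])
print(len(first), len(second))  # 5 4
```

The second run skips the pull because the image is already cached on the host, which is why repeated runs of the same image start faster.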

Essential Docker Concepts

8 Must-Know Docker Concepts

Dockerfile

A Dockerfile contains the instructions to build a Docker image: the base image, the dependencies to install, and the command to run.
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]

Docker Image

A lightweight, standalone package that includes everything (code, libraries, and dependencies) needed to run your application. Images are built from a Dockerfile and can be versioned.

Key Characteristics:
  • Immutable and portable
  • Layered architecture for efficiency
  • Can be versioned with tags
  • Stored in registries (Docker Hub, ECR, etc.)

Docker Container

A running instance of a Docker image. Containers are isolated from each other and from the host system, providing a secure and reproducible environment for running your apps.

Key Characteristics:
  • Lightweight and fast to start
  • Isolated process and filesystem
  • Ephemeral by default
  • Can be paused, stopped, and restarted

Docker Registry

A centralized repository for storing and distributing Docker images. Docker Hub is the default public registry, but you can also set up private registries.

Popular Registries:
  • Docker Hub (public)
  • Amazon ECR
  • Google Artifact Registry (successor to Container Registry)
  • Azure Container Registry

Docker Volumes

A way to persist data generated by containers. Volumes live outside the container’s file system and can be shared between multiple containers.

Benefits:
  • Data persists after container removal
  • Can be shared between containers
  • Managed independently of containers
  • Better performance than bind mounts on Docker Desktop (macOS and Windows hosts)

Docker Compose

A tool for defining and running multi-container Docker applications, making it easy to manage the entire stack from a single file.
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: secret

Docker Networks

Docker networks let containers communicate with each other and with the host system. Custom networks can isolate containers or enable selective communication.

Network Types:
  • Bridge (default)
  • Host
  • Overlay (for Swarm)
  • None

Docker CLI

The primary way to interact with Docker, providing commands for building images, running containers, managing volumes, and performing other operations.

Docker Best Practices

9 Docker Best Practices

1. Use Official Images

Official images are maintained and regularly updated by trusted publishers, which lowers the risk of building on a vulnerable or tampered base image.

2. Use Specific Image Versions

The default latest tag is mutable: the image it points to can change at any time, so the same Dockerfile can produce different results over time. Pin a specific version instead.
# Bad
FROM node:latest

# Good
FROM node:18.17.0-alpine
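
A check like this can be automated, for example in a pre-commit or CI script. The following is a minimal sketch that assumes image references of the usual `[registry[:port]/]name[:tag][@digest]` shape; `image_tag` and `is_pinned` are hypothetical helper names, not part of any Docker tooling.

```python
from typing import Optional


def image_tag(reference: str) -> Optional[str]:
    """Return the tag of an image reference, or None if no tag is given."""
    # Drop a digest suffix such as "@sha256:..." before looking for a tag.
    name = reference.split("@", 1)[0]
    # Only a colon in the last path segment separates a tag; a colon earlier
    # in the reference belongs to a registry port such as "localhost:5000".
    last_segment = name.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return None
    return last_segment.rsplit(":", 1)[1]


def is_pinned(reference: str) -> bool:
    """True when the reference uses a digest or a tag other than 'latest'."""
    if "@" in reference:
        return True
    tag = image_tag(reference)
    return tag is not None and tag != "latest"


for ref in ("node:latest", "node", "node:18.17.0-alpine", "localhost:5000/node"):
    print(ref, is_pinned(ref))
```

An untagged reference such as `node` counts as unpinned because Docker resolves it to the latest tag.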

3. Multi-Stage Builds

Reduces final image size by keeping build tools and build-time dependencies out of the final image
# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY . .
RUN npm ci && npm run build

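# Production stage
FROM node:18-alpine
WORKDIR /app
# Install runtime dependencies only (assumes the app needs packages at runtime)
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]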

4. Use .dockerignore

Excludes unnecessary files, speeds up builds, and reduces image size
node_modules
.git
.env
*.log

5. Use Least Privileged User

Enhances security by limiting container privileges
FROM node:18-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

6. Use Environment Variables

Increases flexibility and portability across different environments
ENV NODE_ENV=production
ENV PORT=3000

7. Order Matters for Caching

Order your steps from least to most frequently changing to optimize caching
# Dependencies change less frequently
COPY package*.json ./
RUN npm ci

# Code changes more frequently
COPY . .
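
The effect of step ordering can be illustrated with a toy model of the build cache. It assumes only the basic rule that once one step's cache is invalidated, every later step is rebuilt as well; `steps_to_rebuild` is a hypothetical helper, and real builders decide cache hits from file checksums and instruction text rather than from a set of names.

```python
def steps_to_rebuild(steps: list, changed: set) -> list:
    """Return the steps that must run again after some inputs changed.

    The first step whose inputs changed, and every step after it, is rebuilt.
    """
    for index, step in enumerate(steps):
        if step in changed:
            return steps[index:]
    return []


good_order = ["COPY package*.json ./", "RUN npm ci", "COPY . ."]
bad_order = ["COPY . .", "RUN npm ci"]

# Only application source changed, so only the step that copies it is affected.
source_changed = {"COPY . ."}
print(steps_to_rebuild(good_order, source_changed))  # ['COPY . .']
print(steps_to_rebuild(bad_order, source_changed))   # ['COPY . .', 'RUN npm ci']
```

With the dependency manifest copied first, a source-only change reuses the cached `npm ci` layer; copying everything first forces the install to rerun on every change.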

8. Label Your Images

Labels attach metadata to an image, which improves organization and helps with image management
LABEL maintainer="[email protected]"
LABEL version="1.0"
LABEL description="Production API service"

9. Scan Images for Vulnerabilities

Find security vulnerabilities before they become bigger problems
# docker scan has been removed from current Docker releases; Docker Scout replaces it
docker scout cves myimage:latest

Kubernetes Overview

Kubernetes Architecture
Kubernetes (k8s) is a container orchestration system used for container deployment and management. Its design is heavily influenced by Borg, Google’s internal cluster manager. A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. The worker nodes host the Pods that make up the application workload, and the control plane manages the worker nodes and the Pods in the cluster.
In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault tolerance and high availability.

Control Plane Components

API Server

The API server (kube-apiserver) is the hub of the cluster: the other components communicate through it, and every operation on Pods and other objects goes through it.

Key Functions:
  • Exposes the Kubernetes API
  • Front-end for the control plane
  • Validates and processes REST requests
  • Updates etcd with cluster state

Scheduler

The scheduler (kube-scheduler) watches for newly created Pods that have no node assigned and selects a node for each of them to run on.

Responsibilities:
  • Selects optimal nodes for pods
  • Considers resource requirements
  • Respects constraints and affinity rules
  • Balances workload distribution
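
Node selection is commonly described as two phases: filtering out nodes that cannot run the Pod, then scoring the remaining ones. The sketch below illustrates that idea only. The `Node` class, the `pick_node` function, and the scoring rule (prefer the most leftover CPU) are assumptions made for this example and do not reflect the real scheduler's plugins or weights.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical, simplified view of a worker node."""
    name: str
    free_cpu_millicores: int
    free_memory_mib: int
    labels: dict = field(default_factory=dict)


def pick_node(nodes, cpu_millicores, memory_mib, node_selector=None) -> Optional[str]:
    """Filter out nodes that cannot run the pod, then score the rest."""
    node_selector = node_selector or {}
    # Filtering: keep nodes with enough free resources and matching labels.
    feasible = [
        node for node in nodes
        if node.free_cpu_millicores >= cpu_millicores
        and node.free_memory_mib >= memory_mib
        and all(node.labels.get(k) == v for k, v in node_selector.items())
    ]
    if not feasible:
        return None  # no node fits, so the pod would stay unscheduled
    # Scoring (invented rule): prefer the node with the most CPU left over.
    best = max(feasible, key=lambda node: node.free_cpu_millicores - cpu_millicores)
    return best.name


nodes = [
    Node("node-a", 500, 1024, {"disk": "ssd"}),
    Node("node-b", 2000, 4096, {"disk": "hdd"}),
    Node("node-c", 1500, 2048, {"disk": "ssd"}),
]
print(pick_node(nodes, 250, 512))                   # node-b
print(pick_node(nodes, 250, 512, {"disk": "ssd"}))  # node-c
print(pick_node(nodes, 4000, 512))                  # None
```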
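
Controller Manager

The controller manager (kube-controller-manager) runs the controllers, including:
  • Node Controller: Notices and responds when nodes go down
  • Job Controller: Watches for Job objects (one-off tasks) and creates Pods to run them to completion
  • EndpointSlice Controller: Populates EndpointSlice objects, which link Services to Pods
  • ServiceAccount Controller: Creates default ServiceAccounts for new namespaces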

etcd

etcd is a consistent key-value store used as Kubernetes’ backing store for all cluster data.

Characteristics:
  • Distributed and highly available
  • Stores cluster configuration and state
  • Source of truth for the cluster
  • Should be backed up regularly

Node Components

Pod

A Pod is a group of one or more containers and is the smallest unit that Kubernetes administers. Each Pod gets a single IP address that is shared by every container within the Pod.

Key Concepts:
  • Shared network namespace
  • Shared storage volumes
  • Ephemeral by nature
  • Scaled through ReplicaSets

kubelet

An agent that runs on each node in the cluster. It ensures that the containers described in each Pod’s specification are running and healthy.

Responsibilities:
  • Monitors pod specifications
  • Ensures containers are healthy
  • Reports node and pod status
  • Manages container lifecycle

kube-proxy

kube-proxy is a network proxy that runs on each node in your cluster. It maintains network rules on the node so that traffic sent to a Service is forwarded to one of the Pods backing that Service.

Functions:
  • Maintains network rules
  • Enables service abstraction
  • Load balances traffic
  • Implements service networking

Kubernetes Service Types

In Kubernetes, a Service is a method for exposing a network application in the cluster. The “type” property in the Service’s specification determines how the service is exposed to the network.

ClusterIP

Default service type. Kubernetes assigns a cluster-internal IP address to a ClusterIP Service, which makes the Service reachable only from within the cluster.

Use Cases:
  • Internal microservices communication
  • Backend services not exposed externally
  • Database connections within cluster

NodePort

Exposes the Service outside the cluster by opening the same port on every node, on top of a ClusterIP. The Service is reachable at NodeIP:NodePort, and the port is allocated from the range 30000-32767 by default.

Use Cases:
  • Development and testing
  • Direct access to services
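
The NodeIP:NodePort form and the 30000-32767 range can be captured in a small helper. This is a minimal sketch: `node_port_address` is a hypothetical function, the IP address is a documentation placeholder, and the range checked here is only the default, which cluster administrators can change through the API server's `--service-node-port-range` flag.

```python
DEFAULT_NODE_PORT_RANGE = range(30000, 32768)  # 30000-32767 inclusive


def node_port_address(node_ip: str, node_port: int) -> str:
    """Build the NodeIP:NodePort address, rejecting ports outside the default range."""
    if node_port not in DEFAULT_NODE_PORT_RANGE:
        raise ValueError(
            f"{node_port} is outside the default NodePort range 30000-32767"
        )
    return f"{node_ip}:{node_port}"


print(node_port_address("192.0.2.10", 30080))  # 192.0.2.10:30080
```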

LoadBalancer

Exposes the Service externally using a cloud provider’s load balancer.

Use Cases:
  • Production external services
  • Cloud-native applications
  • Automatic load balancing
  • Integrates with AWS ELB, GCP Load Balancer, etc.

ExternalName

Maps a Service to an external DNS name by returning a CNAME record instead of proxying traffic. This is commonly used to create a Service within Kubernetes that represents an external database.

Use Cases:
  • External database connections
  • Third-party API integration
  • Service migration scenarios

Kubernetes Deployment Strategies

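Each strategy takes a different approach to rolling out updates. A Kubernetes Deployment natively supports only the Recreate and RollingUpdate strategies; the others are typically built with additional tooling such as an ingress controller or service mesh.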

Recreate

All existing instances are terminated at once, and new instances with the updated version are then created.
  • Downtime: Yes
  • Use case: Non-critical applications or during initial development stages
  • Risk: High
  • Speed: Fast

Rolling Update

Application instances are replaced incrementally, a few at a time, so that some instances stay available throughout the update.
  • Downtime: No
  • Use case: Periodic releases
  • Risk: Low to medium
  • Speed: Medium
This is the default Kubernetes deployment strategy.
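
The incremental replacement can be simulated with two limits that Kubernetes Deployments expose as maxSurge (extra instances allowed above the desired count) and maxUnavailable (instances allowed to be missing). The sketch below is a simplified model: `rolling_update` is a hypothetical function, it uses absolute counts rather than the percentage defaults, and it assumes new instances become ready immediately, which real rollouts do not.

```python
def rolling_update(replicas: int, max_surge: int = 1, max_unavailable: int = 0) -> list:
    """Return (old, new) instance counts after each step of a toy rolling update."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = replicas, 0
    history = [(old, new)]
    while old > 0 or new < replicas:
        # Scale up: start new instances without exceeding replicas + max_surge.
        can_start = min(replicas - new, replicas + max_surge - (old + new))
        new += max(can_start, 0)
        # Scale down: stop old instances while keeping at least
        # replicas - max_unavailable instances running in total.
        can_stop = min(old, (old + new) - (replicas - max_unavailable))
        old -= max(can_stop, 0)
        history.append((old, new))
    return history


print(rolling_update(3))  # [(3, 0), (2, 1), (1, 2), (0, 3)]
```

With the toy defaults used here, one instance is replaced per step and the total never drops below the desired count, which is why this strategy avoids downtime.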

Shadow

A copy of the live traffic is redirected to the new version for testing without affecting production users.
  • Downtime: No
  • Use case: Validating new version performance and behavior in a real environment
  • Risk: Low
  • Complexity: Very high
This is the most complex deployment strategy and involves establishing mock services to interact with the new version of the deployment.

Canary

The new version is released to a subset of users or servers for testing before broader deployment.
  • Downtime: No
  • Use case: Impact validation on a subset of users
  • Risk: Low
  • Speed: Gradual
Typically starts with 5-10% of traffic, then gradually increases.
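
One way to pick the subset is to hash a stable identifier into a bucket, so that a given user always sees the same version and stays on the canary as the percentage grows. This is a sketch of that general technique, not something Kubernetes does for you: `route` is a hypothetical function, and in practice the split is usually configured in an ingress controller, a service mesh, or through replica counts.

```python
import hashlib


def route(user_id: str, canary_percent: int) -> str:
    """Send a stable subset of users to the canary based on a hash bucket."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


users = [f"user-{i}" for i in range(10_000)]
for percent in (5, 10, 50, 100):
    share = sum(route(u, percent) == "canary" for u in users) / len(users)
    print(f"target {percent}% -> observed {share:.1%}")
```

Because the bucket depends only on the user ID, raising the percentage adds users to the canary without moving anyone back to the stable version.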

Blue-Green

Two identical environments are maintained: one with the current version (blue) and the other with the updated version (green). Traffic is served by blue until green is ready, then switches to green all at once.
  • Downtime: No
  • Use case: High-stake updates
  • Risk: Low (easy rollback)
  • Cost: High (requires double resources)
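
The switch and the easy rollback can be modeled as changing a single selector. In this sketch, `ToyService` and the pod names are hypothetical. In a real cluster, a comparable effect is often achieved by changing the label selector of a Service so that it matches the other environment's Pods.

```python
class ToyService:
    """Hypothetical Service that routes to the environment named by its selector."""

    def __init__(self, environments: dict, active: str) -> None:
        self.environments = environments
        self.active = active

    def backends(self) -> list:
        """Return the pods that currently receive traffic."""
        return self.environments[self.active]

    def switch_to(self, name: str) -> str:
        """Point the selector at another environment and return the previous one."""
        if name not in self.environments:
            raise KeyError(f"no environment named {name!r}")
        previous, self.active = self.active, name
        return previous


service = ToyService(
    {"blue": ["app-blue-1", "app-blue-2"], "green": ["app-green-1", "app-green-2"]},
    active="blue",
)
previous = service.switch_to("green")  # cut all traffic over to green
print(service.backends())              # ['app-green-1', 'app-green-2']
service.switch_to(previous)            # rollback is the same operation in reverse
print(service.backends())              # ['app-blue-1', 'app-blue-2']
```

Both environments stay running during the switch, which is what makes rollback cheap and what doubles the resource cost.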

A/B Testing

Multiple versions are tested concurrently on different users to compare performance or user experience.
  • Downtime: Not directly applicable
  • Use case: Optimizing user experience
  • Duration: Extended (weeks to months)
  • Focus: Business metrics and user behavior

Kubernetes Command Cheatsheet

Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Initially developed by Google, Kubernetes is now maintained by CNCF (Cloud Native Computing Foundation).

Essential Commands by Category

# View cluster information
kubectl cluster-info

# View nodes
kubectl get nodes

# Describe node details
kubectl describe node <node-name>

# View cluster events
kubectl get events

Kubernetes Tools Ecosystem

Kubernetes has a large ecosystem of tools and components for deploying, managing, and scaling containerized applications. The main categories are:

Security

Tools for authentication, authorization, encryption, and compliance

Networking

CNI plugins, service mesh, ingress controllers, and network policies

Container Runtime

containerd, CRI-O, Docker Engine (through the cri-dockerd adapter), and other container runtimes

Cluster Management

Tools for provisioning, scaling, and managing clusters

Monitoring & Observability

Prometheus, Grafana, ELK stack, and distributed tracing

Infrastructure Orchestration

Terraform, Ansible, Helm, and GitOps tools
Kubernetes practitioners need to be well-versed in these tools to ensure the reliability, security, and performance of containerized applications within Kubernetes clusters.

Key Takeaways

Docker Basics: Containers package applications with dependencies for consistency across environments
Docker Best Practices: Use official images, specific versions, multi-stage builds, and scan for vulnerabilities
Kubernetes Architecture: Control plane manages worker nodes that run containerized pods
Service Types: Choose between ClusterIP, NodePort, LoadBalancer, or ExternalName based on your needs
Deployment Strategies: Select the right strategy (Rolling, Canary, Blue-Green, etc.) based on risk tolerance and downtime requirements
