Status: Future Phase (Do Not Start Until Phase 5 Is Stable)
This is a dedicated sub-project with its own runbook. It is listed here for architectural awareness: the v3 hardware and network design deliberately leaves headroom for it.

Objective

Introduce Kubernetes as an isolated learning cluster. No production services migrate until the cluster is fully understood and stable. Build skills deliberately: the sandbox phase must be complete before any production workload is considered.

Entry Criteria

Phase 5 complete — all backup tiers validated, monitoring confirmed, v3 documented and stable

Locked Approach

Sandbox First. Production Later.
Production services stay on Docker until k3s is proven stable and the operator is comfortable. The sandbox cluster runs in parallel, so a failure there has zero impact on running services. Selective production migration begins only after the sandbox phase is complete.

Planned Architecture

Cluster Design

  • Talos Linux VMs on both Proxmox nodes
  • MS-A2: Talos control plane VM + Talos worker 1
  • Optiplex: Talos worker 2
  • Storage: Longhorn for PVCs inside the cluster
  • Ingress: Traefik ingress controller (familiar from Docker context — same mental model)
  • GitOps: Flux or ArgoCD

Tooling to Learn

Core Kubernetes

  • kubectl
  • Pods, deployments, services, namespaces
  • ConfigMaps, Secrets

Storage

  • Longhorn
  • PVCs and PVs
  • Storage classes

Networking

  • Traefik ingress controller
  • cert-manager for TLS
  • Ingress resources

GitOps & Automation

  • Flux or ArgoCD
  • Helm charts
  • Operators

Sandbox Phase (No Production Traffic)

1. Stand Up Cluster

  • Create Talos VMs on pve-prod-01 and pve-prod-02
  • Bootstrap Talos cluster
  • Configure kubectl access
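
The three bullets above could play out roughly as the command sequence below. This is a sketch under assumptions, not the runbook's confirmed procedure: the cluster name and all IP addresses are hypothetical placeholders, and it assumes a single control plane node whose VMs have already booted the Talos image into maintenance mode. The script only prints the commands so they can be reviewed before anything is run.

```python
import shlex

# Hypothetical values: substitute the real cluster name and VM addresses.
CLUSTER_NAME = "k8s-lab"
CONTROL_PLANE_IP = "10.0.60.10"              # k3s-ctrl-lab-01 (address assumed)
WORKER_IPS = ["10.0.60.11", "10.0.60.12"]    # k3s-work-lab-01 / -02 (assumed)


def bootstrap_plan(cluster, cp_ip, worker_ips):
    """Return the ordered commands for a single-control-plane Talos bootstrap."""
    endpoint = f"https://{cp_ip}:6443"
    talos = ["talosctl", "--talosconfig", "talosconfig"]
    plan = [
        # Writes controlplane.yaml, worker.yaml and talosconfig to the current dir.
        ["talosctl", "gen", "config", cluster, endpoint],
        # Nodes in maintenance mode accept their first config without TLS auth.
        ["talosctl", "apply-config", "--insecure", "--nodes", cp_ip,
         "--file", "controlplane.yaml"],
    ]
    for ip in worker_ips:
        plan.append(["talosctl", "apply-config", "--insecure", "--nodes", ip,
                     "--file", "worker.yaml"])
    # Bootstrap etcd exactly once, on the control plane node.
    plan.append(talos + ["bootstrap", "--nodes", cp_ip, "--endpoints", cp_ip])
    # Fetch an admin kubeconfig, then confirm that all nodes registered.
    plan.append(talos + ["kubeconfig", "--nodes", cp_ip, "--endpoints", cp_ip])
    plan.append(["kubectl", "get", "nodes", "-o", "wide"])
    return plan


if __name__ == "__main__":
    for argv in bootstrap_plan(CLUSTER_NAME, CONTROL_PLANE_IP, WORKER_IPS):
        print(shlex.join(argv))
```

Printing the plan instead of executing it keeps the first pass safe: each line can be pasted into a shell one at a time while watching the VM consoles in Proxmox.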

2. Learn Core Primitives

  • Deploy test workloads
  • Learn pods, deployments, services, namespaces
  • Understand ConfigMaps and Secrets
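
A first test workload might look like the sketch below, which builds a Deployment and a matching Service as plain Python dictionaries and prints them as JSON (which `kubectl apply -f -` accepts alongside YAML). The namespace `sandbox`, the workload name `hello`, and the `nginx:stable` image are illustrative assumptions, not names taken from this plan.

```python
import json


def build_test_workload(name, image, replicas=2, port=80, namespace="sandbox"):
    """Build a Deployment plus a ClusterIP Service that selects its pods."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        # The Service finds pods through the same labels the Deployment sets.
        "spec": {"selector": labels, "ports": [{"port": port, "targetPort": port}]},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # "hello" and nginx:stable are placeholder choices for a throwaway test.
    print(json.dumps(build_test_workload("hello", "nginx:stable"), indent=2))
```

Assuming the namespace already exists, the output can be piped straight into `kubectl apply -f -`. Deleting one of the pods afterwards and watching the Deployment replace it is a quick way to see the controller at work.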

3. Configure Storage

  • Install Longhorn
  • Create storage classes
  • Test PVC binding
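
The PVC binding test can be sketched as follows: one function builds a small claim against a Longhorn-backed storage class, and another inspects the object returned by `kubectl get pvc <name> -o json` to see whether it bound. The class name `longhorn` is the one a default Longhorn install creates; the claim name, size, and namespace are assumptions for illustration.

```python
import json


def build_test_pvc(name="longhorn-test", size="1Gi",
                   storage_class="longhorn", namespace="sandbox"):
    """Build a ReadWriteOnce claim against the given storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def is_bound(pvc_object):
    """True when the API server reports the claim as bound to a volume."""
    return pvc_object.get("status", {}).get("phase") == "Bound"


if __name__ == "__main__":
    print(json.dumps(build_test_pvc(), indent=2))
```

If the storage class uses `WaitForFirstConsumer` volume binding, the claim stays `Pending` until a pod mounts it, so a `Pending` result right after creation is not necessarily a failure.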

4. Configure Ingress

  • Install Traefik ingress controller
  • Install cert-manager
  • Test TLS certificate issuance
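
One way to test certificate issuance is an Ingress that carries the cert-manager annotation, so cert-manager requests a certificate and stores it in the named Secret. The sketch below assumes an IngressClass called `traefik`, a ClusterIssuer called `letsencrypt-staging`, a backend Service called `hello`, and the hostname `hello.lab.example.com`; all four are hypothetical and need to match whatever is actually installed.

```python
import json


def build_tls_ingress(name, host, service, port=80,
                      issuer="letsencrypt-staging",
                      ingress_class="traefik", namespace="sandbox"):
    """Build an Ingress that asks cert-manager for a certificate for `host`."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # cert-manager watches for this annotation on Ingress objects.
            "annotations": {"cert-manager.io/cluster-issuer": issuer},
        },
        "spec": {
            "ingressClassName": ingress_class,
            # The issued certificate is written to this Secret.
            "tls": [{"hosts": [host], "secretName": f"{name}-tls"}],
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = build_tls_ingress("hello", "hello.lab.example.com", "hello")
    print(json.dumps(manifest, indent=2))
```

After applying it, `kubectl get certificate -n sandbox` should list a Certificate named after the TLS Secret, and its `READY` column shows whether issuance succeeded. Using a staging issuer first avoids burning production rate limits while experimenting.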

5. Learn GitOps

  • Install Flux or ArgoCD
  • Deploy test apps via GitOps
  • Understand reconciliation loops
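
The idea behind a reconciliation loop can be shown with a toy model: compare the desired state (what Git declares) with the actual state (what the cluster runs), compute the actions that close the gap, apply them, and repeat until nothing differs. This is a conceptual sketch only. It is not how Flux or ArgoCD are implemented, and the resource names are made up. Real tools also delete stray resources only when pruning is enabled.

```python
def reconcile(desired, actual):
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions, desired, actual):
    """Apply actions to a copy of `actual` and return the new state."""
    state = dict(actual)
    for verb, name in actions:
        if verb == "delete":
            state.pop(name)
        else:  # create and update both copy the desired spec
            state[name] = desired[name]
    return state


if __name__ == "__main__":
    desired = {"hello": {"replicas": 2}, "web": {"replicas": 1}}
    actual = {"hello": {"replicas": 1}, "stray": {"replicas": 1}}
    while True:
        actions = reconcile(desired, actual)
        if not actions:
            break  # converged: nothing left to change
        print(actions)
        actual = apply_actions(actions, desired, actual)
```

The loop is idempotent: running `reconcile` again after convergence returns an empty list. That is also why a manual `kubectl edit` on a GitOps-managed resource is reverted on the next sync.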

6. Break Things and Recover

Deliberately break things and recover — that is the point of the sandbox

Production Migration Candidates (Post-Sandbox Only)

These services benefit from HA and operator-managed upgrades — they are good candidates for k3s migration once the cluster is stable.

| Service | Reason |
| --- | --- |
| Immich | Benefits from HA and operator-managed upgrades |
| Authentik | IdP should be highly available |
| Beszel / Uptime Kuma | Monitoring infrastructure |
| Traefik | Already the ingress controller in k3s, natural fit |

Intentionally Staying in Docker

These services have filesystem dependencies or network complexity that do not translate cleanly to k8s. They stay in Docker permanently.

| Service | Reason |
| --- | --- |
| ARR stack (Sonarr, Radarr, Prowlarr, Bazarr) | Hardlinks and atomic moves make k8s messy |
| qBittorrent + Gluetun | VPN killswitch model does not translate cleanly to k8s networking |
| Books stack (CWA, ABS, Shelfmark) | Ingest/hardlink workflows are filesystem-dependent |
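
The hardlink dependency is easy to demonstrate. A hard link is a second directory entry for the same inode, so it only works when both paths live on one filesystem. The sketch below creates a file in a temporary `downloads` directory and links it into `media`; the directory and file names are illustrative, not the real layout of these stacks.

```python
import os
import tempfile


def hardlink_demo():
    """Hard-link a file within one filesystem; return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "downloads", "episode.mkv")
        dst = os.path.join(root, "media", "episode.mkv")
        os.makedirs(os.path.dirname(src))
        os.makedirs(os.path.dirname(dst))
        with open(src, "wb") as handle:
            handle.write(b"x" * 1024)
        # Succeeds because both directories share one filesystem. Across two
        # filesystems os.link raises OSError with errno.EXDEV instead.
        os.link(src, dst)
        src_stat, dst_stat = os.stat(src), os.stat(dst)
        return src_stat.st_ino == dst_stat.st_ino, src_stat.st_nlink


if __name__ == "__main__":
    same_inode, link_count = hardlink_demo()
    print(f"same inode: {same_inode}, link count: {link_count}")
```

The same constraint applies to atomic moves: `os.rename` is atomic only within one filesystem. Any volume layout that places the download and library paths on separate volumes loses both properties, which is the kind of mess the table refers to.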

VM Resource Allocation

| VM | Host | vCPU | RAM | Role |
| --- | --- | --- | --- | --- |
| k3s-ctrl-lab-01 | pve-prod-01 | 2 | 4 GB | Control plane |
| k3s-work-lab-01 | pve-prod-01 | 4 | 8 GB | Worker node |
| k3s-work-lab-02 | pve-prod-02 | 4 | 8 GB | Worker node |
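
To check how much headroom the sandbox takes from each Proxmox node, the allocations above can be summed per host. The figures in the sketch are the ones from the table; only the script itself is new.

```python
from collections import defaultdict

# (VM name, Proxmox host, vCPU, RAM in GB), as listed in the allocation table.
VMS = [
    ("k3s-ctrl-lab-01", "pve-prod-01", 2, 4),
    ("k3s-work-lab-01", "pve-prod-01", 4, 8),
    ("k3s-work-lab-02", "pve-prod-02", 4, 8),
]


def totals_by_host(vms):
    """Sum vCPU and RAM per Proxmox host."""
    totals = defaultdict(lambda: {"vcpu": 0, "ram_gb": 0})
    for _name, host, vcpu, ram_gb in vms:
        totals[host]["vcpu"] += vcpu
        totals[host]["ram_gb"] += ram_gb
    return dict(totals)


if __name__ == "__main__":
    for host, total in sorted(totals_by_host(VMS).items()):
        print(f"{host}: {total['vcpu']} vCPU, {total['ram_gb']} GB RAM")
```

That comes to 6 vCPU and 12 GB on pve-prod-01, and 4 vCPU and 8 GB on pve-prod-02, for a cluster total of 10 vCPU and 20 GB.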

Key Constraints

Do Not Rush Phase 6
A broken k3s cluster on top of an unstable foundation helps nobody. A fully stable Phase 5 (reliable backups, clean monitoring, solid documentation) is the only acceptable entry point for Phase 6.

No Production Services Until Sandbox Complete
The sandbox cluster runs in parallel with production Docker services. Zero production traffic touches the k3s cluster until it is proven stable.

Learning Resources

Exit Criteria

  • No unexpected crashes or restarts
  • All test workloads running reliably
  • Storage, networking, and ingress working as expected
  • kubectl commands are second nature
  • Troubleshooting with pods, logs, and events is routine
  • GitOps workflow understood and practiced
  • Migration plan decided: which services migrate first
  • Rollback plan written for each service
  • Monitoring and alerting in place for the k3s cluster
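
The "no unexpected crashes or restarts" criterion can be checked mechanically. The sketch below takes the JSON that `kubectl get pods -A -o json` produces and lists every container whose restart count exceeds a threshold. The sample data is invented for illustration, and the threshold of zero is an assumption about how strict the check should be.

```python
def restart_report(pod_list, max_restarts=0):
    """List (namespace, pod, container, restarts) above the allowed count."""
    offenders = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        # Pods that have not started yet may lack containerStatuses entirely.
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            restarts = status.get("restartCount", 0)
            if restarts > max_restarts:
                offenders.append(
                    (meta["namespace"], meta["name"], status["name"], restarts)
                )
    return offenders


# Invented sample in the shape of `kubectl get pods -A -o json` output.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "sandbox", "name": "hello-abc"},
            "status": {"containerStatuses": [{"name": "hello", "restartCount": 0}]},
        },
        {
            "metadata": {"namespace": "sandbox", "name": "web-xyz"},
            "status": {"containerStatuses": [{"name": "web", "restartCount": 3}]},
        },
        {"metadata": {"namespace": "sandbox", "name": "pending-pod"}, "status": {}},
    ]
}

if __name__ == "__main__":
    for namespace, pod, container, restarts in restart_report(SAMPLE):
        print(f"{namespace}/{pod} [{container}]: {restarts} restarts")
```

Against a live cluster, replace `SAMPLE` with `json.load(sys.stdin)` and pipe the kubectl output in. Restarts caused on purpose during the break-and-recover step need to be discounted by hand.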

This Phase Is Its Own Project
Phase 6 is a dedicated learning environment. It has its own timeline, its own success criteria, and its own documentation. It exists in parallel with production infrastructure, not as a replacement.
