System Architecture
K8s Scheduler is built from three main components that work together to provide a complete multi-tenant deployment platform: a React frontend, a Go API server, and a Kubernetes operator.
Data Flow
Here’s how a deployment is created, from user action to running pods:
User initiates deployment
The user fills out the deployment form in the React UI, selecting a template, providing environment variables, and configuring secrets.
API request with validation
The frontend sends a POST /api/deployments request. The Go server validates the request, checks RBAC permissions, and enforces tier limits.
Database and secrets storage
The server stores deployment metadata in PostgreSQL and writes secrets to Vault at the appropriate paths (users/{id}/deployments/{name}).
Kubernetes CR creation
The server creates a UserDeployment custom resource in the user’s Kubernetes namespace using the controller-runtime client.
Operator reconciliation
The operator’s reconciliation loop detects the new CR and creates all necessary Kubernetes resources:
- Deployments (one per service in the template)
- Services (ClusterIP for each deployment)
- Ingress (with Traefik annotations for routing)
- ExternalSecrets (to sync from Vault)
- NetworkPolicies (for tier-based isolation)
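As a rough illustration of this fan-out, the sketch below computes the set of child objects a reconcile pass might ensure for one UserDeployment. The `UserDeploymentSpec` and `Child` types, the `<deployment>-<service>` naming scheme, and the choice of exactly one Ingress, ExternalSecret, and NetworkPolicy per deployment are assumptions made for the example; the actual operator builds typed Kubernetes objects through controller-runtime.

```go
package main

import "fmt"

// UserDeploymentSpec is a hypothetical, trimmed-down view of the CR spec.
type UserDeploymentSpec struct {
	Name     string   // deployment name chosen by the user
	Services []string // service names defined by the template
}

// Child identifies one Kubernetes object the operator would create.
type Child struct {
	Kind string
	Name string
}

// desiredChildren lists the objects a reconcile pass should ensure exist.
// It is a pure function, so calling it repeatedly yields the same set.
func desiredChildren(spec UserDeploymentSpec) []Child {
	var out []Child
	for _, svc := range spec.Services {
		name := spec.Name + "-" + svc // assumed naming scheme
		out = append(out,
			Child{Kind: "Deployment", Name: name},
			Child{Kind: "Service", Name: name}, // ClusterIP in front of the Deployment
		)
	}
	// Assumption: one of each per UserDeployment.
	out = append(out,
		Child{Kind: "Ingress", Name: spec.Name},
		Child{Kind: "ExternalSecret", Name: spec.Name},
		Child{Kind: "NetworkPolicy", Name: spec.Name},
	)
	return out
}

func main() {
	spec := UserDeploymentSpec{Name: "blog", Services: []string{"web", "worker"}}
	for _, c := range desiredChildren(spec) {
		fmt.Printf("%s/%s\n", c.Kind, c.Name)
	}
}
```

Deriving the desired set from the spec alone keeps reconciliation idempotent: a real reconciler would create or update each entry and could safely run again after a partial failure.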
External Secrets Operator sync
ESO watches the ExternalSecret resources and fetches secrets from Vault, creating Kubernetes Secret objects.
Pods start
Kubernetes schedules pods for each deployment. Secrets are mounted as environment variables and/or files.
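The validation step early in this flow can be sketched as a small pure function. This is a minimal example under stated assumptions: the `DeploymentRequest` fields, the rule that deployment names must be DNS labels, and the way the tier limit is passed in are all hypothetical, and the RBAC check is left out.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// DeploymentRequest is a hypothetical shape for the POST /api/deployments body.
type DeploymentRequest struct {
	Name     string            `json:"name"`
	Template string            `json:"template"`
	Env      map[string]string `json:"env"`
}

// dnsLabel matches RFC 1123 labels, the format Kubernetes uses for many names.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

var (
	ErrInvalidName  = errors.New("name must be a lowercase DNS label of at most 63 characters")
	ErrNoTemplate   = errors.New("template is required")
	ErrTierExceeded = errors.New("tier deployment limit reached")
)

// validate checks the request body and the caller's tier quota.
// current is how many deployments the user already has; limit is the tier cap.
func validate(req DeploymentRequest, current, limit int) error {
	if len(req.Name) > 63 || !dnsLabel.MatchString(req.Name) {
		return ErrInvalidName
	}
	if req.Template == "" {
		return ErrNoTemplate
	}
	if current >= limit {
		return fmt.Errorf("%w: %d of %d used", ErrTierExceeded, current, limit)
	}
	return nil
}

func main() {
	req := DeploymentRequest{Name: "my-app", Template: "wordpress"}
	fmt.Println(validate(req, 0, 3)) // accepted
	fmt.Println(validate(req, 3, 3)) // rejected: quota used up
}
```

Rejecting bad input before anything is written to PostgreSQL or Vault means a failed request leaves no partial state to clean up.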
Component Interactions
React Frontend → Go Server
- Protocol: HTTP/REST API
- Authentication: Session cookies (server sets HttpOnly cookie after OAuth)
- State management: TanStack Query caches responses, handles refetching
- Real-time updates: Server-Sent Events (SSE) for deployment logs and metrics
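For the SSE channel, a minimal sketch of what a streaming handler could look like with the standard library is shown below. The handler name and the once-per-second placeholder log lines are assumptions for illustration; only the `text/event-stream` wire format is fixed by the SSE specification.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// formatSSE encodes one Server-Sent Event. Each line of data gets its own
// "data:" field, and a blank line terminates the event.
func formatSSE(event, data string) string {
	var b strings.Builder
	if event != "" {
		fmt.Fprintf(&b, "event: %s\n", event)
	}
	for _, line := range strings.Split(data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n")
	return b.String()
}

// streamLogs is a hypothetical handler that emits a placeholder log line
// every second until the client disconnects.
func streamLogs(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for i := 1; ; i++ {
		select {
		case <-r.Context().Done(): // client went away
			return
		case <-ticker.C:
			fmt.Fprint(w, formatSSE("log", fmt.Sprintf("line %d", i)))
			flusher.Flush() // push the event immediately instead of buffering
		}
	}
}

func main() {
	// Print one encoded event; streamLogs would be registered with
	// http.HandleFunc on whatever route the server uses.
	fmt.Print(formatSSE("log", "pod started\nlistening on :8080"))
}
```

On the browser side, an `EventSource` pointed at the same route would receive each event as it is flushed.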
Go Server → PostgreSQL
- Driver: pgx/v5 (native Go PostgreSQL driver)
- Connection pooling: Managed by pgxpool
- Migrations: Applied on startup using embedded SQL files
- Transactions: Used for RBAC operations (e.g., creating org + default team)
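The transactional pattern behind operations such as "create org + default team" can be sketched without a database. The `Tx` interface below is a simplified stand-in (pgx's own `Tx` methods also take a `context.Context`), and the table and column names are hypothetical; `fakeTx` exists only so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx is the minimal transaction surface this sketch needs.
type Tx interface {
	Exec(sql string, args ...any) error
	Commit() error
	Rollback() error
}

// inTx runs fn inside a transaction: commit on success, roll back on error.
func inTx(begin func() (Tx, error), fn func(Tx) error) error {
	tx, err := begin()
	if err != nil {
		return err
	}
	if err := fn(tx); err != nil {
		// Keep the original error; a rollback failure is joined onto it.
		return errors.Join(err, tx.Rollback())
	}
	return tx.Commit()
}

// createOrg inserts an organization and its default team atomically.
// Table and column names are hypothetical.
func createOrg(tx Tx, orgName string) error {
	if err := tx.Exec("INSERT INTO orgs (name) VALUES ($1)", orgName); err != nil {
		return err
	}
	return tx.Exec("INSERT INTO teams (org_name, name) VALUES ($1, $2)", orgName, "default")
}

// fakeTx records calls so the pattern can be exercised without PostgreSQL.
type fakeTx struct {
	failOn     int // 1-based index of the statement that fails; 0 means never
	execs      int
	committed  bool
	rolledBack bool
}

func (f *fakeTx) Exec(sql string, args ...any) error {
	f.execs++
	if f.execs == f.failOn {
		return errors.New("simulated failure")
	}
	return nil
}
func (f *fakeTx) Commit() error   { f.committed = true; return nil }
func (f *fakeTx) Rollback() error { f.rolledBack = true; return nil }

func main() {
	tx := &fakeTx{failOn: 2} // fail while creating the default team
	err := inTx(
		func() (Tx, error) { return tx, nil },
		func(t Tx) error { return createOrg(t, "acme") },
	)
	fmt.Println("error:", err != nil, "committed:", tx.committed, "rolled back:", tx.rolledBack)
}
```

If the second insert fails, the first is rolled back with it, so an organization never exists without its default team.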
Go Server → Vault
- Client: Official HashiCorp Vault Go SDK
- Authentication: Token-based (from file or environment variable)
- Paths: KV v2 secrets engine at the secret/ mount
- Operations: Read, write, delete secrets at user/template/deployment scopes
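To make the path layout concrete, here is a small sketch that builds the deployment-scoped path described earlier (users/{id}/deployments/{name}) and maps it onto a KV v2 mount. The helper names are hypothetical. The `data` segment reflects how the KV v2 HTTP API addresses secret contents; the Vault Go SDK's KV v2 helper is understood to add that segment itself, so code using it would pass only the logical path.

```go
package main

import (
	"fmt"
	"path"
)

// deploymentSecretPath returns the logical path for a deployment's secrets,
// following the users/{id}/deployments/{name} layout.
// Callers are expected to validate userID and name first, since path.Join
// would resolve segments such as "..".
func deploymentSecretPath(userID, name string) string {
	return path.Join("users", userID, "deployments", name)
}

// kvV2DataPath maps a logical path to the HTTP API path of a KV v2 mount,
// which places a "data" segment directly after the mount name.
func kvV2DataPath(mount, logical string) string {
	return path.Join(mount, "data", logical)
}

func main() {
	logical := deploymentSecretPath("42", "blog")
	fmt.Println(logical)
	fmt.Println(kvV2DataPath("secret/", logical))
}
```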
Go Server → Kubernetes
- Client: controller-runtime client for CRs, client-go for logs/exec
- Authentication: In-cluster config (ServiceAccount) or kubeconfig
- Operations: Create/delete UserDeployment CRs, stream logs, execute commands
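As an illustration of the "create UserDeployment CR" operation, the sketch below assembles the object as a generic map and prints it as JSON, which is the shape an unstructured Kubernetes client would submit. The API group/version (`example.com/v1alpha1`), the `spec.template` field, and the `user-42` namespace are placeholders, since the CRD schema and namespace naming are not shown in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// newUserDeployment builds a UserDeployment object as a generic map.
// Everything except "kind" and the metadata keys is a placeholder.
func newUserDeployment(namespace, name, template string) map[string]any {
	return map[string]any{
		"apiVersion": "example.com/v1alpha1", // hypothetical group/version
		"kind":       "UserDeployment",
		"metadata": map[string]any{
			"name":      name,
			"namespace": namespace, // the user's own namespace
		},
		"spec": map[string]any{
			"template": template, // hypothetical field
		},
	}
}

func main() {
	obj := newUserDeployment("user-42", "blog", "wordpress")
	out, err := json.MarshalIndent(obj, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With controller-runtime, a typed Go struct generated for the CRD would normally replace the map, and the object would be passed to the client's `Create` method.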
Operator → Kubernetes
- Framework: controller-runtime (manager, controller, reconciler pattern)
- Watches: UserDeployment, AgentTask, Workflow CRs
- Creates: Deployments, Services, Ingresses, ExternalSecrets, NetworkPolicies
- Updates: CR status field with deployment state and URLs
Directory Structure
The codebase is organized into clear layers.
Tech Stack
Frontend
- React 19 with TypeScript 5.9
- Vite 7 for development and builds
- TanStack Query 5 for server state
- React Router 7 for routing
- Tailwind CSS 4 for styling
- React Hook Form + Zod for forms
Backend
- Go 1.24 with standard library HTTP server
- pgx/v5 for PostgreSQL
- controller-runtime for Kubernetes operator
- HashiCorp Vault SDK for secrets
- Stripe Go SDK for billing
- log/slog for structured logging
Infrastructure
- Kubernetes (any CNCF-conformant cluster)
- PostgreSQL for persistent data
- Vault or AWS Secrets Manager for secrets
- External Secrets Operator for K8s secret injection
- Traefik for ingress routing
- cert-manager for TLS certificates
Scaling and High Availability
Server
- Horizontal scaling: Multiple replicas behind a load balancer
- Session storage: Redis or PostgreSQL for shared session state
- Database: PostgreSQL with read replicas
- Secrets: Vault HA cluster with Raft storage
Operator
- Leader election: Only one active reconciler at a time (built into controller-runtime)
- Multiple replicas: Standby replicas ready for failover
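- Work queue: The controller's work queue deduplicates reconcile requests and retries failed ones with backoff, so a failed reconcile is requeued rather than dropped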
The operator uses leader election by default. Only one replica actively reconciles CRs, but multiple replicas can run for high availability. If the leader fails, another replica automatically takes over.
Security
- Authentication: OAuth2 with Google (production) or auto-login (dev mode)
- Authorization: RBAC with org/team hierarchy, enforced by middleware on every request
- Secrets: Never stored in plain text; always in Vault/AWS Secrets Manager
- Network isolation: NetworkPolicies restrict traffic between user namespaces
- TLS: All external traffic uses HTTPS via cert-manager and Let’s Encrypt
- Session security: HttpOnly, Secure, SameSite cookies
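The session cookie flags listed above map directly onto fields of Go's `http.Cookie`. The following is a minimal sketch; the cookie name, the 24-hour lifetime, and the choice of `SameSite=Lax` are assumptions, not settings confirmed by this document.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newSessionCookie builds a cookie carrying the HttpOnly, Secure, and
// SameSite attributes. Name, lifetime, and SameSite mode are assumed values.
func newSessionCookie(sessionID string) *http.Cookie {
	return &http.Cookie{
		Name:     "session",
		Value:    sessionID,
		Path:     "/",
		MaxAge:   int((24 * time.Hour).Seconds()),
		HttpOnly: true,                 // not readable from JavaScript
		Secure:   true,                 // sent over HTTPS only
		SameSite: http.SameSiteLaxMode, // withheld on most cross-site requests
	}
}

func main() {
	// Print the Set-Cookie header value the server would send.
	fmt.Println(newSessionCookie("abc123").String())
}
```

A handler would attach it with `http.SetCookie(w, newSessionCookie(id))` once the OAuth callback succeeds.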
Next Steps
Multi-Tenancy
Learn how organizations, teams, and white-labeling work
RBAC
Understand the role-based access control system
Secrets
Deep dive into the three-tier secrets architecture
Templates
Create reusable deployment templates