
Workloads

The Workload resource provides a provider-agnostic way to manage groups of compute instances (VMs or containers) across multiple cloud providers and locations.

Overview

Workloads abstract away provider-specific details, letting you focus on what you want to run rather than how to provision it. Datum’s Workload Operator handles placement, provisioning, scaling, and lifecycle management.

Key Capabilities:
  • Provider-agnostic instance management
  • Placement rules (where instances should run)
  • Automatic scaling
  • Network attachments
  • Volume mounts
  • Instance templates

Workload Resource

Basic Example

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: web-app
  labels:
    app: nginx
    tier: frontend
spec:
  # Number of instances
  replicas: 3
  
  # Instance template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # Machine type
      machineType: e2-small
      
      # Image to run
      image: nginx:latest
      
      # Network configuration
      networkInterfaces:
        - networkRef:
            name: production-network

Advanced Example

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: backend-api
  annotations:
    kubernetes.io/description: "Backend API service"
spec:
  replicas: 5
  
  # Placement rules - where instances should run
  placement:
    # Prefer multiple zones for HA
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
    
    # Region constraints
    nodeSelector:
      topology.kubernetes.io/region: us-central1
    
    # Provider preferences
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
                - key: provider
                  operator: In
                  values:
                    - gcp
  
  # Instance template
  template:
    metadata:
      labels:
        app: backend
        version: v2.1
    spec:
      machineType: n2-standard-4
      
      # Container or VM image
      image: gcr.io/my-project/backend-api:v2.1
      
      # Environment variables
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        - name: LOG_LEVEL
          value: info
      
      # Network interfaces
      networkInterfaces:
        - networkRef:
            name: backend-network
          subnetRef:
            name: backend-subnet
        - networkRef:
            name: data-network
      
      # Persistent volumes
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: backend-data
      
      volumeMounts:
        - name: data
          mountPath: /var/lib/app/data
      
      # Startup command
      command:
        - /app/backend
        - --config=/etc/backend/config.yaml
      
      # Resource requests and limits
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "4"
          memory: 8Gi
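The advanced example reads DATABASE_URL from a Secret named db-credentials, which must exist before instances can start. A minimal sketch of that Secret (the url value is a placeholder, not taken from this document):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  # Placeholder connection string; stringData is encoded to base64 on apply
  url: postgres://app:password@db.internal:5432/appdb
```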

Instance Templates

The template field defines the specification for each instance:

Machine Types

spec:
  template:
    spec:
      # GCP-style machine types
      machineType: e2-micro       # 2 vCPUs, 1 GB RAM
      # machineType: e2-small     # 2 vCPUs, 2 GB RAM
      # machineType: e2-medium    # 2 vCPUs, 4 GB RAM
      # machineType: n2-standard-4 # 4 vCPUs, 16 GB RAM
      # machineType: c2-standard-8 # 8 vCPUs, 32 GB RAM
Machine type names follow GCP conventions but work across providers. The Workload Operator translates them to provider-specific equivalents.

Image Types

Workloads support both VM images and container images:
spec:
  template:
    spec:
      image: nginx:latest
      # Or from a private registry
      # image: gcr.io/my-project/app:v1.0
      
      # Image pull credentials
      imagePullSecrets:
        - name: gcr-credentials
Container images run in a gVisor sandbox for secure execution.
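The gcr-credentials Secret referenced by imagePullSecrets is a standard Kubernetes registry-credentials Secret. A minimal sketch (username, password, and auth values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gcr-credentials
type: kubernetes.io/dockerconfigjson
stringData:
  # Placeholder credentials; replace with real registry auth before applying
  .dockerconfigjson: |
    {
      "auths": {
        "gcr.io": {
          "username": "_json_key",
          "password": "<service-account-json>",
          "auth": "<base64 of username:password>"
        }
      }
    }
```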

Placement Rules

Control where instances are deployed using placement constraints:

Topology Spread

Distribute instances across zones or regions:
spec:
  placement:
    topologySpreadConstraints:
      # Spread across zones
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web-app
      
      # Spread across providers
      - maxSkew: 2
        topologyKey: provider
        whenUnsatisfiable: ScheduleAnyway
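maxSkew bounds the difference between the most- and least-populated topology domains. A worked illustration, assuming replicas: 5 and three available zones:

```yaml
# With 5 replicas across three zones and maxSkew: 1, a distribution of
# 2 / 2 / 1 instances is allowed (skew = 2 - 1 = 1), while 3 / 1 / 1 is
# rejected (skew = 3 - 1 = 2) and, with DoNotSchedule, the extra instance
# stays unscheduled until a compliant placement exists.
spec:
  replicas: 5
  placement:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
```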

Node Selector

Require specific location or provider:
spec:
  placement:
    nodeSelector:
      topology.kubernetes.io/region: us-central1
      topology.kubernetes.io/zone: us-central1-a
      provider: gcp

Affinity and Anti-Affinity

spec:
  placement:
    affinity:
      # Prefer certain providers
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
                - key: provider
                  operator: In
                  values:
                    - gcp
                    - aws
      
      # Co-locate with other workloads
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: database
            topologyKey: topology.kubernetes.io/zone
      
      # Keep away from other workloads
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web-app
              topologyKey: kubernetes.io/hostname

Scaling

Manual Scaling

Change the replica count:
kubectl scale workload web-app --replicas=10
Or edit the resource:
kubectl edit workload web-app

Declarative Scaling

Update your YAML and reapply:
spec:
  replicas: 10  # Scaled from 3 to 10
kubectl apply -f workload.yaml
Automatic scaling via HPA (Horizontal Pod Autoscaler) is coming soon. Track progress in the enhancements repo.

Network Configuration

Attach instances to one or more networks:

Single Network

spec:
  template:
    spec:
      networkInterfaces:
        - networkRef:
            name: production-network

Multiple Networks

spec:
  template:
    spec:
      networkInterfaces:
        # Primary network (internet access)
        - networkRef:
            name: public-network
          primary: true
        
        # Backend network
        - networkRef:
            name: backend-network
          subnetRef:
            name: backend-subnet
        
        # Data network
        - networkRef:
            name: data-network

Static IP Assignment

spec:
  template:
    spec:
      networkInterfaces:
        - networkRef:
            name: production-network
          ipAddress: 10.0.1.100  # Static IP

Storage Volumes

Persistent Volumes

spec:
  template:
    spec:
      volumes:
        - name: app-data
          persistentVolumeClaim:
            claimName: app-data-pvc
        
        - name: config
          configMap:
            name: app-config
        
        - name: secrets
          secret:
            secretName: app-secrets
      
      volumeMounts:
        - name: app-data
          mountPath: /var/lib/app
        - name: config
          mountPath: /etc/app/config
          readOnly: true
        - name: secrets
          mountPath: /etc/app/secrets
          readOnly: true
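The app-config ConfigMap mounted above must exist with matching keys; each data key becomes a file under the mount path. A minimal sketch (the key name and contents are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  # Appears as /etc/app/config/app.yaml inside the instance
  app.yaml: |
    logLevel: info
```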

Workload Lifecycle

How Workloads are Deployed

1. User creates Workload
   kubectl apply -f workload.yaml
2. Workload Operator reconciles
   • Evaluates placement rules
   • Determines target locations (zones, providers)
   • Creates a WorkloadDeployment for each location
3. WorkloadDeployment creates Instances
   • Generates Instance resources from the template
   • Applies the replica count per location
4. Infrastructure Plugin provisions
   • Watches for Instance resources
   • Creates VMs or containers in the target provider
   • Attaches to networks
   • Mounts volumes
5. Status updates
   • Plugin updates Instance status to Running
   • Workload status becomes Ready

Workload Status

kubectl get workload web-app
NAME      REPLICAS   READY   AGE
web-app   3          3       5m
View detailed status:
kubectl describe workload web-app

Managing Workloads

Create a Workload

kubectl apply -f workload.yaml

List Workloads

kubectl get workloads

View Workload Details

kubectl describe workload web-app

Update a Workload

kubectl edit workload web-app
# Or
kubectl apply -f workload.yaml

Scale a Workload

kubectl scale workload web-app --replicas=5

Delete a Workload

kubectl delete workload web-app
Deleting a workload will terminate all instances. Data in non-persistent volumes will be lost.

View Instances

# List all instances
kubectl get instances

# Filter by workload
kubectl get instances -l workload=web-app

# Detailed instance info
kubectl describe instance web-app-us-central1-a-0

Troubleshooting

Workload not becoming Ready

# Check workload status
kubectl describe workload <workload-name>

# Check events
kubectl get events --field-selector involvedObject.name=<workload-name>

# Check Workload Operator logs
kubectl logs -n datum-system -l app=workload-operator

Instances not starting

# List instances
kubectl get instances

# Check specific instance
kubectl describe instance <instance-name>

# Check infrastructure plugin logs
kubectl logs -n datum-system -l app=infra-provider-gcp

Placement constraints not satisfied

# View placement events
kubectl describe workload <workload-name> | grep -A 20 Events

# Check available capacity
kubectl describe nodes

Best Practices

Use labels

Label workloads for organization and selection (app, tier, version, environment).
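A sketch of one such labeling scheme (the values are illustrative):

```yaml
metadata:
  labels:
    app: checkout
    tier: backend
    version: v1.4.2
    environment: production
```

Consistent labels then drive selection, for example kubectl get workloads -l environment=production.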

Spread for HA

Use topology spread constraints to distribute across zones.

Resource limits

Set resource requests and limits to ensure predictable performance.

Health checks

Configure liveness and readiness probes (coming soon).

Rolling updates

Use rolling update strategy for zero-downtime deployments.
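This document does not show the Workload update-strategy fields; the sketch below is hypothetical, mirroring the Kubernetes Deployment strategy shape, and should be checked against the Workload API reference:

```yaml
# HYPOTHETICAL: field names mirror Kubernetes Deployments and are not
# confirmed by this document.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # at most one instance down during the rollout
      maxSurge: 1        # at most one extra instance created temporarily
```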

Persistent data

Use persistent volumes for data that must survive instance restarts.
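Each claimName referenced in a template must point at an existing PersistentVolumeClaim. A minimal sketch for the app-data-pvc used earlier (size and access mode are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi  # placeholder size
```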

Common Patterns

Web Application Tier

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: web-frontend
spec:
  replicas: 5
  placement:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
  template:
    spec:
      machineType: e2-medium
      image: nginx:alpine
      networkInterfaces:
        - networkRef:
            name: public-network

Background Worker

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: worker
spec:
  replicas: 3
  template:
    spec:
      machineType: n2-standard-2
      image: gcr.io/my-project/worker:latest
      env:
        - name: QUEUE_URL
          value: redis://redis-service:6379
      networkInterfaces:
        - networkRef:
            name: backend-network

Next Steps

Gateways

Expose workloads with the Gateway API

Networks

Configure networking for workloads

Managing Resources

Learn kubectl commands for workloads

Workload Operator

Explore the source code
