
Workloads

The Workload resource provides a provider-agnostic way to manage groups of compute instances (VMs or containers) across multiple cloud providers and locations.

Overview

Workloads abstract away provider-specific details, letting you focus on what you want to run rather than how to provision it. Datum’s Workload Operator handles placement, provisioning, scaling, and lifecycle management.

Key Capabilities:
  • Provider-agnostic instance management
  • Placement rules (where instances should run)
  • Automatic scaling
  • Network attachments
  • Volume mounts
  • Instance templates

Workload Resource

Basic Example

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: web-app
  labels:
    app: nginx
    tier: frontend
spec:
  # Number of instances
  replicas: 3
  
  # Instance template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # Machine type
      machineType: e2-small
      
      # Image to run
      image: nginx:latest
      
      # Network configuration
      networkInterfaces:
        - networkRef:
            name: production-network

Advanced Example

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: backend-api
  annotations:
    kubernetes.io/description: "Backend API service"
spec:
  replicas: 5
  
  # Placement rules - where instances should run
  placement:
    # Prefer multiple zones for HA
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
    
    # Region constraints
    nodeSelector:
      topology.kubernetes.io/region: us-central1
    
    # Provider preferences
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
                - key: provider
                  operator: In
                  values:
                    - gcp
  
  # Instance template
  template:
    metadata:
      labels:
        app: backend
        version: v2.1
    spec:
      machineType: n2-standard-4
      
      # Container or VM image
      image: gcr.io/my-project/backend-api:v2.1
      
      # Environment variables
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        - name: LOG_LEVEL
          value: info
      
      # Network interfaces
      networkInterfaces:
        - networkRef:
            name: backend-network
          subnetRef:
            name: backend-subnet
        - networkRef:
            name: data-network
      
      # Persistent volumes
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: backend-data
      
      volumeMounts:
        - name: data
          mountPath: /var/lib/app/data
      
      # Startup command
      command:
        - /app/backend
        - --config=/etc/backend/config.yaml
      
      # Resource requests and limits
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "4"
          memory: 8Gi
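The advanced example reads DATABASE_URL from a Secret named db-credentials, which must exist before instances can start. A minimal sketch of that Secret (the url value is a placeholder, not taken from this document):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  # Placeholder connection string; stringData is encoded to base64 on apply
  url: postgres://app:password@db.internal:5432/appdb
```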

Instance Templates

The template field defines the specification for each instance:

Machine Types

spec:
  template:
    spec:
      # GCP-style machine types
      machineType: e2-micro       # 2 vCPUs, 1 GB RAM
      # machineType: e2-small     # 2 vCPUs, 2 GB RAM
      # machineType: e2-medium    # 2 vCPUs, 4 GB RAM
      # machineType: n2-standard-4 # 4 vCPUs, 16 GB RAM
      # machineType: c2-standard-8 # 8 vCPUs, 32 GB RAM
Machine type names follow GCP conventions but work across providers. The Workload Operator translates them to provider-specific equivalents.

Image Types

Workloads support both VM images and container images:
spec:
  template:
    spec:
      image: nginx:latest
      # Or from a private registry
      # image: gcr.io/my-project/app:v1.0
      
      # Image pull credentials
      imagePullSecrets:
        - name: gcr-credentials
Container images run in a gVisor sandbox for secure execution.
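The gcr-credentials Secret referenced by imagePullSecrets is a standard Kubernetes registry-credentials Secret. A minimal sketch (username, password, and auth values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gcr-credentials
type: kubernetes.io/dockerconfigjson
stringData:
  # Placeholder credentials; replace with real registry auth before applying
  .dockerconfigjson: |
    {
      "auths": {
        "gcr.io": {
          "username": "_json_key",
          "password": "<service-account-json>",
          "auth": "<base64 of username:password>"
        }
      }
    }
```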

Placement Rules

Control where instances are deployed using placement constraints:

Topology Spread

Distribute instances across zones or regions:
spec:
  placement:
    topologySpreadConstraints:
      # Spread across zones
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web-app
      
      # Spread across providers
      - maxSkew: 2
        topologyKey: provider
        whenUnsatisfiable: ScheduleAnyway
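maxSkew bounds the difference between the most- and least-populated topology domains. A worked illustration, assuming replicas: 5 and three available zones:

```yaml
# With 5 replicas across three zones and maxSkew: 1, a distribution of
# 2 / 2 / 1 instances is allowed (skew = 2 - 1 = 1), while 3 / 1 / 1 is
# rejected (skew = 3 - 1 = 2) and, with DoNotSchedule, the extra instance
# stays unscheduled until a compliant placement exists.
spec:
  replicas: 5
  placement:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
```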

Node Selector

Require specific location or provider:
spec:
  placement:
    nodeSelector:
      topology.kubernetes.io/region: us-central1
      topology.kubernetes.io/zone: us-central1-a
      provider: gcp

Affinity and Anti-Affinity

spec:
  placement:
    affinity:
      # Prefer certain providers
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
                - key: provider
                  operator: In
                  values:
                    - gcp
                    - aws
      
      # Co-locate with other workloads
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: database
            topologyKey: topology.kubernetes.io/zone
      
      # Keep away from other workloads
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web-app
              topologyKey: kubernetes.io/hostname

Scaling

Manual Scaling

Change the replica count:
kubectl scale workload web-app --replicas=10
Or edit the resource:
kubectl edit workload web-app

Declarative Scaling

Update your YAML and reapply:
spec:
  replicas: 10  # Scaled from 3 to 10
kubectl apply -f workload.yaml
Automatic scaling via HPA (Horizontal Pod Autoscaler) is coming soon. Track progress in the enhancements repo.

Network Configuration

Attach instances to one or more networks:

Single Network

spec:
  template:
    spec:
      networkInterfaces:
        - networkRef:
            name: production-network

Multiple Networks

spec:
  template:
    spec:
      networkInterfaces:
        # Primary network (internet access)
        - networkRef:
            name: public-network
          primary: true
        
        # Backend network
        - networkRef:
            name: backend-network
          subnetRef:
            name: backend-subnet
        
        # Data network
        - networkRef:
            name: data-network

Static IP Assignment

spec:
  template:
    spec:
      networkInterfaces:
        - networkRef:
            name: production-network
          ipAddress: 10.0.1.100  # Static IP

Storage Volumes

Persistent Volumes

spec:
  template:
    spec:
      volumes:
        - name: app-data
          persistentVolumeClaim:
            claimName: app-data-pvc
        
        - name: config
          configMap:
            name: app-config
        
        - name: secrets
          secret:
            secretName: app-secrets
      
      volumeMounts:
        - name: app-data
          mountPath: /var/lib/app
        - name: config
          mountPath: /etc/app/config
          readOnly: true
        - name: secrets
          mountPath: /etc/app/secrets
          readOnly: true
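The app-config ConfigMap mounted above must exist with matching keys; each data key becomes a file under the mount path. A minimal sketch (the key name and contents are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  # Appears as /etc/app/config/app.yaml inside the instance
  app.yaml: |
    logLevel: info
```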

Workload Lifecycle

How Workloads are Deployed

1. User creates Workload
   kubectl apply -f workload.yaml
2. Workload Operator reconciles
   • Evaluates placement rules
   • Determines target locations (zones, providers)
   • Creates a WorkloadDeployment for each location
3. WorkloadDeployment creates Instances
   • Generates Instance resources from the template
   • Applies the replica count per location
4. Infrastructure Plugin provisions
   • Watches for Instance resources
   • Creates VMs or containers in the target provider
   • Attaches to networks
   • Mounts volumes
5. Status updates
   • Plugin updates Instance status to Running
   • Workload status becomes Ready

Workload Status

kubectl get workload web-app
NAME      REPLICAS   READY   AGE
web-app   3          3       5m
View detailed status:
kubectl describe workload web-app

Managing Workloads

Create a Workload

kubectl apply -f workload.yaml

List Workloads

kubectl get workloads

View Workload Details

kubectl describe workload web-app

Update a Workload

kubectl edit workload web-app
# Or
kubectl apply -f workload.yaml

Scale a Workload

kubectl scale workload web-app --replicas=5

Delete a Workload

kubectl delete workload web-app
Deleting a workload will terminate all instances. Data in non-persistent volumes will be lost.

View Instances

# List all instances
kubectl get instances

# Filter by workload
kubectl get instances -l workload=web-app

# Detailed instance info
kubectl describe instance web-app-us-central1-a-0

Troubleshooting

Workload not becoming Ready

# Check workload status
kubectl describe workload <workload-name>

# Check events
kubectl get events --field-selector involvedObject.name=<workload-name>

# Check Workload Operator logs
kubectl logs -n datum-system -l app=workload-operator

Instances not starting

# List instances
kubectl get instances

# Check specific instance
kubectl describe instance <instance-name>

# Check infrastructure plugin logs
kubectl logs -n datum-system -l app=infra-provider-gcp

Placement constraints not satisfied

# View placement events
kubectl describe workload <workload-name> | grep -A 20 Events

# Check available capacity
kubectl describe nodes

Best Practices

Use labels

Label workloads for organization and selection (app, tier, version, environment).
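A sketch of one such labeling scheme (the values are illustrative):

```yaml
metadata:
  labels:
    app: checkout
    tier: backend
    version: v1.4.2
    environment: production
```

Consistent labels then drive selection, for example kubectl get workloads -l environment=production.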

Spread for HA

Use topology spread constraints to distribute across zones.

Resource limits

Set resource requests and limits to ensure predictable performance.

Health checks

Configure liveness and readiness probes (coming soon).

Rolling updates

Use rolling update strategy for zero-downtime deployments.
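This document does not show the Workload update-strategy fields; the sketch below is hypothetical, mirroring the Kubernetes Deployment strategy shape, and should be checked against the Workload API reference:

```yaml
# HYPOTHETICAL: field names mirror Kubernetes Deployments and are not
# confirmed by this document.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # at most one instance down during the rollout
      maxSurge: 1        # at most one extra instance created temporarily
```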

Persistent data

Use persistent volumes for data that must survive instance restarts.
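Each claimName referenced in a template must point at an existing PersistentVolumeClaim. A minimal sketch for the app-data-pvc used earlier (size and access mode are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi  # placeholder size
```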

Common Patterns

Web Application Tier

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: web-frontend
spec:
  replicas: 5
  placement:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
  template:
    spec:
      machineType: e2-medium
      image: nginx:alpine
      networkInterfaces:
        - networkRef:
            name: public-network

Background Worker

apiVersion: compute.datumapis.com/v1alpha1
kind: Workload
metadata:
  name: worker
spec:
  replicas: 3
  template:
    spec:
      machineType: n2-standard-2
      image: gcr.io/my-project/worker:latest
      env:
        - name: QUEUE_URL
          value: redis://redis-service:6379
      networkInterfaces:
        - networkRef:
            name: backend-network

Next Steps

Gateways

Expose workloads with the Gateway API

Networks

Configure networking for workloads

Managing Resources

Learn kubectl commands for workloads

Workload Operator

Explore the source code
