High Availability (HA) support allows Tekton Pipelines components to remain operational when disruptions occur, such as nodes being drained for upgrades or instance failures.
## Overview
Tekton Pipelines provides HA support for two main components:
- **Controller** - uses an active/active model, with the workqueue distributed across buckets and each replica owning a subset of them
- **Webhook** - a stateless deployment that can be easily scaled and auto-scaled
By default, both components run with a single replica to reduce resource usage, effectively disabling HA.
## Controller High Availability
The Controller achieves HA through an active/active model where all replicas can receive and process work items. The workqueue is distributed across buckets, with each replica owning a subset of those buckets.
### Configuring Controller Replicas

To enable HA for the Controller, increase the replica count to more than one:

```bash
kubectl -n tekton-pipelines scale deployment tekton-pipelines-controller --replicas=3
```
Or modify the controller deployment directly in `config/controller.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
spec:
  replicas: 3
  # ... rest of configuration
```
### Leader Election Configuration

Leader election is configured in the `config-leader-election-controller` ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-leader-election-controller
  namespace: tekton-pipelines
data:
  buckets: "1"
  lease-duration: "60s"
  renew-deadline: "40s"
  retry-period: "10s"
```
- `buckets`: The number of buckets used to partition the key space of each Reconciler. If this number is M and the replica count is N, the N replicas compete for ownership of the M buckets.
- `lease-duration`: How long non-leaders wait before trying to acquire the lock. Core Kubernetes controllers use 15s.
- `renew-deadline`: How long a leader will try to renew the lease before giving up. Core Kubernetes controllers use 10s.
- `retry-period`: How long the leader election client waits between action attempts. Core Kubernetes controllers use 2s.
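To see why the bucket count matters, here is a minimal Python sketch of how a key space can be partitioned into buckets. The hash function and key format are illustrative assumptions, not Tekton's exact implementation; the point is that each key lands deterministically in exactly one bucket, and a bucket is processed only by the replica that owns it:

```python
import hashlib

def bucket_for(key: str, num_buckets: int) -> int:
    """Map a reconciler key (e.g. "namespace/name") to a bucket index.

    Illustrative only: Tekton/Knative use their own hashing scheme; what
    matters is that the mapping is deterministic, so every replica agrees
    on which bucket a given key belongs to.
    """
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets

keys = [f"default/pipelinerun-{i}" for i in range(8)]

# With buckets: "1", every key falls into bucket 0, so a single replica
# (the owner of bucket 0) ends up processing all work items.
assert {bucket_for(k, 1) for k in keys} == {0}

# With more buckets, keys spread out and N replicas can share the load.
print(sorted({bucket_for(k, 10) for k in keys}))  # which buckets are in use
```

This is why the default `buckets: "1"` limits how much scaling out the controller helps: with one bucket there is only one owner doing work at a time.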
### How Leader Election Works

- The workqueue is divided into buckets based on the `buckets` configuration
- Each controller replica competes to become the leader of specific buckets
- The replica that owns a bucket processes all work items partitioned into that bucket
- If a replica fails, other replicas can take over its buckets
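As a rough mental model of failover timing (a back-of-the-envelope sketch of client-go leader-election behavior, not an exact guarantee from the Tekton codebase), work in a failed replica's buckets may stall until its lease ages out and a contender notices on its next retry:

```python
def worst_case_failover_seconds(lease_duration: float, retry_period: float) -> float:
    """Approximate upper bound on how long a bucket can sit unowned.

    Assumption: after the old leader stops renewing, the lease must expire
    (up to lease-duration) and a contender must attempt acquisition on its
    next retry (up to retry-period). Simplified model only; it ignores
    clock skew and API-server latency.
    """
    return lease_duration + retry_period

# With the defaults above (lease-duration: 60s, retry-period: 10s):
print(worst_case_failover_seconds(60.0, 10.0))  # 70.0
```

Shorter lease durations shrink this window at the cost of more frequent lease-renewal traffic, which is why the core Kubernetes controllers use tighter values (15s/10s/2s).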
### Disabling Controller HA

To disable HA, scale back to one replica:

```bash
kubectl -n tekton-pipelines scale deployment tekton-pipelines-controller --replicas=1
```
Alternatively, set the `disable-ha` flag in the controller deployment:

```yaml
spec:
  serviceAccountName: tekton-pipelines-controller
  containers:
  - name: tekton-pipelines-controller
    args:
    - "-disable-ha=true"
    # Other flags...
```
If you set `-disable-ha=true` and run multiple replicas, each replica processes every work item independently, leading to unwanted behavior when creating resources. Rather than using the flag, simply scale down to one replica.
## Webhook High Availability
The Webhook deployment is stateless, making it easier to configure for HA and enabling autoscaling based on load.
### Configuring Webhook Replicas

Increase the number of webhook replicas:

```bash
kubectl -n tekton-pipelines scale deployment tekton-pipelines-webhook --replicas=3
```
Or modify the webhook deployment in `config/webhook.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  replicas: 3
  # ... rest of configuration
```
### Horizontal Pod Autoscaling

Tekton Pipelines includes a HorizontalPodAutoscaler for the webhook:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  minReplicas: 1
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tekton-pipelines-webhook
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 100
```
To increase the minimum number of replicas:

```bash
kubectl -n tekton-pipelines patch hpa tekton-pipelines-webhook \
  --patch '{"spec":{"minReplicas":3}}'
```

The HorizontalPodAutoscaler requires a Metrics Server in your cluster to function properly.
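The `averageUtilization` target drives a simple ratio calculation. This Python sketch mirrors the HPA algorithm documented by Kubernetes (`desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization)`), leaving out the stabilization windows, tolerance band, and scaling policies the real controller also applies:

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Core HPA scaling formula: scale replicas proportionally to how far
    observed utilization (percent of CPU request) is from the target.
    Simplified: ignores tolerance, stabilization, and min/max clamping.
    """
    return math.ceil(current_replicas * current_utilization / target_utilization)

# With averageUtilization: 100 and webhook pods averaging 250% of their
# CPU request, one replica scales out to three:
print(desired_replicas(1, 250, 100))  # 3

# At or below target, the replica count holds steady (before clamping
# to the HPA's minReplicas/maxReplicas bounds):
print(desired_replicas(3, 100, 100))  # 3
```

A lower target (e.g. `averageUtilization: 80` as in the complete example below) makes the webhook scale out earlier, trading idle capacity for headroom during admission-request bursts.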
### Avoiding Disruptions

To ensure minimum webhook availability during node disruptions, define a PodDisruptionBudget:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: webhook
    app.kubernetes.io/component: webhook
    app.kubernetes.io/instance: default
    app.kubernetes.io/part-of: tekton-pipelines
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: webhook
      app.kubernetes.io/component: webhook
      app.kubernetes.io/instance: default
      app.kubernetes.io/part-of: tekton-pipelines
```
This ensures at least one webhook replica remains available during voluntary disruptions like node drains.
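How much headroom a PDB leaves for voluntary evictions is simple arithmetic; this sketch assumes all pods are currently healthy (the real controller counts only healthy pods toward `minAvailable`):

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of simultaneous voluntary evictions (e.g. node drains) a
    PDB with minAvailable permits, given the current healthy pod count."""
    return max(0, healthy_pods - min_available)

# A single webhook replica with minAvailable: 1 blocks drains entirely,
# which can stall cluster maintenance; three replicas leave room for two
# concurrent evictions:
print(allowed_disruptions(1, 1))  # 0
print(allowed_disruptions(3, 1))  # 2
```

This is why the replica count and the PDB should be tuned together: `minAvailable` equal to the replica count makes every drain wait indefinitely.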
### Pod Anti-Affinity

Webhook replicas are configured with pod anti-affinity by default to avoid scheduling all replicas on the same node:

```yaml
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: webhook
                  app.kubernetes.io/component: webhook
              topologyKey: kubernetes.io/hostname
            weight: 100
```
This ensures that a single node failure doesn’t make all webhook replicas unavailable.
### Cluster Autoscaler Considerations

By default, the webhook deployment is not configured to block the Cluster Autoscaler from scaling down nodes. During node drains, the webhook might become temporarily unavailable.

To prevent this, either:

- Add the `cluster-autoscaler.kubernetes.io/safe-to-evict` annotation:

  ```yaml
  spec:
    template:
      metadata:
        annotations:
          cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
  ```

- Configure multiple webhook replicas (recommended approach)
## Prerequisites for HA

### Metrics Server

High-concurrency scenarios and webhook autoscaling require a Metrics Server:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

Verify the Metrics Server is running:

```bash
kubectl get deployment metrics-server -n kube-system
```
## Complete HA Configuration Example

```yaml
# Controller with 3 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: controller
    spec:
      serviceAccountName: tekton-pipelines-controller
      containers:
      - name: tekton-pipelines-controller
        image: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/controller:latest
---
# Leader election with 10 buckets
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-leader-election-controller
  namespace: tekton-pipelines
data:
  buckets: "10"
  lease-duration: "60s"
  renew-deadline: "40s"
  retry-period: "10s"
---
# Webhook with 3 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: webhook
  template:
    metadata:
      labels:
        app.kubernetes.io/name: webhook
    spec:
      serviceAccountName: tekton-pipelines-webhook
      containers:
      - name: webhook
        image: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:latest
---
# Webhook HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  minReplicas: 3
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tekton-pipelines-webhook
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
---
# Webhook PDB
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: webhook
      app.kubernetes.io/component: webhook
```
## Verification

Verify your HA configuration:

```bash
# Check controller replicas
kubectl get deployment tekton-pipelines-controller -n tekton-pipelines

# Check webhook replicas
kubectl get deployment tekton-pipelines-webhook -n tekton-pipelines

# Check HPA status
kubectl get hpa -n tekton-pipelines

# Check PDB status
kubectl get pdb -n tekton-pipelines

# View leader election leases
kubectl get lease -n tekton-pipelines
```
## Best Practices
- Start with 3 replicas for both controller and webhook in production environments
- Configure PodDisruptionBudgets to maintain availability during cluster maintenance
- Use HPA for webhooks to handle variable load automatically
- Monitor metrics to tune replica counts and resource requests/limits
- Increase bucket count to 10 when running many controller replicas for better load distribution
- Test failover by draining nodes or deleting pods to verify HA behavior
- Set appropriate resource requests/limits to ensure pods can be scheduled across multiple nodes