Talos Linux provides native support for Google Cloud Platform (GCP) with automatic metadata discovery and platform-specific optimizations.
Overview
The GCP platform integration includes:
- Automatic network configuration via metadata service
- GCE-specific kernel parameters and console configuration
- Google Cloud DNS and NTP server configuration
- Instance metadata (region, zone, machine type, preemptible detection)
- User-data based configuration delivery via instance metadata
Prerequisites
- Google Cloud account with Compute Engine API enabled
- gcloud CLI configured
- talosctl installed
- (Optional) Terraform for infrastructure automation
Quick Start with Custom Image
Upload Talos Image
Download Talos Image
Download the GCP image:
curl -LO https://github.com/siderolabs/talos/releases/latest/download/gcp-amd64.raw.tar.gz
Create GCS Bucket
Create a bucket for images:
gsutil mb gs://talos-images-${PROJECT_ID}
Upload Image
Upload to Cloud Storage:
gsutil cp gcp-amd64.raw.tar.gz gs://talos-images-${PROJECT_ID}/
Create Image
Create a compute image:
gcloud compute images create talos-latest \
  --source-uri=gs://talos-images-${PROJECT_ID}/gcp-amd64.raw.tar.gz \
  --guest-os-features=VIRTIO_SCSI_MULTIQUEUE,UEFI_COMPATIBLE
Launch Instances
Generate Configuration
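Generate the machine configuration files (controlplane.yaml, worker.yaml, and talosconfig) for a cluster whose Kubernetes API endpoint is https://api.my-cluster.example.com:6443:
talosctl gen config my-cluster \
  https://api.my-cluster.example.com:6443
Create Control Plane
Launch a control plane instance, passing the machine configuration as user-data:
gcloud compute instances create talos-cp-1 \
  --image=talos-latest \
  --machine-type=n2-standard-2 \
  --boot-disk-size=50GB \
  --network=default \
  --metadata-from-file=user-data=controlplane.yaml \
  --zone=us-central1-a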
Get Instance IP
Retrieve the external IP:
INSTANCE_IP=$(gcloud compute instances describe talos-cp-1 \
  --zone=us-central1-a \
  --format='get(networkInterfaces[0].accessConfigs[0].natIP)')
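The same lookup can be done programmatically against the JSON form of the instance description (`--format=json`). The following is a minimal sketch, not part of Talos or gcloud: the `instance` type and the `externalIP` helper are hypothetical names, and the sample document is trimmed down to the two fields the `get(...)` expression above reads.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// instance mirrors only the fields of the instance description
// that are needed to locate IP addresses (hypothetical, trimmed type).
type instance struct {
	NetworkInterfaces []struct {
		NetworkIP     string `json:"networkIP"`
		AccessConfigs []struct {
			NatIP string `json:"natIP"`
		} `json:"accessConfigs"`
	} `json:"networkInterfaces"`
}

// externalIP returns the first external (NAT) IP found in the description.
func externalIP(describeJSON []byte) (string, error) {
	var inst instance
	if err := json.Unmarshal(describeJSON, &inst); err != nil {
		return "", fmt.Errorf("decoding instance JSON: %w", err)
	}
	for _, nic := range inst.NetworkInterfaces {
		for _, ac := range nic.AccessConfigs {
			if ac.NatIP != "" {
				return ac.NatIP, nil
			}
		}
	}
	return "", errors.New("instance has no external IP")
}

func main() {
	// Illustrative sample; real output contains many more fields.
	sample := []byte(`{"networkInterfaces":[{"networkIP":"10.128.0.2","accessConfigs":[{"natIP":"203.0.113.10"}]}]}`)
	ip, err := externalIP(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(ip)
}
```

Returning an error when no access config carries an address makes the "instance was created without a public IP" case explicit instead of yielding an empty string.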
Bootstrap Cluster
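Bootstrap etcd on the control plane node, then fetch the kubeconfig. The generated talosconfig does not point at the new instance, so pass the endpoint explicitly:
talosctl --talosconfig talosconfig bootstrap --endpoints $INSTANCE_IP --nodes $INSTANCE_IP
talosctl --talosconfig talosconfig kubeconfig --endpoints $INSTANCE_IP --nodes $INSTANCE_IP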
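Terraform Deployment
The same infrastructure can be described with Terraform:
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

variable "project_id" {
  type        = string
  description = "GCP project ID"
}

variable "region" {
  type        = string
  description = "GCP region, for example us-central1"
}

provider "google" {
  project = var.project_id
  region  = var.region
}

# Talos image referenced by the instances below.
# Assumption: the image tarball was uploaded to the bucket created in the Quick Start.
resource "google_compute_image" "talos" {
  name = "talos-latest"

  raw_disk {
    source = "https://storage.googleapis.com/talos-images-${var.project_id}/gcp-amd64.raw.tar.gz"
  }

  guest_os_features {
    type = "VIRTIO_SCSI_MULTIQUEUE"
  }
}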
# VPC Network
# Note: a custom VPC has no ingress firewall rules by default. Allow the
# Kubernetes API (6443) and Talos API (50000) ports as needed.
resource "google_compute_network" "talos" {
  name                    = "talos-network"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "talos" {
  name          = "talos-subnet"
  ip_cidr_range = "10.0.0.0/24"
  network       = google_compute_network.talos.id
  region        = var.region
}
# Control plane instances
resource "google_compute_instance" "control_plane" {
  count        = 3
  name         = "talos-cp-${count.index + 1}"
  machine_type = "n2-standard-2"
  zone         = "${var.region}-a"

  boot_disk {
    initialize_params {
      image = google_compute_image.talos.self_link
      size  = 50
      type  = "pd-ssd"
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.talos.id

    access_config {
      // Ephemeral public IP
    }
  }

  metadata = {
    user-data = file("controlplane.yaml")
  }

  labels = {
    role = "control-plane"
  }

  tags = ["talos-control-plane"]
}
# Worker instances
resource "google_compute_instance" "worker" {
  count        = 3
  name         = "talos-worker-${count.index + 1}"
  machine_type = "n2-standard-4"
  zone         = "${var.region}-a"

  boot_disk {
    initialize_params {
      image = google_compute_image.talos.self_link
      size  = 100
      type  = "pd-ssd"
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.talos.id

    access_config {
      // Ephemeral public IP
    }
  }

  metadata = {
    user-data = file("worker.yaml")
  }

  labels = {
    role = "worker"
  }

  tags = ["talos-worker"]
}
# Load balancer for control plane
resource "google_compute_address" "control_plane" {
  name = "talos-cp-lb-ip"
}

resource "google_compute_forwarding_rule" "control_plane" {
  name       = "talos-cp-lb"
  target     = google_compute_target_pool.control_plane.id
  port_range = "6443"
  ip_address = google_compute_address.control_plane.address
}

resource "google_compute_target_pool" "control_plane" {
  name = "talos-cp-pool"

  instances = [
    for instance in google_compute_instance.control_plane :
    instance.self_link
  ]

  health_checks = [
    google_compute_http_health_check.control_plane.name
  ]
}

# Caveat: this legacy health check probes over plain HTTP, while the
# Kubernetes API server serves TLS on port 6443. Verify that the check
# passes in your environment before relying on it.
resource "google_compute_http_health_check" "control_plane" {
  name               = "talos-cp-health"
  port               = 6443
  request_path       = "/readyz"
  check_interval_sec = 5
  timeout_sec        = 3
}
Network Configuration
Talos queries the GCP metadata service:
// From internal/app/machined/pkg/runtime/v1alpha1/platform/gcp/gcp.go
func (g *GCP) NetworkConfiguration(ctx context.Context, st state.State, ch chan<- *runtime.PlatformNetworkConfig) error {
	metadata, err := g.getMetadata(ctx)
	if err != nil {
		return fmt.Errorf("failed to receive GCP metadata: %w", err)
	}

	networkConfig := &runtime.PlatformNetworkConfig{
		Resolvers: []network.ResolverSpecSpec{{
			DNSServers:  []netip.Addr{netip.MustParseAddr("169.254.169.254")},
			ConfigLayer: network.ConfigPlatform,
		}},
		TimeServers: []network.TimeServerSpecSpec{{
			NTPServers:  []string{"metadata.google.internal"},
			ConfigLayer: network.ConfigPlatform,
		}},
	}
	// ...
}
Talos automatically discovers the following instance metadata (example values shown):
- Platform: gcp
- Project ID: my-project
- Region: us-central1
- Zone: us-central1-a
- Machine Type: n2-standard-2
- Instance ID: 1234567890
- Provider ID: gce://my-project/us-central1-a/instance-name
- Preemptible: true/false
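Several of these values are derived from one another. The sketch below shows one plausible derivation, assuming the zone arrives in the `projects/<project-number>/zones/<zone>` form that the metadata server returns for `instance/zone`. The helper names (`zoneFromMetadata`, `regionFromZone`, `providerID`) are hypothetical and are not the functions Talos uses internally.

```go
package main

import (
	"fmt"
	"strings"
)

// zoneFromMetadata extracts the zone name from a value of the form
// "projects/<project-number>/zones/<zone>". A bare zone name is returned as-is.
func zoneFromMetadata(value string) string {
	return value[strings.LastIndex(value, "/")+1:]
}

// regionFromZone drops the trailing zone letter: "us-central1-a" becomes "us-central1".
func regionFromZone(zone string) (string, error) {
	idx := strings.LastIndex(zone, "-")
	if idx <= 0 || idx == len(zone)-1 {
		return "", fmt.Errorf("unexpected zone format: %q", zone)
	}
	return zone[:idx], nil
}

// providerID builds the gce://<project>/<zone>/<instance-name> form listed above.
func providerID(project, zone, name string) string {
	return fmt.Sprintf("gce://%s/%s/%s", project, zone, name)
}

func main() {
	zone := zoneFromMetadata("projects/1234567890/zones/us-central1-a")
	region, err := regionFromZone(zone)
	if err != nil {
		panic(err)
	}
	fmt.Println(zone, region)                                     // us-central1-a us-central1
	fmt.Println(providerID("my-project", zone, "instance-name")) // gce://my-project/us-central1-a/instance-name
}
```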
Kernel Arguments
Talos boots GCP instances with the following kernel arguments:
console=ttyS0 net.ifnames=0 talos.dashboard.disabled=1 sysctl.kernel.kexec_load_disabled=1
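console=ttyS0 sends console output to the first serial port, net.ifnames=0 disables predictable interface naming, and talos.dashboard.disabled=1 turns off the Talos dashboard. sysctl.kernel.kexec_load_disabled=1 disables kexec, because GCP VMs sometimes hang during kexec operations.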
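To check these arguments on a running system, the command line can be split into key/value pairs. This is a small illustrative sketch; `parseCmdline` is a hypothetical helper, not the parser Talos itself uses, and it deliberately ignores quoting rules that a full kernel command-line parser would handle.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCmdline splits a kernel command line into key/value pairs.
// Parameters without "=" map to an empty string; if a key repeats,
// the last occurrence wins.
func parseCmdline(cmdline string) map[string]string {
	params := make(map[string]string)
	for _, field := range strings.Fields(cmdline) {
		key, value, _ := strings.Cut(field, "=")
		params[key] = value
	}
	return params
}

func main() {
	params := parseCmdline("console=ttyS0 net.ifnames=0 talos.dashboard.disabled=1 sysctl.kernel.kexec_load_disabled=1")
	fmt.Println("console:", params["console"])
	fmt.Println("kexec disabled:", params["sysctl.kernel.kexec_load_disabled"] == "1")
}
```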
Configuration
Machine Configuration Patches
machine:
  kubelet:
    extraArgs:
      cloud-provider: external
    nodeIP:
      validSubnets:
        - 10.0.0.0/8
cluster:
  externalCloudProvider:
    enabled: true
    manifests:
      - https://raw.githubusercontent.com/kubernetes/cloud-provider-gcp/master/deploy/cloud-controller-manager.yaml
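The `validSubnets` setting restricts which of a node's addresses the kubelet may use as its node IP. The sketch below illustrates the idea of that filter with the standard `net/netip` package. It is an assumption-level illustration only: `firstValidIP` is a hypothetical helper, and the actual selection logic in Talos may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// firstValidIP returns the first candidate address that falls inside
// any of the given subnets (CIDR notation).
func firstValidIP(candidates, subnets []string) (netip.Addr, error) {
	prefixes := make([]netip.Prefix, 0, len(subnets))
	for _, s := range subnets {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return netip.Addr{}, fmt.Errorf("parsing subnet %q: %w", s, err)
		}
		prefixes = append(prefixes, p)
	}
	for _, c := range candidates {
		addr, err := netip.ParseAddr(c)
		if err != nil {
			return netip.Addr{}, fmt.Errorf("parsing address %q: %w", c, err)
		}
		for _, p := range prefixes {
			if p.Contains(addr) {
				return addr, nil
			}
		}
	}
	return netip.Addr{}, errors.New("no candidate address matches the valid subnets")
}

func main() {
	// The external address is skipped; the internal one matches 10.0.0.0/8.
	addr, err := firstValidIP([]string{"203.0.113.10", "10.0.0.5"}, []string{"10.0.0.0/8"})
	if err != nil {
		panic(err)
	}
	fmt.Println(addr) // 10.0.0.5
}
```

With the `10.0.0.0/8` subnet from the patch above, an instance's external address is never chosen, so the node registers with its internal VPC address.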
Service Account
Create a service account and grant it the roles the cluster needs:
gcloud iam service-accounts create talos-cluster \
  --display-name="Talos Cluster Service Account"
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:talos-cluster@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/compute.admin"
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:talos-cluster@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountUser"
Attach the service account when creating instances by adding these flags to the create command (the other flags are omitted here):
gcloud compute instances create talos-cp-1 \
  --service-account=talos-cluster@${PROJECT_ID}.iam.gserviceaccount.com \
  --scopes=cloud-platform
Storage
Persistent Disks
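Request a GCP Persistent Disk through a PersistentVolumeClaim. This assumes a StorageClass named pd-ssd already exists in the cluster:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pd-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 100Gi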
GCS Integration
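The patch below tunes etcd snapshotting. Note that snapshot-count only sets how many committed transactions trigger an etcd snapshot to disk; it does not upload anything to GCS, so backups to a bucket must be set up separately:
cluster:
  etcd:
    extraArgs:
      snapshot-count: "10000"
      # Configure external backup to GCS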
IPv6 Support
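GCP supports dual-stack networking. When the metadata service reports IPv6 addresses and an IPv6 gateway for an interface, Talos configures them automatically:
// Automatic IPv6 configuration
for _, ipv6addr := range iface.IPv6 {
	if ipv6addr != "" && iface.GatewayIPv6 != "" {
		// Configure IPv6 address and route
	}
}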
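The excerpt above leaves out what happens inside the loop. The following self-contained sketch shows one way the address and route data could be assembled. Everything here is an assumption for illustration: the `iface` and `ipv6Config` types are hypothetical stand-ins for the Talos types, and the prefix length is passed in by the caller rather than taken from the platform.

```go
package main

import (
	"fmt"
	"net/netip"
)

// iface is a hypothetical, trimmed-down view of one interface entry
// from the metadata service.
type iface struct {
	IPv6        []string
	GatewayIPv6 string
}

// ipv6Config pairs an interface address with the gateway for its default route.
type ipv6Config struct {
	Address netip.Prefix
	Gateway netip.Addr
}

// ipv6Configs applies the same guard as the excerpt: an entry is produced
// only when both an address and a gateway are present.
func ipv6Configs(in iface, prefixBits int) ([]ipv6Config, error) {
	var out []ipv6Config
	for _, raw := range in.IPv6 {
		if raw == "" || in.GatewayIPv6 == "" {
			continue
		}
		addr, err := netip.ParseAddr(raw)
		if err != nil {
			return nil, fmt.Errorf("parsing address %q: %w", raw, err)
		}
		gw, err := netip.ParseAddr(in.GatewayIPv6)
		if err != nil {
			return nil, fmt.Errorf("parsing gateway %q: %w", in.GatewayIPv6, err)
		}
		out = append(out, ipv6Config{Address: netip.PrefixFrom(addr, prefixBits), Gateway: gw})
	}
	return out, nil
}

func main() {
	// Documentation-range addresses; 128 is an illustrative prefix length.
	cfgs, err := ipv6Configs(iface{IPv6: []string{"2001:db8::1"}, GatewayIPv6: "fe80::1"}, 128)
	if err != nil {
		panic(err)
	}
	for _, c := range cfgs {
		fmt.Printf("address %s via %s\n", c.Address, c.Gateway)
	}
}
```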
Troubleshooting
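Metadata Service
Query the metadata service directly and compare the result with what Talos discovered:
# From a pod running on the node (Talos has no shell or SSH access)
curl -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/
talosctl get platformmetadata --nodes <node-ip>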
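The metadata request can also be issued from Go, for example in a diagnostic tool running in a pod. This is a sketch under stated assumptions: `fetchMetadata`, `newFakeMetadataServer`, and `fetchFromFake` are hypothetical helpers, and the fake server only imitates the header check and a single path so the example runs anywhere. On a real instance the base URL would be `http://metadata.google.internal`.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fetchMetadata requests one metadata path and returns the trimmed body.
// The Metadata-Flavor header is required by the metadata service.
func fetchMetadata(ctx context.Context, client *http.Client, baseURL, path string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/computeMetadata/v1/"+path, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Metadata-Flavor", "Google")

	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("metadata request for %q failed: %s", path, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(body)), nil
}

// newFakeMetadataServer imitates the header check and one path, for local runs only.
func newFakeMetadataServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Metadata-Flavor") != "Google" {
			http.Error(w, "missing Metadata-Flavor header", http.StatusForbidden)
			return
		}
		if r.URL.Path == "/computeMetadata/v1/instance/zone" {
			fmt.Fprint(w, "projects/123456/zones/us-central1-a")
			return
		}
		http.NotFound(w, r)
	}))
}

// fetchFromFake starts the fake server, fetches one path, and shuts it down.
func fetchFromFake(path string) (string, error) {
	srv := newFakeMetadataServer()
	defer srv.Close()
	return fetchMetadata(context.Background(), srv.Client(), srv.URL, path)
}

func main() {
	zone, err := fetchFromFake("instance/zone")
	if err != nil {
		panic(err)
	}
	fmt.Println(zone)
}
```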
Serial Console
Access serial console:
gcloud compute connect-to-serial-port talos-cp-1 --zone=us-central1-a
Logs
View the machined service logs:
talosctl logs machined --nodes <node-ip>
Next Steps
- Cloud Controller: Configure the GCP Cloud Controller Manager
- Load Balancers: Set up GCP load balancing