gRPC API Overview

Introduction

Talos Linux exposes a comprehensive gRPC API for all system operations. The API provides complete control over node lifecycle, configuration, monitoring, and cluster management. All talosctl commands interact with Talos nodes through this gRPC API.

API Services

The Talos API is organized into three main services:

MachineService

The primary service for node operations including:

Configuration management
System lifecycle (reboot, shutdown, upgrade)
Container and process management
System monitoring and stats
etcd cluster management
File operations

View MachineService documentation

ClusterService

Cluster-wide operations:

Health checks across multiple nodes
Cluster validation

View ClusterService documentation

InspectService

Internal inspection and debugging:

Controller runtime dependencies
Resource graphs

View InspectService documentation

Authentication

Talos uses mutual TLS (mTLS) for API authentication. Each API request must include a valid client certificate signed by the Talos CA.
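As a rough illustration of what mutual TLS means on the client side, the sketch below assembles a Go `tls.Config` from PEM-encoded CA, client certificate, and client key material using only the standard library. The function name `newMTLSConfig` is hypothetical, and pinning the minimum version to TLS 1.3 is an assumption taken from the transport security claim in this document, not from Talos source code.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newMTLSConfig (hypothetical helper) builds a client-side TLS configuration
// from PEM-encoded material: the CA that signed the server certificate, the
// client certificate, and the client private key.
func newMTLSConfig(caPEM, certPEM, keyPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM input")
	}

	clientCert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("loading client key pair: %w", err)
	}

	return &tls.Config{
		RootCAs:      pool,                          // used to verify the server
		Certificates: []tls.Certificate{clientCert}, // presented to the server
		MinVersion:   tls.VersionTLS13,              // assumption, see lead-in
	}, nil
}

func main() {
	// Garbage input is rejected before any connection is attempted.
	_, err := newMTLSConfig([]byte("not pem"), nil, nil)
	fmt.Println("invalid input rejected:", err != nil)
}
```

A configuration built this way is the kind of value a Go gRPC client would pass as its TLS settings.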

Client Certificates

Client certificates are generated during cluster bootstrap and stored in the talosconfig file. The certificate includes:

Subject: Identifies the client
Roles: Defines permissions (os:admin, os:reader, etc.)
TTL: Certificate validity period (default: 365 days)

Generating Client Certificates

You can generate additional client certificates using the API:

import (
    "time"

    "github.com/siderolabs/talos/pkg/machinery/api/machine"
    "google.golang.org/protobuf/types/known/durationpb"
)

// Request an os:admin certificate that is valid for 24 hours.
client.GenerateClientConfiguration(ctx, &machine.GenerateClientConfigurationRequest{
    Roles:  []string{"os:admin"},
    CrtTtl: durationpb.New(24 * time.Hour),
})

Connection

Endpoints

The API is exposed on port 50000 by default. When connecting to a cluster, you can target:

Specific node: 10.0.0.1:50000
Control plane endpoint: Use the cluster endpoint from talosconfig
Load balanced: Through a load balancer (recommended for production)
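A client that accepts bare hosts as well as host:port pairs needs to fill in the default port. The following is a small sketch of that normalization using Go's `net` package; the helper name `withDefaultPort` is hypothetical, and the port value 50000 is taken from the text above.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// defaultAPIPort is the default Talos API port stated in this document.
const defaultAPIPort = "50000"

// withDefaultPort (hypothetical helper) returns endpoint unchanged when it
// already has a port and appends the default port otherwise. Bare IPv6
// addresses are bracketed by net.JoinHostPort.
func withDefaultPort(endpoint string) string {
	if _, _, err := net.SplitHostPort(endpoint); err == nil {
		return endpoint // already host:port
	}
	// Accept "[::1]" as well as "::1" for port-less IPv6 input.
	host := strings.TrimSuffix(strings.TrimPrefix(endpoint, "["), "]")
	return net.JoinHostPort(host, defaultAPIPort)
}

func main() {
	for _, e := range []string{"10.0.0.1", "10.0.0.1:443", "::1"} {
		fmt.Println(withDefaultPort(e))
	}
}
```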

Transport Security

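All API communication uses TLS 1.3. The ECDHE cipher suites listed below are TLS 1.2 suite identifiers and would apply only to a connection negotiated at TLS 1.2; TLS 1.3 selects from its own fixed set of AEAD suites (for example TLS_AES_256_GCM_SHA384 and TLS_CHACHA20_POLY1305_SHA256):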

TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305

Client Libraries

Official Go Client

The official Talos client library is written in Go:

import (
    "github.com/siderolabs/talos/pkg/machinery/client"
)

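// tlsConfig is a *tls.Config holding the client certificate and the Talos CA.
c, err := client.New(ctx,
    client.WithEndpoints("10.0.0.1"),
    client.WithTLSConfig(tlsConfig),
)
if err != nil {
    return err
}

defer c.Close()

// Call API methods; resp.Messages holds one entry per responding node.
resp, err := c.Version(ctx)
if err != nil {
    return err
}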

Using talosctl as a Client

The talosctl CLI tool is built on the Go client library and can be used as a reference implementation:

# Get version
talosctl -n 10.0.0.1 version

# Apply configuration
talosctl -n 10.0.0.1 apply-config --file config.yaml

# Stream logs
talosctl -n 10.0.0.1 logs kubelet

Building Custom Clients

You can build clients in any language that supports gRPC:

Get the proto definitions: Clone the Talos repository
Generate code: Use protoc with your language plugin
Implement authentication: Load client certificate and CA
Create gRPC channel: Connect with TLS credentials

# Python example
import grpc
from machine_pb2_grpc import MachineServiceStub
from google.protobuf import empty_pb2

# Load certificates
with open('client.crt', 'rb') as f:
    client_cert = f.read()
with open('client.key', 'rb') as f:
    client_key = f.read()
with open('ca.crt', 'rb') as f:
    ca_cert = f.read()

# Create credentials
creds = grpc.ssl_channel_credentials(
    root_certificates=ca_cert,
    private_key=client_key,
    certificate_chain=client_cert
)

# Connect
with grpc.secure_channel('10.0.0.1:50000', creds) as channel:
    stub = MachineServiceStub(channel)
    response = stub.Version(empty_pb2.Empty())
    print(response)

Request/Response Patterns

Unary RPCs

Most API methods use unary request-response:

rpc Version(google.protobuf.Empty) returns (VersionResponse);

Server Streaming

Some methods stream data back to the client:

rpc Logs(LogsRequest) returns (stream common.Data);
rpc Events(EventsRequest) returns (stream Event);

Client Streaming

Upload operations use client streaming:

rpc EtcdRecover(stream common.Data) returns (EtcdRecoverResponse);

Common Types

All API responses include metadata and common types. See Common Types for details.

Error Handling

gRPC Status Codes

The API uses standard gRPC status codes:

OK (0): Success
CANCELLED (1): Operation cancelled
INVALID_ARGUMENT (3): Invalid request parameters
DEADLINE_EXCEEDED (4): Request timeout
NOT_FOUND (5): Resource not found
PERMISSION_DENIED (7): Insufficient permissions
UNAVAILABLE (14): Service unavailable

Error Details

Errors include additional context in the metadata:

{
  "metadata": {
    "hostname": "worker-1",
    "error": "service kubelet is not running",
    "status": {
      "code": 5,
      "message": "not found"
    }
  }
}
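For tooling that consumes this JSON form, a minimal decoding sketch follows. The struct and function names are hypothetical; the field names mirror the example above, and the sketch assumes the JSON layout is exactly as shown there.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// status mirrors the nested "status" object in the example above.
type status struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// metadata mirrors the "metadata" object in the example above.
type metadata struct {
	Hostname string `json:"hostname"`
	Error    string `json:"error"`
	Status   status `json:"status"`
}

// envelope is the outer JSON object that wraps metadata.
type envelope struct {
	Metadata metadata `json:"metadata"`
}

// parseErrorMetadata (hypothetical helper) decodes the JSON error form.
func parseErrorMetadata(raw []byte) (metadata, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return metadata{}, fmt.Errorf("decoding metadata: %w", err)
	}
	return env.Metadata, nil
}

func main() {
	raw := []byte(`{"metadata":{"hostname":"worker-1","error":"service kubelet is not running","status":{"code":5,"message":"not found"}}}`)
	md, err := parseErrorMetadata(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %s (code %d)\n", md.Hostname, md.Error, md.Status.Code)
}
```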

Multi-Node Requests

Many API calls can target multiple nodes simultaneously:

# Target multiple nodes
talosctl -n 10.0.0.1,10.0.0.2,10.0.0.3 version

# Omit -n to use the nodes configured in the current talosconfig context
talosctl version

Responses include metadata identifying which node responded:

message Metadata {
  string hostname = 1;
  string error = 2;
  google.rpc.Status status = 3;
}

API Versioning

The Talos API follows semantic versioning:

Major version: Breaking changes (reflected in proto package)
Minor version: Backward-compatible additions
Patch version: Backward-compatible fixes

Deprecation Policy

Deprecated methods include annotations indicating when they will be removed:

rpc ImageList(ImageListRequest) returns (stream ImageListResponse) {
  option (common.remove_deprecated_method) = "v1.18";
  option deprecated = true;
}
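A client could use such an annotation to decide whether a deprecated method is still safe to call against a given server version. The sketch below compares major.minor versions with the standard library only. It assumes the annotation value names the first release in which the method is absent; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor extracts major and minor from "v1.18" or "v1.18.2".
func parseMajorMinor(v string) (major, minor int, err error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q needs at least major.minor", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("bad major in %q: %w", v, err)
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("bad minor in %q: %w", v, err)
	}
	return major, minor, nil
}

// removedIn (hypothetical helper) reports whether a method annotated for
// removal in version removal is already gone in server version server.
func removedIn(server, removal string) (bool, error) {
	sMaj, sMin, err := parseMajorMinor(server)
	if err != nil {
		return false, err
	}
	rMaj, rMin, err := parseMajorMinor(removal)
	if err != nil {
		return false, err
	}
	if sMaj != rMaj {
		return sMaj > rMaj, nil
	}
	return sMin >= rMin, nil
}

func main() {
	for _, server := range []string{"v1.17.3", "v1.18.0"} {
		gone, err := removedIn(server, "v1.18")
		fmt.Println(server, "removed:", gone, "err:", err)
	}
}
```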

Rate Limiting

The API does not enforce rate limiting, but clients should:

Implement exponential backoff on errors
Avoid polling; use streaming RPCs where available
Batch operations when possible
Respect UNAVAILABLE status codes

Best Practices

Use Streaming for Real-Time Data

For logs, events, and monitoring data, use streaming RPCs instead of polling:

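// Good: Use streaming (c.MachineClient is the generated MachineService client)
stream, err := c.MachineClient.Events(ctx, &machine.EventsRequest{TailEvents: 10})
if err != nil {
    return err
}
for {
    event, err := stream.Recv()
    if errors.Is(err, io.EOF) {
        break // server closed the stream
    }
    if err != nil {
        return err // surface real failures instead of silently stopping
    }
    // Process event
    _ = event
}

// Bad: Don't poll
for {
    events, _ := c.GetEvents(ctx) // This doesn't exist
    time.Sleep(1 * time.Second)
}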

Handle Partial Failures

When targeting multiple nodes, some may fail:

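resp, err := c.Version(ctx)
if err != nil {
    return err
}
for _, msg := range resp.Messages {
    // The generated getters are nil-safe, so a missing Metadata does not panic.
    if nodeErr := msg.GetMetadata().GetError(); nodeErr != "" {
        log.Printf("Node %s failed: %s", msg.GetMetadata().GetHostname(), nodeErr)
        continue
    }
    // Process successful response
}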

Set Appropriate Timeouts

Different operations have different time requirements:

// Quick operations: 30 seconds
quickCtx, quickCancel := context.WithTimeout(context.Background(), 30*time.Second)
defer quickCancel()
_, err := c.Version(quickCtx)

// Long operations: 10 minutes (upgrades take an installer image)
longCtx, longCancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer longCancel()
_, err = c.MachineClient.Upgrade(longCtx, &machine.UpgradeRequest{Image: "ghcr.io/siderolabs/installer:v1.7.0"})
