The kubeagent is a Kubernetes-native OpenTelemetry collector agent designed to run as pods or daemonsets in Kubernetes clusters. It uses environment variables and ConfigMaps for configuration, with automatic in-cluster authentication.
Overview
Unlike kmagent, the kubeagent has no CLI commands or flags. It's designed to run as a Kubernetes workload and reads all configuration from environment variables.
Key features:
Automatic in-cluster Kubernetes authentication
ConfigMap-based collector configuration
Support for both Deployment and DaemonSet modes
Graceful signal handling (SIGINT, SIGTERM)
Structured JSON logging
Deployment Modes
The agent supports two deployment modes, each with its own configuration file:
Deployment: Single instance for cluster-wide metrics and logs. Uses /etc/kmagent/agent-deployment.yaml
DaemonSet: One instance per node for host metrics and node-level telemetry. Uses /etc/kmagent/agent-daemonset.yaml
From internal/k8sagent/agent.go:160-168:
func (c *K8sAgent) otelConfigPath() string {
	daemonsetURI := "/etc/kmagent/agent-daemonset.yaml"
	deploymentURI := "/etc/kmagent/agent-deployment.yaml"
	if c.Cfg.DeploymentMode == "DAEMONSET" {
		return daemonsetURI
	} else {
		return deploymentURI
	}
}
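Since both paths live under /etc/kmagent, the two configurations can be shipped in a single ConfigMap, one key per mode. A sketch (pipeline contents elided):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kmagent-config
  namespace: kloudmate
data:
  # Key names must match the file names the agent resolves above.
  agent-deployment.yaml: |
    # cluster-wide collector pipeline
  agent-daemonset.yaml: |
    # per-node collector pipeline
```

Mounting this ConfigMap at /etc/kmagent makes both files available; DEPLOYMENT_MODE selects which one the agent loads.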
Configuration
All configuration is provided via environment variables. The agent reads environment variables on startup and validates required fields.
Environment Variables
KM_API_KEY (string, required)
API key for authenticating with KloudMate services. Required for remote configuration and telemetry export. Example:
- name: KM_API_KEY
  valueFrom:
    secretKeyRef:
      name: kloudmate-secret
      key: api-key

KM_COLLECTOR_ENDPOINT (string, required)
OpenTelemetry exporter endpoint where telemetry data is sent. Must be a valid HTTP/HTTPS URL. Example:
- name: KM_COLLECTOR_ENDPOINT
  value: "https://otel.kloudmate.com:4318"

CONFIGMAP_NAME (string, required)
Name of the ConfigMap containing the OpenTelemetry collector configuration. Example:
- name: CONFIGMAP_NAME
  value: "kmagent-config"

POD_NAMESPACE (string, required)
Namespace where the agent pod is running. Typically injected using the downward API. Example:
- name: POD_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace

DEPLOYMENT_MODE (string, default: "DEPLOYMENT")
Deployment mode for the agent. Valid values: DEPLOYMENT, DAEMONSET (case-insensitive). Example:
- name: DEPLOYMENT_MODE
  value: "DAEMONSET"

KM_CONFIG_CHECK_INTERVAL (string)
Interval for checking configuration updates. Format depends on implementation. Example:
- name: KM_CONFIG_CHECK_INTERVAL
  value: "60"
Configuration Loading
From internal/k8sagent/agent.go:125-142:
func NewK8sConfig() *K8sConfig {
	config := &K8sConfig{
		ConfigCheckInterval: os.Getenv("KM_CONFIG_CHECK_INTERVAL"),
		APIKey:              os.Getenv("KM_API_KEY"),
		CollectorEndpoint:   os.Getenv("KM_COLLECTOR_ENDPOINT"),
		ConfigMapName:       os.Getenv("CONFIGMAP_NAME"),
		DeploymentMode:      os.Getenv("DEPLOYMENT_MODE"),
		PodNamespace:        os.Getenv("POD_NAMESPACE"),
	}
	if strings.ToUpper(config.DeploymentMode) == "DAEMONSET" {
		config.DeploymentMode = "DAEMONSET"
	} else {
		config.DeploymentMode = "DEPLOYMENT"
	}
	return config
}
Validation
The agent validates required configuration on startup:
From internal/k8sagent/agent.go:144-158:
func (c *K8sConfig) Validate() error {
	if c.APIKey == "" {
		return fmt.Errorf("KM_API_KEY is required")
	}
	if c.CollectorEndpoint == "" {
		return fmt.Errorf("KM_COLLECTOR_ENDPOINT is required")
	}
	if c.ConfigMapName == "" {
		return fmt.Errorf("CONFIGMAP_NAME is required")
	}
	if c.PodNamespace == "" {
		return fmt.Errorf("POD_NAMESPACE is required")
	}
	return nil
}
Kubernetes Authentication
The agent automatically uses in-cluster Kubernetes configuration:
From internal/k8sagent/agent.go:60-69:
kubecfg, err := rest.InClusterConfig()
if err != nil {
	return nil, fmt.Errorf("failed to load in-cluster config: %w", err)
}
logger.Info("loaded in-cluster kubernetes config")
k8sClient, err := kubernetes.NewForConfig(kubecfg)
if err != nil {
	return nil, fmt.Errorf("failed to create kubernetes client: %w", err)
}
The agent must run inside a Kubernetes cluster with appropriate RBAC permissions to read ConfigMaps in its namespace.
Agent Initialization
The agent initializes with version information and logs startup details:
From internal/k8sagent/agent.go:49-87:
func NewK8sAgent(info *AgentInfo) (*K8sAgent, error) {
	zapCfg := zap.NewProductionConfig()
	zapCfg.Level = zap.NewAtomicLevelAt(kmlogger.ParseLogLevel())
	zapLogger, err := zapCfg.Build()
	if err != nil {
		return nil, fmt.Errorf("failed to initialize logger: %w", err)
	}
	logger := zapLogger.Sugar()
	cfg := NewK8sConfig()
	kubecfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, fmt.Errorf("failed to load in-cluster config: %w", err)
	}
	logger.Info("loaded in-cluster kubernetes config")
	k8sClient, err := kubernetes.NewForConfig(kubecfg)
	if err != nil {
		return nil, fmt.Errorf("failed to create kubernetes client: %w", err)
	}
	agent := &K8sAgent{
		Cfg:       cfg,
		Logger:    logger,
		K8sClient: k8sClient,
		AgentInfo: *info,
		stopCh:    make(chan struct{}),
	}
	agent.AgentInfo.setEnvForAgentVersion()
	agent.AgentInfo.CollectorVersion = version.GetCollectorVersion()
	logger.Infow("kube agent initialized",
		"version", info.Version,
		"commit", info.CommitSHA,
		"deploymentMode", cfg.DeploymentMode,
	)
	return agent, nil
}
Signal Handling
The agent handles OS signals for graceful shutdown:
From cmd/kubeagent/main.go:46-57:
// handleSignals sets up a signal handler to gracefully shut down the agent.
func handleSignals(cancelFunc context.CancelFunc, agent *k8sagent.K8sAgent) {
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		sig := <-sigChan
		agent.Logger.Warnf("Received signal %s, initiating shutdown...", sig)
		cancelFunc()
		agent.Stop()
	}()
}
Handled signals:
SIGINT (Ctrl+C)
SIGTERM (Kubernetes pod termination)
Main Loop
The agentβs main function is minimal, delegating to the agent implementation:
From cmd/kubeagent/main.go:18-44:
func main() {
	// Set up the main context for the application
	appCtx, cancelAppCtx := context.WithCancel(context.Background())
	defer cancelAppCtx()
	agent, err := k8sagent.NewK8sAgent(&k8sagent.AgentInfo{Version: version, CommitSHA: commit})
	if err != nil {
		log.Fatal(err)
	}
	// Handle OS signals for graceful shutdown.
	handleSignals(cancelAppCtx, agent)
	if err = agent.StartAgent(appCtx); err != nil {
		agent.Logger.Errorf("agent could not be started with current config: %s", err.Error())
	}
	agent.AwaitShutdown()
	defer func() {
		// Ensure logger is synced before exit to flush buffered logs.
		if syncErr := agent.Logger.Sync(); syncErr != nil && syncErr.Error() != "sync /dev/stdout: invalid argument" {
			agent.Logger.Warnf("Failed to sync logger: %v\n", syncErr)
		}
	}()
}
Deployment Example
Deployment Mode
Deploy as a single instance for cluster-wide telemetry:
apiVersion: v1
kind: Secret
metadata:
  name: kloudmate-secret
  namespace: kloudmate
type: Opaque
stringData:
  api-key: "km_xxxxxxxxxxxxx"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kmagent-config
  namespace: kloudmate
data:
  agent-deployment.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 10s
        send_batch_size: 1024
    exporters:
      otlphttp:
        endpoint: ${env:KM_COLLECTOR_ENDPOINT}
        headers:
          api-key: ${env:KM_API_KEY}
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubeagent
  namespace: kloudmate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kubeagent
  template:
    metadata:
      labels:
        app: kubeagent
    spec:
      serviceAccountName: kubeagent
      containers:
        - name: kubeagent
          image: kloudmate/kubeagent:latest
          env:
            - name: KM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: kloudmate-secret
                  key: api-key
            - name: KM_COLLECTOR_ENDPOINT
              value: "https://otel.kloudmate.com:4318"
            - name: DEPLOYMENT_MODE
              value: "DEPLOYMENT"
            - name: CONFIGMAP_NAME
              value: "kmagent-config"
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: KM_CONFIG_CHECK_INTERVAL
              value: "60"
          volumeMounts:
            - name: config
              mountPath: /etc/kmagent
      volumes:
        - name: config
          configMap:
            name: kmagent-config
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubeagent
  namespace: kloudmate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeagent
  namespace: kloudmate
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeagent
  namespace: kloudmate
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeagent
subjects:
  - kind: ServiceAccount
    name: kubeagent
    namespace: kloudmate
Deploy:
kubectl apply -f deployment.yaml
DaemonSet Mode
Deploy as a DaemonSet for per-node telemetry:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kubeagent
  namespace: kloudmate
spec:
  selector:
    matchLabels:
      app: kubeagent
  template:
    metadata:
      labels:
        app: kubeagent
    spec:
      serviceAccountName: kubeagent
      hostNetwork: true
      hostPID: true
      containers:
        - name: kubeagent
          image: kloudmate/kubeagent:latest
          env:
            - name: KM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: kloudmate-secret
                  key: api-key
            - name: KM_COLLECTOR_ENDPOINT
              value: "https://otel.kloudmate.com:4318"
            - name: DEPLOYMENT_MODE
              value: "DAEMONSET"
            - name: CONFIGMAP_NAME
              value: "kmagent-config"
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: config
              mountPath: /etc/kmagent
            - name: hostfs
              mountPath: /hostfs
              readOnly: true
          securityContext:
            privileged: true
      volumes:
        - name: config
          configMap:
            name: kmagent-config
        - name: hostfs
          hostPath:
            path: /
RBAC Permissions
The agent requires minimal RBAC permissions:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubeagent
  namespace: kloudmate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeagent
  namespace: kloudmate
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "watch", "list"]
    resourceNames: ["kmagent-config"] # Restrict to specific ConfigMap
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeagent
  namespace: kloudmate
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeagent
subjects:
  - kind: ServiceAccount
    name: kubeagent
    namespace: kloudmate
For cluster-wide monitoring (e.g., Kubernetes metrics receiver), you may need ClusterRole permissions to watch nodes, pods, and other cluster resources.
Logging
The agent uses structured JSON logging with configurable log levels:
zapCfg := zap.NewProductionConfig()
zapCfg.Level = zap.NewAtomicLevelAt(kmlogger.ParseLogLevel())
Log output:
Format: JSON with structured fields
Level: Configurable via environment variable
Output: stdout (captured by Kubernetes)
Example log entry:
{
  "level": "info",
  "ts": "2024-03-15T10:30:45.123Z",
  "msg": "kube agent initialized",
  "version": "0.1.0",
  "commit": "abc123",
  "deploymentMode": "DEPLOYMENT"
}
Version Information
The agent tracks version information and exposes it for telemetry:
From cmd/kubeagent/main.go:13-16:
var (
	version = "0.1.0"
	commit  = "none"
)
From internal/k8sagent/agent.go:170-173:
// setEnvForAgentVersion sets agent version on env for otel processor
func (r *AgentInfo) setEnvForAgentVersion() {
	os.Setenv("KM_AGENT_VERSION", r.Version)
}
The KM_AGENT_VERSION environment variable can be referenced in OpenTelemetry processor configuration to tag telemetry with agent version.
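For instance, a resource processor in the collector pipeline could stamp the version onto all telemetry. A sketch (the attribute key km.agent.version is illustrative, not a name the agent defines):

```yaml
processors:
  resource/agent:
    attributes:
      - key: km.agent.version
        value: ${env:KM_AGENT_VERSION}
        action: upsert
```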
Graceful Shutdown
The agent implements graceful shutdown:
From internal/k8sagent/agent.go:108-115:
// Stop stops the underlying collector.
func (a *K8sAgent) Stop() {
	a.Logger.Info("stopping collector agent")
	close(a.stopCh)
	a.wg.Wait()
	a.stopInternalCollector()
	a.Logger.Info("collector agent stopped")
}
Shutdown sequence:
Receive signal (SIGINT or SIGTERM)
Cancel application context
Close stop channel to signal goroutines
Wait for all goroutines to complete
Stop internal collector
Sync logger to flush buffered logs
Troubleshooting
Check Pod Status
kubectl get pods -n kloudmate -l app=kubeagent
kubectl describe pod -n kloudmate -l app=kubeagent
View Logs
kubectl logs -n kloudmate -l app=kubeagent --tail=100 -f
Verify Configuration
# Check ConfigMap
kubectl get configmap -n kloudmate kmagent-config -o yaml
# Check Secret
kubectl get secret -n kloudmate kloudmate-secret -o yaml
# Verify environment variables
kubectl exec -n kloudmate deployment/kubeagent -- env | grep KM_
Common Issues
Failed to load in-cluster config
Ensure the pod is running inside a Kubernetes cluster and has a service account: kubectl get serviceaccount -n kloudmate kubeagent
KM_API_KEY is required
Verify the secret exists and is properly mounted: kubectl get secret -n kloudmate kloudmate-secret
kubectl describe pod -n kloudmate -l app=kubeagent | grep -A 5 Environment
CONFIGMAP_NAME is required
Ensure the ConfigMap exists and the CONFIGMAP_NAME environment variable is set: kubectl get configmap -n kloudmate kmagent-config
kubectl exec -n kloudmate deployment/kubeagent -- env | grep CONFIGMAP_NAME
Permission denied accessing ConfigMap
Verify RBAC permissions: kubectl get role -n kloudmate kubeagent -o yaml
kubectl get rolebinding -n kloudmate kubeagent -o yaml
kubectl auth can-i get configmaps --as=system:serviceaccount:kloudmate:kubeagent -n kloudmate
Configuration Updates
Update the collector configuration by modifying the ConfigMap:
kubectl edit configmap -n kloudmate kmagent-config
The agent checks for changes at the interval set by KM_CONFIG_CHECK_INTERVAL and reloads the configuration. Some updates may still require a pod restart depending on the implementation; check the agent logs for reload behavior.
Health Checks
Add liveness and readiness probes to your deployment:
livenessProbe:
  httpGet:
    path: /health
    port: 13133
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 13133
  initialDelaySeconds: 5
  periodSeconds: 5
The health check port (13133) is the default OpenTelemetry Collector health extension port. Enable the health_check extension in your collector configuration.
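If the extension is not already enabled, a minimal collector-side fragment looks like this (the endpoint shown is the extension's default; the served path is configurable in the extension's settings):

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
service:
  extensions: [health_check]
```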
See Also
CLI Overview: Overview of all CLI tools and common configuration
Kubernetes Deployment: Deploy kubeagent to Kubernetes clusters
Configuration: OpenTelemetry collector configuration reference
RBAC Setup: Configure Kubernetes RBAC permissions