The kubeagent is a Kubernetes-native OpenTelemetry collector agent designed to run as a Deployment or DaemonSet in Kubernetes clusters. It uses environment variables and ConfigMaps for configuration, with automatic in-cluster authentication.

Overview

Unlike kmagent, the kubeagent has no CLI commands or flags. It’s designed to run as a Kubernetes workload and reads all configuration from environment variables. Key features:
  • Automatic in-cluster Kubernetes authentication
  • ConfigMap-based collector configuration
  • Support for both Deployment and DaemonSet modes
  • Graceful signal handling (SIGINT, SIGTERM)
  • Structured JSON logging

Deployment Modes

The agent supports two deployment modes, each with its own configuration file:

Deployment

Single instance for cluster-wide metrics and logs. Uses /etc/kmagent/agent-deployment.yaml.

DaemonSet

One instance per node for host metrics and node-level telemetry. Uses /etc/kmagent/agent-daemonset.yaml.
From internal/k8sagent/agent.go:160-168:
func (c *K8sAgent) otelConfigPath() string {
	daemonsetURI := "/etc/kmagent/agent-daemonset.yaml"
	deploymentURI := "/etc/kmagent/agent-deployment.yaml"
	if c.Cfg.DeploymentMode == "DAEMONSET" {
		return daemonsetURI
	} else {
		return deploymentURI
	}
}
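Because the file path is derived from DEPLOYMENT_MODE, a single ConfigMap can serve both modes by carrying both keys. A minimal sketch (pipeline contents reduced to placeholder comments):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kmagent-config
  namespace: kloudmate
data:
  agent-deployment.yaml: |
    # cluster-wide pipeline configuration goes here
  agent-daemonset.yaml: |
    # per-node pipeline configuration goes here
```

Mounting this ConfigMap at /etc/kmagent makes both files available, and the agent picks the one matching its mode.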

Configuration

All configuration is provided via environment variables. The agent reads environment variables on startup and validates required fields.

Environment Variables

KM_API_KEY
string
required
API key for authenticating with KloudMate services. Required for remote configuration and telemetry export. Example:
- name: KM_API_KEY
  valueFrom:
    secretKeyRef:
      name: kloudmate-secret
      key: api-key
KM_COLLECTOR_ENDPOINT
string
required
OpenTelemetry exporter endpoint where telemetry data is sent. Must be a valid HTTP/HTTPS URL. Example:
- name: KM_COLLECTOR_ENDPOINT
  value: "https://otel.kloudmate.com:4318"
CONFIGMAP_NAME
string
required
Name of the ConfigMap containing the OpenTelemetry collector configuration. Example:
- name: CONFIGMAP_NAME
  value: "kmagent-config"
POD_NAMESPACE
string
required
Namespace where the agent pod is running. Typically injected using the downward API. Example:
- name: POD_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
DEPLOYMENT_MODE
string
default:"DEPLOYMENT"
Deployment mode for the agent. Valid values: DEPLOYMENT, DAEMONSET (case-insensitive). Example:
- name: DEPLOYMENT_MODE
  value: "DAEMONSET"
KM_CONFIG_CHECK_INTERVAL
string
Interval for checking configuration updates; the exact format is implementation-defined (the example below uses a plain number). Example:
- name: KM_CONFIG_CHECK_INTERVAL
  value: "60"

Configuration Loading

From internal/k8sagent/agent.go:125-142:
func NewK8sConfig() *K8sConfig {
	config := &K8sConfig{
		ConfigCheckInterval: os.Getenv("KM_CONFIG_CHECK_INTERVAL"),
		APIKey:              os.Getenv("KM_API_KEY"),
		CollectorEndpoint:   os.Getenv("KM_COLLECTOR_ENDPOINT"),
		ConfigMapName:       os.Getenv("CONFIGMAP_NAME"),
		DeploymentMode:      os.Getenv("DEPLOYMENT_MODE"),
		PodNamespace:        os.Getenv("POD_NAMESPACE"),
	}

	if strings.ToUpper(config.DeploymentMode) == "DAEMONSET" {
		config.DeploymentMode = "DAEMONSET"
	} else {
		config.DeploymentMode = "DEPLOYMENT"

	}
	return config
}
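The DEPLOYMENT_MODE normalization above can be exercised in isolation. A standalone sketch (the helper below reimplements just the mode logic from the snippet, it is not part of the agent's API):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeMode mirrors the DEPLOYMENT_MODE handling in NewK8sConfig:
// any case variant of "daemonset" maps to DAEMONSET; everything else,
// including an unset variable, falls back to DEPLOYMENT.
func normalizeMode(raw string) string {
	if strings.ToUpper(raw) == "DAEMONSET" {
		return "DAEMONSET"
	}
	return "DEPLOYMENT"
}

func main() {
	for _, v := range []string{"daemonset", "DaemonSet", "", "deployment"} {
		fmt.Printf("%q -> %s\n", v, normalizeMode(v))
	}
}
```

Note that the fallback means a typo such as DEPLOYMENT_MODE=daemon-set silently runs the agent in Deployment mode.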

Validation

The agent validates required configuration on startup: From internal/k8sagent/agent.go:144-158:
func (c *K8sConfig) Validate() error {
	if c.APIKey == "" {
		return fmt.Errorf("KM_API_KEY is required")
	}
	if c.CollectorEndpoint == "" {
		return fmt.Errorf("KM_COLLECTOR_ENDPOINT is required")
	}
	if c.ConfigMapName == "" {
		return fmt.Errorf("CONFIGMAP_NAME is required")
	}
	if c.PodNamespace == "" {
		return fmt.Errorf("POD_NAMESPACE is required")
	}
	return nil
}
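The same fail-fast pattern can be sketched generically: report the first missing required variable and exit before the collector starts. The helper names below are illustrative, not part of the agent:

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv lists the variables Validate checks, in the same order.
var requiredEnv = []string{
	"KM_API_KEY",
	"KM_COLLECTOR_ENDPOINT",
	"CONFIGMAP_NAME",
	"POD_NAMESPACE",
}

// firstMissing returns the name of the first unset required variable,
// or "" when all are present. Taking a lookup function keeps it testable.
func firstMissing(lookup func(string) string) string {
	for _, name := range requiredEnv {
		if lookup(name) == "" {
			return name
		}
	}
	return ""
}

func main() {
	if name := firstMissing(os.Getenv); name != "" {
		fmt.Fprintf(os.Stderr, "%s is required\n", name)
		os.Exit(1)
	}
	fmt.Println("configuration ok")
}
```

Because validation happens on startup, a misconfigured pod crash-loops immediately rather than running with a partial configuration.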

Kubernetes Authentication

The agent automatically uses in-cluster Kubernetes configuration: From internal/k8sagent/agent.go:60-69:
kubecfg, err := rest.InClusterConfig()
if err != nil {
	return nil, fmt.Errorf("failed to load in-cluster config: %w", err)
}
logger.Info("loaded in-cluster kubernetes config")

k8sClient, err := kubernetes.NewForConfig(kubecfg)
if err != nil {
	return nil, fmt.Errorf("failed to create kubernetes client: %w", err)
}
The agent must run inside a Kubernetes cluster with appropriate RBAC permissions to read ConfigMaps in its namespace.

Agent Initialization

The agent initializes with version information and logs startup details: From internal/k8sagent/agent.go:49-87:
func NewK8sAgent(info *AgentInfo) (*K8sAgent, error) {
	zapCfg := zap.NewProductionConfig()
	zapCfg.Level = zap.NewAtomicLevelAt(kmlogger.ParseLogLevel())
	zapLogger, err := zapCfg.Build()
	if err != nil {
		return nil, fmt.Errorf("failed to initialize logger: %w", err)
	}
	logger := zapLogger.Sugar()

	cfg := NewK8sConfig()

	kubecfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, fmt.Errorf("failed to load in-cluster config: %w", err)
	}
	logger.Info("loaded in-cluster kubernetes config")

	k8sClient, err := kubernetes.NewForConfig(kubecfg)
	if err != nil {
		return nil, fmt.Errorf("failed to create kubernetes client: %w", err)
	}

	agent := &K8sAgent{
		Cfg:       cfg,
		Logger:    logger,
		K8sClient: k8sClient,
		AgentInfo: *info,
		stopCh:    make(chan struct{}),
	}
	agent.AgentInfo.setEnvForAgentVersion()
	agent.AgentInfo.CollectorVersion = version.GetCollectorVersion()

	logger.Infow("kube agent initialized",
		"version", info.Version,
		"commit", info.CommitSHA,
		"deploymentMode", cfg.DeploymentMode,
	)
	return agent, nil
}

Signal Handling

The agent handles OS signals for graceful shutdown: From cmd/kubeagent/main.go:46-57:
// handleSignals sets up a signal handler to gracefully shut down the agent.
func handleSignals(cancelFunc context.CancelFunc, agent *k8sagent.K8sAgent) {
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)

	go func() {
		sig := <-sigChan
		agent.Logger.Warnf("Received signal %s, initiating shutdown...", sig)
		cancelFunc()
		agent.Stop()
	}()
}
Handled signals:
  • SIGINT (Ctrl+C)
  • SIGTERM (Kubernetes pod termination)

Main Loop

The agent’s main function is minimal, delegating to the agent implementation: From cmd/kubeagent/main.go:18-44:
func main() {
	// Set up the main context for the application
	appCtx, cancelAppCtx := context.WithCancel(context.Background())
	defer cancelAppCtx()

	agent, err := k8sagent.NewK8sAgent(&k8sagent.AgentInfo{Version: version, CommitSHA: commit})
	if err != nil {
		log.Fatal(err)
	}

	// Handle OS signals for graceful shutdown.
	handleSignals(cancelAppCtx, agent)

	if err = agent.StartAgent(appCtx); err != nil {
		agent.Logger.Errorf("agent could not be started with current config : %s", err.Error())
	}

	agent.AwaitShutdown()

	defer func() {
		// Ensure logger is synced before exit to flush buffered logs.
		if syncErr := agent.Logger.Sync(); syncErr != nil && syncErr.Error() != "sync /dev/stdout: invalid argument" {
			agent.Logger.Warnf("Failed to sync logger: %v\n", syncErr)
		}
	}()
}

Deployment Example

Deployment Mode

Deploy as a single instance for cluster-wide telemetry:
deployment.yaml
apiVersion: v1
kind: Secret
metadata:
  name: kloudmate-secret
  namespace: kloudmate
type: Opaque
stringData:
  api-key: "km_xxxxxxxxxxxxx"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kmagent-config
  namespace: kloudmate
data:
  agent-deployment.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    
    processors:
      batch:
        timeout: 10s
        send_batch_size: 1024
    
    exporters:
      otlphttp:
        endpoint: ${env:KM_COLLECTOR_ENDPOINT}
        headers:
          api-key: ${env:KM_API_KEY}
    
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubeagent
  namespace: kloudmate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kubeagent
  template:
    metadata:
      labels:
        app: kubeagent
    spec:
      serviceAccountName: kubeagent
      containers:
      - name: kubeagent
        image: kloudmate/kubeagent:latest
        env:
        - name: KM_API_KEY
          valueFrom:
            secretKeyRef:
              name: kloudmate-secret
              key: api-key
        - name: KM_COLLECTOR_ENDPOINT
          value: "https://otel.kloudmate.com:4318"
        - name: DEPLOYMENT_MODE
          value: "DEPLOYMENT"
        - name: CONFIGMAP_NAME
          value: "kmagent-config"
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KM_CONFIG_CHECK_INTERVAL
          value: "60"
        volumeMounts:
        - name: config
          mountPath: /etc/kmagent
      volumes:
      - name: config
        configMap:
          name: kmagent-config
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubeagent
  namespace: kloudmate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeagent
  namespace: kloudmate
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeagent
  namespace: kloudmate
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeagent
subjects:
- kind: ServiceAccount
  name: kubeagent
  namespace: kloudmate
Deploy:
kubectl apply -f deployment.yaml

DaemonSet Mode

Deploy as a DaemonSet for per-node telemetry:
daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kubeagent
  namespace: kloudmate
spec:
  selector:
    matchLabels:
      app: kubeagent
  template:
    metadata:
      labels:
        app: kubeagent
    spec:
      serviceAccountName: kubeagent
      hostNetwork: true
      hostPID: true
      containers:
      - name: kubeagent
        image: kloudmate/kubeagent:latest
        env:
        - name: KM_API_KEY
          valueFrom:
            secretKeyRef:
              name: kloudmate-secret
              key: api-key
        - name: KM_COLLECTOR_ENDPOINT
          value: "https://otel.kloudmate.com:4318"
        - name: DEPLOYMENT_MODE
          value: "DAEMONSET"
        - name: CONFIGMAP_NAME
          value: "kmagent-config"
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: config
          mountPath: /etc/kmagent
        - name: hostfs
          mountPath: /hostfs
          readOnly: true
        securityContext:
          privileged: true
      volumes:
      - name: config
        configMap:
          name: kmagent-config
      - name: hostfs
        hostPath:
          path: /

RBAC Permissions

The agent requires minimal RBAC permissions:
rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubeagent
  namespace: kloudmate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeagent
  namespace: kloudmate
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch", "list"]
  resourceNames: ["kmagent-config"]  # Restrict to specific ConfigMap
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeagent
  namespace: kloudmate
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeagent
subjects:
- kind: ServiceAccount
  name: kubeagent
  namespace: kloudmate
For cluster-wide monitoring (e.g., Kubernetes metrics receiver), you may need ClusterRole permissions to watch nodes, pods, and other cluster resources.
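A hypothetical ClusterRole for that case might look like the following; trim the resource list to what your collector configuration actually reads:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeagent-cluster
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "namespaces", "events"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeagent-cluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeagent-cluster
subjects:
- kind: ServiceAccount
  name: kubeagent
  namespace: kloudmate
```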

Logging

The agent uses structured JSON logging with configurable log levels:
zapCfg := zap.NewProductionConfig()
zapCfg.Level = zap.NewAtomicLevelAt(kmlogger.ParseLogLevel())
Log output:
  • Format: JSON with structured fields
  • Level: Configurable via environment variable
  • Output: stdout (captured by Kubernetes)
Example log entry:
{
  "level": "info",
  "ts": "2024-03-15T10:30:45.123Z",
  "msg": "kube agent initialized",
  "version": "0.1.0",
  "commit": "abc123",
  "deploymentMode": "DEPLOYMENT"
}

Version Information

The agent tracks version information and exposes it for telemetry: From cmd/kubeagent/main.go:13-16:
var (
	version = "0.1.0"
	commit  = "none"
)
From internal/k8sagent/agent.go:170-173:
// setEnvForAgentVersion sets agent version on env for otel processor
func (r *AgentInfo) setEnvForAgentVersion() {
	os.Setenv("KM_AGENT_VERSION", r.Version)
}
The KM_AGENT_VERSION environment variable can be referenced in OpenTelemetry processor configuration to tag telemetry with agent version.
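For example, a resource processor could stamp every signal with the agent version (the attribute key km.agent.version here is an illustrative choice, not mandated by the agent):

```yaml
processors:
  resource:
    attributes:
    - key: km.agent.version
      value: ${env:KM_AGENT_VERSION}
      action: upsert
```

Add the processor to each pipeline's processors list for the version tag to appear on exported telemetry.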

Graceful Shutdown

The agent implements graceful shutdown: From internal/k8sagent/agent.go:108-115:
// Stop stops the underlying collector.
func (a *K8sAgent) Stop() {
	a.Logger.Info("stopping collector agent")
	close(a.stopCh)
	a.wg.Wait()
	a.stopInternalCollector()
	a.Logger.Info("collector agent stopped")
}
Shutdown sequence:
  1. Receive signal (SIGINT or SIGTERM)
  2. Cancel application context
  3. Close stop channel to signal goroutines
  4. Wait for all goroutines to complete
  5. Stop internal collector
  6. Sync logger to flush buffered logs
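Kubernetes delivers SIGTERM and then waits terminationGracePeriodSeconds (30 seconds by default) before sending SIGKILL. If your pipelines buffer large batches, consider raising it in the pod spec so the shutdown sequence can finish flushing:

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
```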

Troubleshooting

Check Pod Status

kubectl get pods -n kloudmate -l app=kubeagent
kubectl describe pod -n kloudmate -l app=kubeagent

View Logs

kubectl logs -n kloudmate -l app=kubeagent --tail=100 -f

Verify Configuration

# Check ConfigMap
kubectl get configmap -n kloudmate kmagent-config -o yaml

# Check Secret
kubectl get secret -n kloudmate kloudmate-secret -o yaml

# Verify environment variables
kubectl exec -n kloudmate deployment/kubeagent -- env | grep KM_

Common Issues

Ensure the pod is running inside a Kubernetes cluster and has a service account:
kubectl get serviceaccount -n kloudmate kubeagent
Verify the secret exists and is properly mounted:
kubectl get secret -n kloudmate kloudmate-secret
kubectl describe pod -n kloudmate -l app=kubeagent | grep -A 5 Environment
Ensure the ConfigMap exists and the CONFIGMAP_NAME environment variable is set:
kubectl get configmap -n kloudmate kmagent-config
kubectl exec -n kloudmate deployment/kubeagent -- env | grep CONFIGMAP_NAME
Verify RBAC permissions:
kubectl get role -n kloudmate kubeagent -o yaml
kubectl get rolebinding -n kloudmate kubeagent -o yaml
kubectl auth can-i get configmaps --as=system:serviceaccount:kloudmate:kubeagent -n kloudmate

Configuration Updates

Update the collector configuration by modifying the ConfigMap:
kubectl edit configmap -n kloudmate kmagent-config
The agent checks for configuration changes at the interval set by KM_CONFIG_CHECK_INTERVAL and reloads when it detects them. Depending on the implementation, some updates may still require a pod restart; check the agent logs for reload behavior.

Health Checks

Add liveness and readiness probes to your deployment:
livenessProbe:
  httpGet:
    path: /health
    port: 13133
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 13133
  initialDelaySeconds: 5
  periodSeconds: 5
The health check port (13133) is the default OpenTelemetry Collector health extension port. Enable the health_check extension in your collector configuration.
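Enabling the extension uses standard OpenTelemetry Collector syntax; add it to the collector configuration in the ConfigMap:

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
  # pipelines: ...existing pipelines unchanged...
```

Without the extension enabled, the probes above will fail and Kubernetes will restart the pod.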

See Also

CLI Overview

Overview of all CLI tools and common configuration

Kubernetes Deployment

Deploy kubeagent to Kubernetes clusters

Configuration

OpenTelemetry collector configuration reference

RBAC Setup

Configure Kubernetes RBAC permissions
