The KloudMate Agent for Kubernetes deploys as both a DaemonSet (for node-level monitoring) and a Deployment (for cluster-level monitoring), with optional auto-instrumentation support via the OpenTelemetry Operator.

Prerequisites

Required

  • Kubernetes cluster version 1.19+
  • Helm 3.8+ installed
  • kubectl configured with cluster-admin permissions
  • API Key from KloudMate Settings

Optional

  • cert-manager (can be installed automatically as a dependency)
  • OpenTelemetry Operator (installed automatically with the Helm chart)
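
The version prerequisites above can be checked from a terminal before installing. The following is a minimal sketch, assuming a sort that supports the -V (version sort) flag, as GNU coreutils does. The helper name version_ge is invented for this example, and the Helm check only runs when helm is found on the PATH.

```shell
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Relies on "sort -V" (version sort); this is an assumption about the local sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the local Helm client against the documented minimum (3.8).
if command -v helm >/dev/null 2>&1; then
  helm_ver=$(helm version --template '{{.Version}}' | sed 's/^v//')
  if version_ge "$helm_ver" 3.8.0; then
    echo "Helm $helm_ver meets the 3.8+ requirement"
  else
    echo "Helm $helm_ver is older than 3.8" >&2
  fi
fi
```

The same helper can be reused for the Kubernetes server version once it has been extracted from kubectl version output.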

Quick Installation

1. Install CRD

The Instrumentation CRD must be installed before the Helm chart:
kubectl apply -f https://raw.githubusercontent.com/kloudmate/km-agent/refs/heads/develop/deployment/helm/km-kube-agent/crds/crd-otel-instrumentation.yaml

2. Add Helm Repository

helm repo add kloudmate https://kloudmate.github.io/km-agent
helm repo update

3. Install the Agent

helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent --create-namespace \
  --set API_KEY="<YOUR_API_KEY>" \
  --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
  --set clusterName="my-cluster"
Replace <YOUR_API_KEY> with your actual API key and my-cluster with your cluster name.

Detailed Installation

Step 1: Install CRD (Required)

The Instrumentation Custom Resource Definition (CRD) enables auto-instrumentation capabilities:
kubectl apply -f https://raw.githubusercontent.com/kloudmate/km-agent/refs/heads/develop/deployment/helm/km-kube-agent/crds/crd-otel-instrumentation.yaml
Verify CRD installation:
kubectl get crd instrumentations.opentelemetry.io

Step 2: Add Helm Repository

helm repo add kloudmate https://kloudmate.github.io/km-agent
helm repo update
Verify the repository:
helm search repo kloudmate
Expected output:
NAME                         CHART VERSION  APP VERSION  DESCRIPTION
kloudmate/km-kube-agent      1.2.0          1.2.0        A Helm chart for Kloumate's Kubernetes agent

Step 3: Install with Helm

Basic Installation

helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent \
  --create-namespace \
  --set API_KEY="your-api-key" \
  --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
  --set clusterName="production-cluster"
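
Passing the API key with --set leaves it in shell history. As an alternative, the same three settings can be written to a values file and passed with -f. This is a sketch under stated assumptions: KM_API_KEY and KM_CLUSTER_NAME are hypothetical environment variable names chosen for this example, and write_km_values is not part of the chart or the agent.

```shell
# write_km_values FILE: write the required chart values to FILE.
# KM_API_KEY and KM_CLUSTER_NAME are hypothetical environment variables.
# The function body runs in a subshell, so a failure does not exit your shell.
write_km_values() (
  out=$1
  # Stop here when the API key is missing.
  : "${KM_API_KEY:?set KM_API_KEY to your KloudMate API key}"
  umask 077   # the file contains a secret, so keep it private to this user
  cat > "$out" <<EOF
API_KEY: "${KM_API_KEY}"
COLLECTOR_ENDPOINT: "https://otel.kloudmate.com:4318"
clusterName: "${KM_CLUSTER_NAME:-km-cluster}"
EOF
)

# Usage (requires helm and cluster access):
#   export KM_API_KEY=... KM_CLUSTER_NAME=production-cluster
#   write_km_values km-values.yaml
#   helm install kloudmate-release kloudmate/km-kube-agent \
#     --namespace km-agent --create-namespace -f km-values.yaml
```

The fallback cluster name km-cluster mirrors the default shown in the values.yaml excerpt under Values Reference.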

With Features Enabled

Enable APM (Application Performance Monitoring) and logs:
helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent \
  --create-namespace \
  --set API_KEY="your-api-key" \
  --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
  --set clusterName="production-cluster" \
  --set featuresEnabled.apm=true \
  --set featuresEnabled.logs=true

With Namespace Monitoring

Monitor specific namespaces:
helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent \
  --create-namespace \
  --set API_KEY="your-api-key" \
  --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
  --set clusterName="production-cluster" \
  --set "monitoredNamespaces={bookinfo,mongodb,cassandra}" \
  --set featuresEnabled.apm=true \
  --set featuresEnabled.logs=true
For monitoredNamespaces, list the namespace names inside braces, separated by commas with no spaces, and quote the whole argument so the shell does not expand the braces.
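
When the namespace list comes from a script, the argument can be assembled rather than typed by hand. The helper below is an illustrative sketch, not part of the chart: build_ns_arg is a made-up name, and the character check is a simplified approximation of Kubernetes namespace naming rules (lowercase letters, digits, and hyphens).

```shell
# build_ns_arg NAME...: print a monitoredNamespaces value for helm --set.
# Runs in a subshell so the IFS change and "exit" stay local to the function.
build_ns_arg() (
  for ns in "$@"; do
    case $ns in
      ''|*[!a-z0-9-]*)
        echo "invalid namespace name: '$ns'" >&2
        exit 1
        ;;
    esac
  done
  IFS=,
  # "$*" joins the arguments with the first character of IFS (a comma).
  printf 'monitoredNamespaces={%s}\n' "$*"
)

# Usage:
#   helm install ... --set "$(build_ns_arg bookinfo mongodb cassandra)"
```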

Configuration Options

Required Parameters

Parameter            Description                          Example
API_KEY              KloudMate authentication key         km_xxx...
COLLECTOR_ENDPOINT   OTLP endpoint for data export        https://otel.kloudmate.com:4318
clusterName          Unique identifier for your cluster   production-k8s

Feature Flags

Parameter                 Description                                 Default
featuresEnabled.metrics   Enable metrics collection                   true
featuresEnabled.traces    Enable trace collection                     true
featuresEnabled.apm       Enable application performance monitoring   false
featuresEnabled.logs      Enable log collection                       false
Example:
--set featuresEnabled.apm=true \
--set featuresEnabled.logs=true \
--set featuresEnabled.metrics=true \
--set featuresEnabled.traces=true

Platform Configuration

Parameter                  Description                    Default
KM_UPDATE_ENDPOINT         Remote config check endpoint   https://api.kloudmate.com/agents/config-check
KM_CONFIG_CHECK_INTERVAL   Config update check interval   30s
KM_CFG_UPDATER_RPC_ADDR    Config updater RPC port        5501
KM_XLOG_PATHS              Additional log paths           []

Image Configuration

Parameter          Description              Default
image.repository   Agent image repository   ghcr.io/kloudmate/km-kube-agent
image.tag          Image tag                latest
image.pullPolicy   Image pull policy        Always

Advanced Configuration

Node Taints and Tolerations

If your cluster uses node taints, configure tolerations:
helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent --create-namespace \
  --set API_KEY="your-api-key" \
  --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
  --set clusterName="my-cluster" \
  --set "tolerations[0].key=env" \
  --set "tolerations[0].operator=Equal" \
  --set "tolerations[0].value=production" \
  --set "tolerations[0].effect=NoSchedule" \
  --set "tolerations[1].key=workload" \
  --set "tolerations[1].operator=Equal" \
  --set "tolerations[1].value=monitoring" \
  --set "tolerations[1].effect=NoSchedule"
Default tolerations (automatically configured):
tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
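
The custom tolerations passed with --set earlier in this section can also be kept in a values file, which avoids quoting the square brackets in the shell. This is a sketch of the equivalent file; the file name tolerations-values.yaml is arbitrary, and whether custom tolerations are merged with the defaults or replace them depends on the chart templates, which this page does not describe.

```shell
# Write the two example tolerations to a values file.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > tolerations-values.yaml <<'EOF'
tolerations:
  - key: "env"
    operator: "Equal"
    value: "production"
    effect: "NoSchedule"
  - key: "workload"
    operator: "Equal"
    value: "monitoring"
    effect: "NoSchedule"
EOF

# Usage (requires helm and cluster access), together with the required settings:
#   helm install kloudmate-release kloudmate/km-kube-agent \
#     --namespace km-agent --create-namespace \
#     --set API_KEY="your-api-key" \
#     --set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
#     --set clusterName="my-cluster" \
#     -f tolerations-values.yaml
```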

Node Affinity

Control pod scheduling preferences:
values.yaml
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    preference:
      matchExpressions:
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
  - weight: 80
    preference:
      matchExpressions:
      - key: node.kubernetes.io/instance-type
        operator: NotIn
        values:
        - spot
Install with custom values file:
helm install kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent --create-namespace \
  -f values.yaml

Dependency Management

The chart includes optional dependencies:
Chart.yaml
dependencies:
  - name: cert-manager
    version: v1.18.2
    repository: https://charts.jetstack.io
    condition: cert-manager.enabled
  - name: opentelemetry-operator
    version: 0.90.0
    repository: https://open-telemetry.github.io/opentelemetry-helm-charts
    condition: opentelemetry-operator.enabled
Enable cert-manager (if not already installed):
helm install kloudmate-release kloudmate/km-kube-agent \
  --set cert-manager.enabled=true \
  --set cert-manager.crds.enabled=true \
  ...
Note: The OpenTelemetry Operator is enabled by default (opentelemetry-operator.enabled=true).

GKE Private Cluster Configuration

For private GKE clusters, you must configure firewall rules to allow webhook traffic.
The OpenTelemetry Operator uses admission webhooks on port 9443/tcp. Configure your firewall:

Option 1: Add Firewall Rule for Port 9443

gcloud compute firewall-rules create allow-webhook-traffic \
  --network=<YOUR_VPC_NETWORK> \
  --allow=tcp:9443 \
  --source-ranges=<MASTER_CIDR> \
  --target-tags=<NODE_TAGS> \
  --description="Allow webhook traffic for OpenTelemetry Operator"

Option 2: Update Existing Health Check Rule

Modify the existing rule that allows 80/tcp, 443/tcp, and 10254/tcp so that it also allows 9443/tcp.

Verification

Check Deployed Resources

# Check namespace
kubectl get ns km-agent

# Check DaemonSet (runs on each node)
kubectl get daemonset -n km-agent

# Check Deployment (cluster-level agent)
kubectl get deployment -n km-agent

# Check pods
kubectl get pods -n km-agent

# Check services
kubectl get svc -n km-agent
Expected output of kubectl get pods:
NAME                              READY   STATUS    RESTARTS   AGE
km-agent-xxxxx                    1/1     Running   0          2m
km-agent-cluster-xxxxxxxxx-yyyyy  1/1     Running   0          2m
km-fleet-manager-xxxxxxxxx-zzzzz  1/1     Running   0          2m
km-operator-xxxxxxxxx-aaaaa       1/1     Running   0          2m
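
On larger clusters the pod list can be long, so it may help to filter it down to pods that need attention. The filter below is a minimal sketch that reads the default kubectl get pods table from standard input; unhealthy_pods is a name made up for this example. It also reports pods in the Completed state, which is expected for finished Jobs.

```shell
# unhealthy_pods: read "kubectl get pods" table output on stdin and print the
# names of pods that are not Running or whose READY count is incomplete.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")          # READY column looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -n km-agent | unhealthy_pods
```

An empty result means every listed pod is Running with all containers ready.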

Check Logs

# DaemonSet logs
kubectl logs -n km-agent daemonset/km-agent -f

# Deployment logs
kubectl logs -n km-agent deployment/km-agent-cluster -f

# Operator logs
kubectl logs -n km-agent deployment/km-operator -f

Verify Instrumentation CRD

kubectl get instrumentation -n km-agent

Upgrading

Update Helm Repository

helm repo update kloudmate

Upgrade Release

helm upgrade kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent \
  --reuse-values

Upgrade with New Values

helm upgrade kloudmate-release kloudmate/km-kube-agent \
  --namespace km-agent \
  --set featuresEnabled.apm=true \
  --reuse-values

View Upgrade History

helm history kloudmate-release -n km-agent

Rollback

# Rollback to previous version
helm rollback kloudmate-release -n km-agent

# Rollback to specific revision
helm rollback kloudmate-release 2 -n km-agent

Configuration Management

Do not manually edit ConfigMaps for the DaemonSet or Deployment agents. Configurations may be overwritten by updates from KloudMate APIs.
Use the KloudMate Agent Config Editor (web-based YAML editor) to manage configurations:
  1. Log in to the KloudMate Dashboard
  2. Navigate to Agents > Your Cluster
  3. Use the Config Editor to modify the OpenTelemetry configuration
  4. Changes are synchronized to the agents automatically
Agents check for configuration updates at the interval set by KM_CONFIG_CHECK_INTERVAL (default: 30s).

Uninstallation

Remove Helm Release

helm uninstall kloudmate-release -n km-agent

Delete Namespace

kubectl delete namespace km-agent

Remove CRD (Optional)

kubectl delete crd instrumentations.opentelemetry.io
Deleting the CRD will remove all Instrumentation resources across the cluster.

Troubleshooting

Pods Not Starting

Check pod status:
kubectl get pods -n km-agent
kubectl describe pod <pod-name> -n km-agent
Common causes:
  • Insufficient resources (CPU/memory)
  • Image pull errors (check image name and credentials)
  • Node taints preventing scheduling
  • Missing required environment variables

Webhook Errors (GKE Private Clusters)

Error: failed calling webhook "...": Post "https://...:443/...": context deadline exceeded
Solution: Add a firewall rule for port 9443 (see GKE Private Cluster Configuration).

DaemonSet Pods Not Scheduled

Check node taints:
kubectl describe nodes | grep -A 5 Taints
Solution: Add matching tolerations in Helm values.
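
Taints appear in kubectl describe output as key=value:Effect, or key:Effect when the taint has no value. The helper below is a sketch that turns one such string into a toleration entry for a values file; taint_to_toleration is a hypothetical name, and the output is meant to be pasted under a tolerations: key.

```shell
# taint_to_toleration TAINT: print a YAML toleration entry for a taint given
# as key=value:Effect or key:Effect. Runs in a subshell to keep variables local.
taint_to_toleration() (
  taint=$1
  effect=${taint##*:}    # text after the last colon
  kv=${taint%:*}         # text before the last colon
  key=${kv%%=*}          # text before the first equals sign
  printf '  - key: "%s"\n' "$key"
  if [ "$kv" = "$key" ]; then
    # No value on the taint: tolerate the key regardless of value.
    printf '    operator: "Exists"\n'
  else
    printf '    operator: "Equal"\n    value: "%s"\n' "${kv#*=}"
  fi
  printf '    effect: "%s"\n' "$effect"
)

# Usage:
#   { echo "tolerations:"; taint_to_toleration env=production:NoSchedule; } > values.yaml
```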

Instrumentation Not Working

Check operator logs:
kubectl logs -n km-agent deployment/km-operator -f
Verify CRD installation:
kubectl get crd instrumentations.opentelemetry.io
Check that an Instrumentation resource exists in the application namespace:
kubectl get instrumentation -n <your-namespace>

Configuration Not Updating

Check fleet manager logs:
kubectl logs -n km-agent deployment/km-fleet-manager -f
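Verify connectivity from inside an agent pod (the single quotes make the variable expand in the container, not in your local shell; this assumes the image provides sh and wget):
kubectl exec -n km-agent <pod-name> -- sh -c 'wget -O- "$KM_UPDATE_ENDPOINT"'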
Check interval setting:
kubectl get deployment km-agent-cluster -n km-agent -o yaml | grep -A 1 CONFIG_CHECK_INTERVAL

High Memory Usage

Check resource usage:
kubectl top pods -n km-agent
Set resource limits:
values.yaml
resources:
  limits:
    memory: 512Mi
    cpu: 500m
  requests:
    memory: 256Mi
    cpu: 100m
Apply with:
helm upgrade kloudmate-release kloudmate/km-kube-agent -f values.yaml -n km-agent

Values Reference

For a complete list of configurable values, see the values.yaml file. Key configuration sections:
values.yaml (excerpt)
# Platform configuration
API_KEY: ""  # REQUIRED
COLLECTOR_ENDPOINT: "https://otel.kloudmate.com:4318"
KM_UPDATE_ENDPOINT: "https://api.kloudmate.com/agents/config-check"
KM_CONFIG_CHECK_INTERVAL: "30s"

# Feature flags
featuresEnabled:
  apm: false
  logs: false
  metrics: true
  traces: true

# Cluster configuration
clusterName: "km-cluster"
monitoredNamespaces: []

# Image settings
image:
  repository: ghcr.io/kloudmate/km-kube-agent
  pullPolicy: Always
  tag: "latest"

# Dependencies
opentelemetry-operator:
  enabled: true
cert-manager:
  enabled: false

Next Steps

Auto-Instrumentation

Configure automatic application instrumentation

Namespace Monitoring

Monitor specific namespaces and workloads

Configure Agent

Customize OpenTelemetry receivers and processors

Verify Installation

Confirm data is flowing to KloudMate

Support

For Kubernetes installation issues: