The KloudMate Agent for Kubernetes deploys as both a DaemonSet (for node-level monitoring) and a Deployment (for cluster-level monitoring), with optional auto-instrumentation support via the OpenTelemetry Operator.
Prerequisites
Required
Kubernetes cluster version 1.19+
Helm 3.8+ installed
kubectl configured with cluster-admin permissions
API Key from KloudMate Settings
Optional
cert-manager (can be installed automatically as a dependency)
OpenTelemetry Operator (installed automatically with the Helm chart)
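The version requirements above can be checked before installing. The following is a minimal sketch, not part of the KloudMate tooling: `version_ge` is a hypothetical helper, and it assumes a `sort` that supports version sorting (`-V`), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical preflight helper (not shipped with the chart).
# version_ge A B succeeds when version A is greater than or equal to version B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the local Helm client against the 3.8 minimum listed above.
if command -v helm >/dev/null 2>&1; then
  helm_version="$(helm version --template '{{.Version}}' | sed 's/^v//')"
  if version_ge "$helm_version" "3.8.0"; then
    echo "helm $helm_version: OK"
  else
    echo "helm $helm_version: older than 3.8.0" >&2
  fi
else
  echo "helm not found on PATH" >&2
fi
```

The same helper could be applied to the cluster version reported by `kubectl version`, compared against 1.19.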
Quick Installation
Install CRD
The Instrumentation CRD must be installed before the Helm chart:
kubectl apply -f https://raw.githubusercontent.com/kloudmate/km-agent/refs/heads/develop/deployment/helm/km-kube-agent/crds/crd-otel-instrumentation.yaml
Add Helm Repository
helm repo add kloudmate https://kloudmate.github.io/km-agent
helm repo update
Install the Agent
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent --create-namespace \
--set API_KEY="<YOUR_API_KEY>" \
--set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
--set clusterName="my-cluster"
Replace <YOUR_API_KEY> with your actual API key and my-cluster with your cluster name.
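A common mistake is running the install with the placeholder still in place. The sketch below shows one way to guard against that. `require_value`, `KM_API_KEY` and `KM_CLUSTER_NAME` are illustrative names chosen for this example, not settings defined by the chart.

```shell
#!/bin/sh
# Hypothetical guard: fail early if a required value is empty or is still a
# <PLACEHOLDER> copied from the documentation.
require_value() {
  case "$2" in
    ""|"<"*">")
      echo "error: $1 is not set" >&2
      return 1
      ;;
  esac
}

# Illustrative environment variables (assumptions, not chart settings).
KM_API_KEY="${KM_API_KEY:-}"
KM_CLUSTER_NAME="${KM_CLUSTER_NAME:-my-cluster}"

if require_value API_KEY "$KM_API_KEY" && require_value clusterName "$KM_CLUSTER_NAME"; then
  echo "required values are set; continue with helm install"
fi
```

Once the check passes, the variables can be passed to the install command as `--set API_KEY="$KM_API_KEY"` and `--set clusterName="$KM_CLUSTER_NAME"`, which also keeps the key out of the script itself.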
Detailed Installation
Step 1: Install CRD (Required)
The Instrumentation Custom Resource Definition (CRD) enables auto-instrumentation capabilities:
kubectl apply -f https://raw.githubusercontent.com/kloudmate/km-agent/refs/heads/develop/deployment/helm/km-kube-agent/crds/crd-otel-instrumentation.yaml
Verify CRD installation:
kubectl get crd instrumentations.opentelemetry.io
Step 2: Add Helm Repository
helm repo add kloudmate https://kloudmate.github.io/km-agent
helm repo update
Verify the repository:
helm search repo kloudmate
Expected output:
NAME CHART VERSION APP VERSION DESCRIPTION
kloudmate/km-kube-agent 1.2.0 1.2.0 A Helm chart for Kloumate's Kubernetes agent
Step 3: Install with Helm
Basic Installation
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent \
--create-namespace \
--set API_KEY="your-api-key" \
--set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
--set clusterName="production-cluster"
With Features Enabled
Enable APM (Application Performance Monitoring) and logs:
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent \
--create-namespace \
--set API_KEY="your-api-key" \
--set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
--set clusterName="production-cluster" \
--set featuresEnabled.apm=true \
--set featuresEnabled.logs=true
With Namespace Monitoring
Monitor specific namespaces:
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent \
--create-namespace \
--set API_KEY="your-api-key" \
--set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
--set clusterName="production-cluster" \
--set "monitoredNamespaces={bookinfo,mongodb,cassandra}" \
--set featuresEnabled.apm=true \
--set featuresEnabled.logs=true
For monitoredNamespaces, provide comma-separated namespace names without spaces.
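When the namespace list comes from a script or a variable, the brace syntax can be built programmatically. This is a small sketch; `namespaces_set_arg` is a hypothetical helper name, not part of Helm or the chart.

```shell
#!/bin/sh
# Hypothetical helper: join namespace names into the brace list that
# `helm --set` expects, with commas and no spaces.
namespaces_set_arg() {
  joined=""
  for ns in "$@"; do
    # Prefix a comma only when something has already been added.
    joined="${joined:+$joined,}$ns"
  done
  printf 'monitoredNamespaces={%s}\n' "$joined"
}

namespaces_set_arg bookinfo mongodb cassandra
# prints: monitoredNamespaces={bookinfo,mongodb,cassandra}
```

The result can be passed directly to Helm as `--set "$(namespaces_set_arg bookinfo mongodb cassandra)"`. The double quotes matter, because an unquoted brace list is subject to brace expansion in bash and zsh.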
Configuration Options
Required Parameters
| Parameter | Description | Example |
| --- | --- | --- |
| API_KEY | KloudMate authentication key | km_xxx... |
| COLLECTOR_ENDPOINT | OTLP endpoint for data export | https://otel.kloudmate.com:4318 |
| clusterName | Unique identifier for your cluster | production-k8s |
Feature Flags
| Parameter | Description | Default |
| --- | --- | --- |
| featuresEnabled.metrics | Enable metrics collection | true |
| featuresEnabled.traces | Enable trace collection | true |
| featuresEnabled.apm | Enable application performance monitoring | false |
| featuresEnabled.logs | Enable log collection | false |
Example:
--set featuresEnabled.apm=true \
--set featuresEnabled.logs=true \
--set featuresEnabled.metrics=true \
--set featuresEnabled.traces=true
| Parameter | Description | Default |
| --- | --- | --- |
| KM_UPDATE_ENDPOINT | Remote config check endpoint | https://api.kloudmate.com/agents/config-check |
| KM_CONFIG_CHECK_INTERVAL | Config update check interval | 30s |
| KM_CFG_UPDATER_RPC_ADDR | Config updater RPC port | 5501 |
| KM_XLOG_PATHS | Additional log paths | [] |
Image Configuration
| Parameter | Description | Default |
| --- | --- | --- |
| image.repository | Agent image repository | ghcr.io/kloudmate/km-kube-agent |
| image.tag | Image tag | latest |
| image.pullPolicy | Image pull policy | Always |
Advanced Configuration
Node Taints and Tolerations
If your cluster uses node taints, configure tolerations:
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent --create-namespace \
--set API_KEY="your-api-key" \
--set COLLECTOR_ENDPOINT="https://otel.kloudmate.com:4318" \
--set clusterName="my-cluster" \
--set "tolerations[0].key=env" \
--set "tolerations[0].operator=Equal" \
--set "tolerations[0].value=production" \
--set "tolerations[0].effect=NoSchedule" \
--set "tolerations[1].key=workload" \
--set "tolerations[1].operator=Equal" \
--set "tolerations[1].value=monitoring" \
--set "tolerations[1].effect=NoSchedule"
Default tolerations (automatically configured):
tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
Node Affinity
Control pod scheduling preferences:
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
              - linux
    - weight: 80
      preference:
        matchExpressions:
          - key: node.kubernetes.io/instance-type
            operator: NotIn
            values:
              - spot
Install with custom values file:
helm install kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent --create-namespace \
-f values.yaml
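For reference, a values file equivalent to the `--set` flags used earlier in this guide might look like the sketch below. The key paths (`API_KEY`, `COLLECTOR_ENDPOINT`, `clusterName`, `featuresEnabled`, `monitoredNamespaces`, `tolerations`) are taken from those flags; the values themselves are placeholders to replace.

```shell
#!/bin/sh
# Write a values.yaml that mirrors the --set flags shown in this guide.
# All values below are placeholders; substitute your own before installing.
cat > values.yaml <<'EOF'
API_KEY: "your-api-key"
COLLECTOR_ENDPOINT: "https://otel.kloudmate.com:4318"
clusterName: "production-cluster"
featuresEnabled:
  apm: true
  logs: true
monitoredNamespaces:
  - bookinfo
  - mongodb
tolerations:
  - key: "env"
    operator: "Equal"
    value: "production"
    effect: "NoSchedule"
EOF

echo "wrote values.yaml"
```

Keeping settings in a file rather than in flags makes the install repeatable and lets the same file be reused for `helm upgrade`. If the file contains the API key, treat it as a secret.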
Dependency Management
The chart includes optional dependencies:
dependencies:
  - name: cert-manager
    version: v1.18.2
    repository: https://charts.jetstack.io
    condition: cert-manager.enabled
  - name: opentelemetry-operator
    version: 0.90.0
    repository: https://open-telemetry.github.io/opentelemetry-helm-charts
    condition: opentelemetry-operator.enabled
Enable cert-manager (if not already installed):
helm install kloudmate-release kloudmate/km-kube-agent \
--set cert-manager.enabled=true \
--set cert-manager.crds.enabled=true \
...
Note: The OpenTelemetry Operator is enabled by default (opentelemetry-operator.enabled=true).
GKE Private Cluster Configuration
For private GKE clusters, you must configure firewall rules to allow webhook traffic.
The OpenTelemetry Operator uses admission webhooks on port 9443/tcp. Configure your firewall:
Option 1: Add Firewall Rule for Port 9443
gcloud compute firewall-rules create allow-webhook-traffic \
--network="<YOUR_VPC_NETWORK>" \
--allow=tcp:9443 \
--source-ranges="<MASTER_CIDR>" \
--target-tags="<NODE_TAGS>" \
--description="Allow webhook traffic for OpenTelemetry Operator"
Option 2: Update Existing Health Check Rule
Modify the existing rule that allows 80/tcp, 443/tcp, and 10254/tcp to also include 9443/tcp.
Verification
Check Deployed Resources
# Check namespace
kubectl get ns km-agent
# Check DaemonSet (runs on each node)
kubectl get daemonset -n km-agent
# Check Deployment (cluster-level agent)
kubectl get deployment -n km-agent
# Check pods
kubectl get pods -n km-agent
# Check services
kubectl get svc -n km-agent
Expected output:
NAME READY STATUS RESTARTS AGE
km-agent-xxxxx 1/1 Running 0 2m
km-agent-cluster-xxxxxxxxx-yyyyy 1/1 Running 0 2m
km-fleet-manager-xxxxxxxxx-zzzzz 1/1 Running 0 2m
km-operator-xxxxxxxxx-aaaaa 1/1 Running 0 2m
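Instead of polling `kubectl get pods` by hand, a script can block until each workload has rolled out. This is a sketch under assumptions: the workload names are the ones used by the log commands in this guide, and `wait_for_agent`, `KM_NAMESPACE` and `KM_WORKLOADS` are names invented for this example.

```shell
#!/bin/sh
# Workload names are taken from the log commands in this guide; adjust them
# if your release renders different names.
KM_NAMESPACE="${KM_NAMESPACE:-km-agent}"
KM_WORKLOADS="daemonset/km-agent deployment/km-agent-cluster deployment/km-operator deployment/km-fleet-manager"

# Wait for each workload in turn; stop at the first one that does not
# become ready within the timeout.
wait_for_agent() {
  for workload in $KM_WORKLOADS; do
    kubectl rollout status "$workload" -n "$KM_NAMESPACE" --timeout=120s || return 1
  done
}

# Usage (requires kubectl access to the cluster):
#   wait_for_agent && echo "all KloudMate workloads are ready"
```

A non-zero exit status indicates that at least one workload did not roll out in time, which makes the function suitable as a gate in CI pipelines.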
Check Logs
# DaemonSet logs
kubectl logs -n km-agent daemonset/km-agent -f
# Deployment logs
kubectl logs -n km-agent deployment/km-agent-cluster -f
# Operator logs
kubectl logs -n km-agent deployment/km-operator -f
Verify Instrumentation CRD
kubectl get instrumentation -n km-agent
Upgrading
Update Helm Repository
helm repo update kloudmate
Upgrade Release
helm upgrade kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent \
--reuse-values
Upgrade with New Values
helm upgrade kloudmate-release kloudmate/km-kube-agent \
--namespace km-agent \
--set featuresEnabled.apm=true \
--reuse-values
View Upgrade History
helm history kloudmate-release -n km-agent
Rollback
# Rollback to previous version
helm rollback kloudmate-release -n km-agent
# Rollback to specific revision
helm rollback kloudmate-release 2 -n km-agent
Configuration Management
Do not manually edit ConfigMaps for the DaemonSet or Deployment agents. Configurations may be overwritten by updates from KloudMate APIs.
Use the KloudMate Agent Config Editor (web-based YAML editor) to manage configurations:
Log in to KloudMate Dashboard
Navigate to Agents → Your Cluster
Use the Config Editor to modify OpenTelemetry configuration
Changes are automatically synchronized to agents
Configuration updates are checked every KM_CONFIG_CHECK_INTERVAL (default: 30s).
Uninstallation
Remove Helm Release
helm uninstall kloudmate-release -n km-agent
Delete Namespace
kubectl delete namespace km-agent
Remove CRD (Optional)
kubectl delete crd instrumentations.opentelemetry.io
Deleting the CRD will remove all Instrumentation resources across the cluster.
Troubleshooting
Pods Not Starting
Check pod status:
kubectl get pods -n km-agent
kubectl describe pod <pod-name> -n km-agent
Common causes:
Insufficient resources (CPU/memory)
Image pull errors (check image name and credentials)
Node taints preventing scheduling
Missing required environment variables
Webhook Errors (GKE Private Clusters)
Error: failed calling webhook "...": Post "https://...:443/...": context deadline exceeded
Solution: Add a firewall rule for port 9443 (see GKE Private Cluster Configuration).
DaemonSet Pods Not Scheduled
Check node taints:
kubectl describe nodes | grep -A 5 Taints
Solution: Add matching tolerations in Helm values.
Instrumentation Not Working
Check operator logs:
kubectl logs -n km-agent deployment/km-operator -f
Verify CRD installation:
kubectl get crd instrumentations.opentelemetry.io
Check for Instrumentation resources in the application namespace:
kubectl get instrumentation -n <your-namespace>
Configuration Not Updating
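Check fleet manager logs:
kubectl logs -n km-agent deployment/km-fleet-manager -f
Verify connectivity from inside a pod (the single quotes make $KM_UPDATE_ENDPOINT expand in the container rather than in your local shell):
kubectl exec -n km-agent <pod-name> -- sh -c 'wget -O- "$KM_UPDATE_ENDPOINT"'
Check interval setting: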
kubectl get deployment km-agent-cluster -n km-agent -o yaml | grep CONFIG_CHECK_INTERVAL
High Memory Usage
Check resource usage:
kubectl top pods -n km-agent
Set resource limits:
resources:
  limits:
    memory: 512Mi
    cpu: 500m
  requests:
    memory: 256Mi
    cpu: 100m
Apply with:
helm upgrade kloudmate-release kloudmate/km-kube-agent -f values.yaml -n km-agent
Values Reference
For a complete list of configurable values, see the values.yaml file.
Key configuration sections:
# Platform configuration
API_KEY: "" # REQUIRED
COLLECTOR_ENDPOINT: "https://otel.kloudmate.com:4318"
KM_UPDATE_ENDPOINT: "https://api.kloudmate.com/agents/config-check"
KM_CONFIG_CHECK_INTERVAL: "30s"
# Feature flags
featuresEnabled:
  apm: false
  logs: false
  metrics: true
  traces: true
# Cluster configuration
clusterName: "km-cluster"
monitoredNamespaces: []
# Image settings
image:
  repository: ghcr.io/kloudmate/km-kube-agent
  pullPolicy: Always
  tag: "latest"
# Dependencies
opentelemetry-operator:
  enabled: true
cert-manager:
  enabled: false
Next Steps
Auto-Instrumentation: Configure automatic application instrumentation
Namespace Monitoring: Monitor specific namespaces and workloads
Configure Agent: Customize OpenTelemetry receivers and processors
Verify Installation: Confirm data is flowing to KloudMate
Support
For Kubernetes installation issues: