The KloudMate Agent is built on the OpenTelemetry Collector and supports a rich set of receivers, processors, and exporters. This guide covers the default configurations and customization options.

Pipeline Architecture

The collector uses a pipeline architecture where telemetry data flows through three stages:

  • Receivers: collect telemetry from various sources
  • Processors: transform, filter, and enrich data
  • Exporters: send data to backends
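In the service section, these stages are assembled into per-signal pipelines. A minimal sketch using components that appear later in this guide:

```yaml
service:
  pipelines:
    metrics:
      receivers: [hostmetrics, otlp]   # collect
      processors: [batch]              # transform
      exporters: [otlphttp]            # send
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlphttp]
```

A component only takes effect once it is referenced in a pipeline; defining it under receivers, processors, or exporters alone is not enough.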

Default Configurations

The agent includes three optimized configurations based on deployment mode:

Host Configuration

For bare metal and VM deployments (host-col-config.yaml):
receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      load:
        cpu_average: true
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
      filesystem:
        metrics:
          system.filesystem.usage:
            enabled: true
          system.filesystem.utilization:
            enabled: true
      disk: {}
      network: {}
  
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

Docker Configuration

For Docker deployments (docker-col-config.yaml):
receivers:
  docker_stats:
    endpoint: "unix:///var/run/docker.sock"
    collection_interval: 60s
    timeout: 10s
  
  filelog:
    include:
      - /var/log/**/*.log
      - /var/lib/docker/containers/**/*.log*
      - ${env:FILELOG_PATHS}
    exclude:
      - /var/log/kmagent*/**/*.log
      - '**/*.gz'
      - '**/*.zip'
      - '**/*.tar'
    include_file_name_resolved: true
    include_file_path: true
    max_log_size: "1MiB"
  
  hostmetrics:
    collection_interval: 60s
    root_path: /hostfs
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
      disk:
      network:
      filesystem:
        exclude_fs_types:
          fs_types:
            - autofs
            - overlay
            - tmpfs
            - devtmpfs
          match_type: strict

Kubernetes Configuration

For Kubernetes DaemonSet deployments (daemonset-col-config.yaml):
receivers:
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 30s
    endpoint: ${env:KM_NODE_NAME}:10250
    insecure_skip_verify: true
    k8s_api_config:
      auth_type: serviceAccount
    metric_groups:
      - volume
      - node
      - pod
      - container
    metrics:
      k8s.container.cpu_limit_utilization:
        enabled: true
      k8s.container.cpu_request_utilization:
        enabled: true
      k8s.container.memory_limit_utilization:
        enabled: true
      k8s.pod.cpu_limit_utilization:
        enabled: true
      k8s.pod.memory_request_utilization:
        enabled: true
  
  filelog/containers:
    include:
      - /var/log/pods/*/*/*.log
    exclude:
      - /**/*.gz
      - /var/log/pods/km-agent_*/**/*.log
      - ${env:KM_XLOG_PATHS:-/___no_exclude___}
    include_file_name_resolved: true
    max_log_size: 1MiB
    poll_interval: 10s
    operators:
      # Parse container log format
      - id: container-parser
        type: container
      
      # Recombine multiline logs
      - id: recombine-multiline
        type: recombine
        combine_field: body
        is_first_entry: body matches "^(\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}:\\d{2}|\\{)"
        combine_with: "\n"
        max_log_size: 1048576
      
      # Parse JSON logs
      - id: parser-json
        type: json_parser
        parse_from: body
        parse_to: attributes.parsed_json
        on_error: send

Receivers

Host Metrics Receiver

Collects system metrics from the host:
collection_interval (duration, default "60s")
How often to collect metrics.

root_path (string, default "/")
Root path for metrics collection. Use /hostfs when running in containers.

scrapers (object)
Which metric types to collect:
  • cpu: CPU utilization and frequency
  • memory: Memory usage and utilization
  • disk: Disk I/O operations
  • filesystem: Filesystem usage and utilization
  • network: Network I/O and errors
  • load: System load average
  • processes: Process count
  • process: Per-process metrics
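The processes and process scrapers are not enabled in the default configurations. A sketch of turning them on (mute_process_name_error is an upstream scraper option that suppresses per-process name lookup errors on restricted platforms):

```yaml
receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      processes: {}            # process counts by state
      process:                 # per-process CPU, memory, and disk metrics
        mute_process_name_error: true
```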

OTLP Receiver

Receives telemetry via OTLP protocol:
otlp:
  protocols:
    grpc:
      endpoint: 0.0.0.0:4317
    http:
      endpoint: 0.0.0.0:4318
Use cases:
  • Receive traces from instrumented applications
  • Forward data from other collectors
  • Accept metrics from Prometheus exporters (via OTLP conversion)
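For the forwarding use case, another collector can point its own OTLP exporter at the agent's receiver. A sketch, where agent-host is a placeholder for the agent's address:

```yaml
# Configuration for the upstream collector, not the agent itself
exporters:
  otlp:
    endpoint: agent-host:4317
    tls:
      insecure: true   # the agent's default OTLP receiver is plaintext
```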

Kubelet Stats Receiver

Collects Kubernetes metrics from the kubelet API:
endpoint (string, required)
Kubelet endpoint. Use ${env:KM_NODE_NAME}:10250 for DaemonSets.

auth_type (string, default "serviceAccount")
Authentication method. Use serviceAccount for in-cluster.

metric_groups (array)
Which metric groups to collect: volume, node, pod, container.

Filelog Receiver

Collects logs from files:
include (array, required)
File patterns to include (glob format).

exclude (array)
File patterns to exclude.

operators (array)
Log parsing operators (regex, json, multiline, etc.).
Example operators from Kubernetes configuration:
operators:
  # Parse container format: timestamp + stream + log
  - id: container-parser
    type: container
  
  # Recombine multiline logs (stack traces)
  - id: recombine-multiline
    type: recombine
    combine_field: body
    is_first_entry: body matches "^(\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}:\\d{2}|\\{)"
  
  # Try JSON parsing
  - id: parser-json
    type: json_parser
    parse_from: body
    on_error: send
  
  # Extract timestamp from JSON
  - id: extract-timestamp-json
    type: move
    from: attributes.parsed_json.timestamp
    to: attributes.timestamp_extracted
  
  # Parse log level
  - id: extract-level-json
    type: move
    from: attributes.parsed_json.level
    to: attributes.log_level
  
  # Set severity
  - id: severity-parser
    type: severity_parser
    parse_from: attributes.log_level

Docker Stats Receiver

Collects metrics from Docker containers:
docker_stats:
  endpoint: "unix:///var/run/docker.sock"
  collection_interval: 60s
  timeout: 10s
Metrics collected:
  • Container CPU usage
  • Memory usage and limits
  • Network I/O
  • Block I/O

Processors

Batch Processor

Batches telemetry for efficient export:
send_batch_size (integer, default 10000)
Number of items to batch before sending.

timeout (duration, default "10s")
Maximum time to wait before sending a partial batch.
batch:
  send_batch_size: 10000
  timeout: 10s

Resource Detection Processor

Automatically detects resource attributes:
resourcedetection:
  detectors: [env, system, docker]
  override: false
  system:
    hostname_sources: [os]
    resource_attributes:
      host.name:
        enabled: true
      host.id:
        enabled: true
      host.ip:
        enabled: true
Detectors:
  • env: Read from OTEL_RESOURCE_ATTRIBUTES
  • system: Hostname, OS, architecture
  • docker: Container ID and name

K8s Attributes Processor

Enriches telemetry with Kubernetes metadata:
k8sattributes:
  auth_type: serviceAccount
  passthrough: false
  filter:
    node_from_env_var: KM_NODE_NAME
  extract:
    metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.container.name
  pod_association:
    - sources:
        - from: resource_attribute
          name: k8s.pod.uid
Extracted attributes:
  • Pod name, UID, IP
  • Namespace
  • Deployment/StatefulSet/DaemonSet name
  • Container name and image
  • Node name

Transform Processor

Advanced data transformation using OTTL (OpenTelemetry Transformation Language):
transform/ratecalculation/copymetric:
  error_mode: ignore
  metric_statements:
    - context: metric
      statements:
        - copy_metric(name="system.network.io.rate") where metric.name == "system.network.io"
        - copy_metric(name="system.disk.io.rate") where metric.name == "system.disk.io"

transform/ratecalculation/sumtogauge:
  error_mode: ignore
  metric_statements:
    - context: metric
      statements:
        - convert_sum_to_gauge() where metric.name == "system.network.io"
        - convert_sum_to_gauge() where metric.name == "system.disk.io"
Common transformations:
  • Convert cumulative to delta metrics
  • Calculate rates from counters
  • Copy metrics for parallel processing
  • Add/modify attributes
  • Filter data

Attributes Processor

Add, update, or delete attributes:
attributes/metrics:
  actions:
    - key: k8s.cluster.name
      value: ${env:KM_CLUSTER_NAME}
      action: insert
    - key: environment
      value: production
      action: upsert
Actions:
  • insert: Add if not exists
  • update: Modify if exists
  • upsert: Add or modify
  • delete: Remove attribute

Cumulative to Delta Processor

Converts cumulative metrics to delta:
cumulativetodelta:
  include:
    match_type: strict
    metrics:
      - system.network.io
      - system.disk.io
      - k8s.pod.network.io.rate

Delta to Rate Processor

Calculates rates from delta metrics:
deltatorate:
  metrics:
    - system.network.io
    - system.disk.io
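To derive a per-second rate from a cumulative counter, the two processors are chained in order within a metrics pipeline. A minimal sketch:

```yaml
service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [cumulativetodelta, deltatorate, batch]   # cumulative -> delta -> rate
      exporters: [otlphttp]
```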

Exporters

OTLP HTTP Exporter

Exports telemetry via OTLP/HTTP to KloudMate:
otlphttp:
  sending_queue:
    enabled: true
    num_consumers: 10      # Parallel workers
    queue_size: 10000      # Buffer size
  endpoint: ${env:KM_COLLECTOR_ENDPOINT}
  headers:
    Authorization: ${env:KM_API_KEY}
sending_queue.enabled (boolean, default true)
Enable the sending queue, which buffers data in memory for reliable delivery.

sending_queue.num_consumers (integer, default 10)
Number of parallel export workers.

sending_queue.queue_size (integer, default 10000)
Maximum number of items to buffer; new data is rejected once the queue is full.
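The sending queue pairs naturally with the retry settings from the collector's standard exporter helper; a sketch (the intervals shown are illustrative, not KloudMate defaults):

```yaml
otlphttp:
  endpoint: ${env:KM_COLLECTOR_ENDPOINT}
  retry_on_failure:
    enabled: true
    initial_interval: 5s     # delay before the first retry
    max_interval: 30s        # upper bound on backoff between retries
    max_elapsed_time: 300s   # give up on a batch after this long
```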

Debug Exporter

Outputs telemetry to logs for troubleshooting:
debug:
  verbosity: basic  # basic, normal, detailed
Verbosity levels:
  • basic: Count of items exported
  • normal: Summary of items
  • detailed: Full item contents
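To inspect data without interrupting export to KloudMate, the debug exporter can be added alongside otlphttp; listing both fans each batch out to both exporters. A sketch for a logs pipeline:

```yaml
exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlphttp, debug]   # fan out to both exporters
```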

Extensions

Health Check

Provides health check endpoint:
extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
Endpoints:
  • http://localhost:13133/ - Health check

Custom Configuration

To customize the collector configuration:
1. Export Current Configuration

# Linux
sudo cp /etc/kmagent/config.yaml /etc/kmagent/config.yaml.backup

# Kubernetes
kubectl get configmap km-agent-configmap-daemonset -n km-agent -o yaml > config-backup.yaml
2. Modify Configuration

Edit the YAML file to add/modify components:
receivers:
  # Add custom receiver
  prometheus:
    config:
      scrape_configs:
        - job_name: 'my-app'
          static_configs:
            - targets: ['localhost:9090']

processors:
  # Add custom processor
  filter:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - ^test\..*

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, prometheus]  # Add prometheus
      processors: [filter, batch]           # Add filter
3. Apply Configuration

# Linux: Restart service
sudo systemctl restart kmagent

# Docker: Restart the container (changes to a config mounted from the host take effect on restart)
docker restart kmagent

# Kubernetes: Update ConfigMap
kubectl apply -f config-backup.yaml
kubectl rollout restart daemonset km-agent -n km-agent
4. Verify Configuration

# Check logs
kmagent logs

# Verify metrics
curl http://localhost:13133/

Note: Custom configurations may be overwritten by remote configuration updates. To preserve custom settings, disable remote updates by setting KM_CONFIG_CHECK_INTERVAL=0.

Next Steps

  • Environment Variables: reference for all configuration variables
  • Remote Configuration: manage configuration dynamically
  • Troubleshooting: resolve collector issues
  • OpenTelemetry Docs: official collector documentation
