Overview
KloudMate Agent is an OpenTelemetry Collector distribution that extends the upstream collector with automated deployment, remote configuration management, and lifecycle orchestration. The agent architecture separates concerns between agent management and collector execution, enabling dynamic configuration updates without manual intervention.
Core Design Principles
The KloudMate Agent architecture is built on several key principles:
Separation of Concerns
The Agent manages lifecycle and configuration while the Collector handles telemetry processing
Remote Configuration
Configuration updates are pulled from remote APIs, eliminating the need for SSH access
Graceful Lifecycle
Atomic configuration updates with zero-downtime collector restarts
Multi-Platform
Unified architecture supports Linux, Docker, Kubernetes, and Windows deployments
System Components
The KloudMate Agent consists of several interconnected components:
Agent Layer
The agent layer is responsible for:
- Lifecycle Management: Starting, stopping, and restarting the OpenTelemetry Collector
- Configuration Watching: Periodically checking for configuration updates from remote APIs
- State Tracking: Monitoring agent and collector status for health reporting
- Service Integration: Running as a system service (systemd, Docker, Kubernetes)
internal/agent/agent.go
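The lifecycle and state-tracking responsibilities listed above can be pictured with a minimal sketch. It is not the contents of internal/agent/agent.go: the `Collector` interface, `Manager` type, and `idleCollector` placeholder are hypothetical names introduced only for illustration, and the sketch simply stops and then starts the collector rather than attempting a zero-downtime handover.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Collector is a hypothetical stand-in for the embedded OpenTelemetry Collector.
type Collector interface {
	// Run blocks until ctx is cancelled or the collector fails.
	Run(ctx context.Context) error
}

// Manager owns one collector at a time and serializes lifecycle calls.
type Manager struct {
	mu      sync.Mutex         // guards cancel and done
	cancel  context.CancelFunc // stops the current collector
	done    chan struct{}      // closed when the collector goroutine exits
	running atomic.Bool        // lock-free status flag for health reporting
	factory func() Collector   // builds a fresh collector for every start
}

func NewManager(factory func() Collector) *Manager { return &Manager{factory: factory} }

// Running reports the collector status without taking the mutex.
func (m *Manager) Running() bool { return m.running.Load() }

// Start launches the collector in a background goroutine.
func (m *Manager) Start() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.startLocked()
}

// Stop cancels the collector and waits for it to exit.
func (m *Manager) Stop() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.stopLocked()
}

// Restart stops and starts under one lock so no other lifecycle call can interleave.
func (m *Manager) Restart() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.stopLocked()
	return m.startLocked()
}

func (m *Manager) startLocked() error {
	if m.running.Load() {
		return errors.New("collector already running")
	}
	m.stopLocked() // clean up after a collector that exited on its own
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	m.cancel, m.done = cancel, done
	collector := m.factory()
	m.running.Store(true)
	go func() {
		defer close(done)            // runs last: signals that the goroutine finished
		defer m.running.Store(false) // runs first: status flips before waiters wake up
		if err := collector.Run(ctx); err != nil && !errors.Is(err, context.Canceled) {
			fmt.Println("collector exited with error:", err)
		}
	}()
	return nil
}

func (m *Manager) stopLocked() {
	if m.cancel == nil {
		return
	}
	m.cancel()
	<-m.done
	m.cancel, m.done = nil, nil
}

// idleCollector is a placeholder that runs until cancelled.
type idleCollector struct{}

func (idleCollector) Run(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

func main() {
	m := NewManager(func() Collector { return idleCollector{} })
	if err := m.Start(); err != nil {
		fmt.Println("start:", err)
		return
	}
	fmt.Println("running after start:", m.Running())
	if err := m.Restart(); err != nil {
		fmt.Println("restart:", err)
		return
	}
	fmt.Println("running after restart:", m.Running())
	m.Stop()
	fmt.Println("running after stop:", m.Running())
}
```

The mutex keeps start, stop, and restart from overlapping, while the atomic flag lets a status reporter read the collector state without waiting on a lifecycle operation in progress.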
Collector Layer
The collector layer uses the upstream OpenTelemetry Collector with a curated set of components:
- Receivers: Collect telemetry data from various sources (host metrics, logs, traces)
- Processors: Transform, filter, and enrich telemetry data
- Exporters: Send telemetry to backends (OTLP endpoints)
- Extensions: Provide additional functionality (health checks, pprof)
Configuration Updater
The updater component handles remote configuration synchronization:
- Polls remote API at configurable intervals (default: 60 seconds)
- Compares local and remote configurations
- Triggers collector restart when configuration changes
- Reports agent and collector status to the API
internal/updater/updater.go
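The poll, compare, and restart cycle described above might look roughly like the following. This is a sketch under stated assumptions, not the contents of internal/updater/updater.go: the `Updater` type and its `Fetch` and `Apply` hooks are hypothetical, the comparison uses a SHA-256 digest as one possible way to detect changes, and status reporting to the API is left out.

```go
package main

import (
	"context"
	"crypto/sha256"
	"fmt"
	"time"
)

// Updater polls a remote source and applies configuration changes.
// Fetch and Apply are hypothetical hooks, not the agent's real API.
type Updater struct {
	Interval time.Duration          // polling interval (the documented default is 60 seconds)
	Fetch    func() ([]byte, error) // retrieves the remote configuration
	Apply    func(cfg []byte) error // e.g. persist the config and restart the collector
	current  [sha256.Size]byte      // digest of the configuration currently in use
}

// SetLocal records the configuration that is already running.
func (u *Updater) SetLocal(cfg []byte) { u.current = sha256.Sum256(cfg) }

// CheckOnce fetches the remote configuration and applies it only when its
// digest differs from the local one. It reports whether a change was applied.
func (u *Updater) CheckOnce() (bool, error) {
	remote, err := u.Fetch()
	if err != nil {
		return false, fmt.Errorf("fetch config: %w", err)
	}
	sum := sha256.Sum256(remote)
	if sum == u.current {
		return false, nil // nothing changed, leave the collector alone
	}
	if err := u.Apply(remote); err != nil {
		return false, fmt.Errorf("apply config: %w", err)
	}
	u.current = sum
	return true, nil
}

// Run polls at the configured interval until ctx is cancelled.
func (u *Updater) Run(ctx context.Context) {
	ticker := time.NewTicker(u.Interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if _, err := u.CheckOnce(); err != nil {
				fmt.Println("config check failed:", err)
			}
		}
	}
}

func main() {
	remote := []byte("receivers: {}\n")
	u := &Updater{
		Interval: 50 * time.Millisecond, // shortened for the demo
		Fetch:    func() ([]byte, error) { return remote, nil },
		Apply: func(cfg []byte) error {
			fmt.Printf("applying %d bytes and restarting collector\n", len(cfg))
			return nil
		},
	}
	u.SetLocal([]byte("old config"))
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	u.Run(ctx)
}
```

Because the digest is updated only after `Apply` succeeds, a failed apply is retried on the next tick instead of being silently skipped.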
Deployment Modes
KloudMate Agent supports multiple deployment modes, each optimized for specific environments:
- Host Agent
- Docker Agent
- Kubernetes Agent
Host Agent
Runs as a system service on Linux or Windows hosts. Collects host-level metrics, logs, and traces.
Use Cases:
- Bare metal servers
- Virtual machines
- Traditional infrastructure
Communication Flow
Agent Initialization
The agent starts and loads its configuration from environment variables, config files, or CLI flags.
cmd/kmagent/main.go
Collector Startup
The agent creates and starts an OpenTelemetry Collector instance with the current configuration.
internal/agent/agent.go
Configuration Watching
The agent periodically polls the remote API for configuration updates, sending status information.
internal/agent/agent.go
State Management
The agent maintains state using atomic operations and mutexes to ensure thread-safe access.
Configuration Sources
The agent supports multiple configuration sources with a priority hierarchy:
Environment Variables
Highest priority. Used for runtime configuration:
- KM_API_KEY: Authentication key for remote endpoints
- KM_COLLECTOR_ENDPOINT: OpenTelemetry exporter endpoint
- KM_CONFIG_CHECK_INTERVAL: Interval for configuration polling
- KM_UPDATE_ENDPOINT: Remote configuration API endpoint
CLI Flags
Command-line arguments override defaults:
- --config: Path to collector configuration file
- --api-key: API key for authentication
- --collector-endpoint: Exporter endpoint
- --config-check-interval: Update check interval
Configuration File
YAML configuration file (platform-specific paths):
- Linux: /etc/kmagent/config.yaml
- Windows: <executable-dir>/config.yaml
- Docker: /etc/kmagent/config.yaml
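As an illustration of the platform-specific paths, the lookup could be sketched as below. The helper name `defaultConfigPath` is hypothetical, and treating every non-Windows platform like Linux is an assumption of this sketch, not documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// defaultConfigPath returns the default configuration file location for the
// given OS name (as reported by runtime.GOOS) and executable path.
// Assumption: anything that is not Windows uses the Linux/Docker path.
func defaultConfigPath(goos, exePath string) string {
	if goos == "windows" {
		// <executable-dir>/config.yaml
		return filepath.Join(filepath.Dir(exePath), "config.yaml")
	}
	return "/etc/kmagent/config.yaml"
}

func main() {
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine executable path:", err)
		os.Exit(1)
	}
	fmt.Println(defaultConfigPath(runtime.GOOS, exe))
}
```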
Remote API
Configuration pulled from KloudMate API:
- Dynamic updates without manual restarts
- Centralized configuration management
- Version-specific configurations
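To make the priority hierarchy concrete, here is a small sketch that resolves the configuration check interval. It assumes the order environment variable, then CLI flag, then built-in default, and it assumes the value is a whole number of seconds; the page only states that environment variables have the highest priority and that flags override defaults. The function `resolveInterval` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// defaultIntervalSeconds mirrors the documented 60-second default.
const defaultIntervalSeconds = 60

// resolveInterval applies the assumed precedence: environment > flag > default.
// envVal is the raw environment value ("" when unset); flagVal is the parsed
// flag value (0 when the flag was not given).
func resolveInterval(envVal string, flagVal int) (int, error) {
	if envVal != "" {
		n, err := strconv.Atoi(envVal)
		if err != nil || n <= 0 {
			return 0, fmt.Errorf("invalid KM_CONFIG_CHECK_INTERVAL %q", envVal)
		}
		return n, nil
	}
	if flagVal > 0 {
		return flagVal, nil
	}
	return defaultIntervalSeconds, nil
}

func main() {
	flagVal := flag.Int("config-check-interval", 0, "update check interval in seconds (assumed unit)")
	flag.Parse()

	seconds, err := resolveInterval(os.Getenv("KM_CONFIG_CHECK_INTERVAL"), *flagVal)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("checking for configuration updates every %d seconds\n", seconds)
}
```

Rejecting a malformed environment value, rather than silently falling back to the flag, keeps a typo in the highest-priority source from going unnoticed.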
Security Considerations
Observability
The agent provides multiple observability mechanisms:
- Structured Logging: JSON-formatted logs with configurable levels
- Status Reporting: Agent and collector status sent to remote API
- Error Tracking: Collector errors captured and reported
- Health Checks: Built-in health check extensions in the collector
internal/agent/agent.go
Next Steps
Host Agent
Learn about the host agent architecture and lifecycle management
Kubernetes Agent
Understand the Kubernetes agent deployment model
Collector Lifecycle
Deep dive into collector lifecycle and restart mechanisms
Configuration
Configure your agent for different scenarios