Overview
Starting with recent versions, the Cosmos SDK uses OpenTelemetry as its standard for instrumentation. The legacy go-metrics-based telemetry is deprecated in favor of OpenTelemetry’s unified observability framework.
OpenTelemetry Configuration
Configuration File
Telemetry is configured via otel.yaml in your node’s config directory:
config/otel.yaml
Initialization
Telemetry is initialized automatically from the config file:
telemetry/config.go
Environment Variable
Enable telemetry via an environment variable for early initialization:
Metrics
Module Metrics
The SDK automatically tracks module execution time:
x/upgrade/abci.go
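The timing pattern behind this kind of metric can be sketched with the standard library alone. The `recorder` type, its `measureSince` method, and the `upgrade` / `begin_blocker` names below are hypothetical stand-ins chosen for illustration; they are not the SDK's telemetry API, which may differ in names and signatures.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// recorder is a hypothetical in-memory sink standing in for a metrics backend.
type recorder struct {
	mu      sync.Mutex
	samples map[string][]time.Duration
}

func newRecorder() *recorder {
	return &recorder{samples: make(map[string][]time.Duration)}
}

// measureSince records the time elapsed since start under "<module>_<key>".
// Hypothetical helper, not the SDK's telemetry function.
func (r *recorder) measureSince(module, key string, start time.Time) {
	elapsed := time.Since(start)
	r.mu.Lock()
	defer r.mu.Unlock()
	name := module + "_" + key
	r.samples[name] = append(r.samples[name], elapsed)
}

// count returns how many samples were recorded under name.
func (r *recorder) count(name string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.samples[name])
}

// beginBlocker simulates a module hook whose duration is tracked.
func beginBlocker(r *recorder) {
	start := time.Now()
	// Arguments of a deferred call are evaluated here, so start is captured
	// now; the elapsed time is computed when beginBlocker returns.
	defer r.measureSince("upgrade", "begin_blocker", start)

	time.Sleep(2 * time.Millisecond) // placeholder for real work
}

func main() {
	r := newRecorder()
	beginBlocker(r)
	fmt.Println("samples recorded:", r.count("upgrade_begin_blocker"))
}
```

Deferring the measurement at the top of the hook means every return path, including early returns on error, is timed.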
Common Metric Keys
telemetry/wrapper.go
Custom Metrics
Add custom metrics in your modules:
Distributed Tracing
Creating Spans
Trace execution flow across your application:
Nested Spans
Create hierarchical traces:
Structured Logging
Log Levels
Use OpenTelemetry logging:
Check if Logging is Enabled
telemetry/config.go
Instrumentation Extensions
Host Metrics
Monitor host system metrics:
config/otel.yaml
- CPU usage
- Memory usage
- Disk I/O
- Network I/O
Runtime Metrics
Monitor Go runtime metrics:
config/otel.yaml
- Goroutines
- GC stats
- Memory allocations
- Stack usage
Disk I/O Metrics
Monitor disk operations:
config/otel.yaml
Exporters
OTLP Exporter
Export to an OpenTelemetry Collector:
Prometheus Exporter
Expose metrics for Prometheus scraping:
Console Exporter
Log telemetry to the console for debugging:
Context Propagation
Configure propagators for distributed tracing:
config/otel.yaml
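As background on what a propagator carries between services, the sketch below parses and formats a W3C Trace Context `traceparent` header (version `00`). It assumes the W3C Trace Context propagator is among the configured ones, which this page does not state, and the `traceParent` type and helper functions are hypothetical; in practice the OpenTelemetry propagator does this work.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// traceParent holds the fields of a version-00 traceparent header.
type traceParent struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters
	Sampled bool   // lowest bit of the trace-flags field
}

var errInvalidTraceParent = errors.New("invalid traceparent header")

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// allZero reports whether s consists only of '0' characters.
func allZero(s string) bool { return strings.Trim(s, "0") == "" }

// parseTraceParent validates and decodes "00-<trace-id>-<span-id>-<flags>".
func parseTraceParent(header string) (traceParent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, errInvalidTraceParent
	}
	traceID, spanID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(spanID, 16) || !isLowerHex(flags, 2) {
		return traceParent{}, errInvalidTraceParent
	}
	// All-zero IDs are defined as invalid by the specification.
	if allZero(traceID) || allZero(spanID) {
		return traceParent{}, errInvalidTraceParent
	}
	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return traceParent{}, errInvalidTraceParent
	}
	return traceParent{TraceID: traceID, SpanID: spanID, Sampled: f&0x01 == 1}, nil
}

// formatTraceParent renders the header value for an outgoing request.
func formatTraceParent(tp traceParent) string {
	flags := "00"
	if tp.Sampled {
		flags = "01"
	}
	return "00-" + tp.TraceID + "-" + tp.SpanID + "-" + flags
}

func main() {
	header := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	tp, err := parseTraceParent(header)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace=%s span=%s sampled=%t\n", tp.TraceID, tp.SpanID, tp.Sampled)
	fmt.Println("round trip ok:", formatTraceParent(tp) == header)
}
```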
Resource Attributes
Identify your service:
Sampling
Control trace sampling:
Legacy Telemetry (Deprecated)
The legacy go-metrics-based telemetry is deprecated:
telemetry/metrics.go
Shutdown
Shut down telemetry properly so that buffered data is flushed before the process exits:
telemetry/config.go
Best Practices
- Meaningful Names: Use descriptive metric and span names
- Cardinality: Avoid high-cardinality labels (e.g., user IDs)
- Sampling: Use sampling in high-throughput environments
- Resource Attributes: Set appropriate service identification
- Error Tracking: Record errors in spans for debugging
- Performance: Be mindful of telemetry overhead
- Privacy: Don’t log sensitive data in traces/logs
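The sampling advice above can be illustrated with a deterministic, ratio-based decision derived from the trace ID, so that every service seeing the same trace makes the same choice. This is a simplified sketch in the spirit of trace-ID ratio sampling, not the OpenTelemetry SDK's sampler; `ratioSampler` and its methods are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/rand"
)

// ratioSampler keeps roughly the given fraction of traces.
type ratioSampler struct {
	bound uint64 // trace IDs whose derived value falls below bound are sampled
}

// newRatioSampler clamps fraction to [0, 1] and precomputes the bound.
func newRatioSampler(fraction float64) ratioSampler {
	switch {
	case fraction <= 0:
		return ratioSampler{bound: 0}
	case fraction >= 1:
		return ratioSampler{bound: 1 << 63}
	}
	return ratioSampler{bound: uint64(fraction * (1 << 63))}
}

// shouldSample derives a 63-bit value from the last 8 bytes of the trace ID
// and compares it with the bound. The same ID always yields the same answer.
func (s ratioSampler) shouldSample(traceID [16]byte) bool {
	x := binary.BigEndian.Uint64(traceID[8:]) >> 1
	return x < s.bound
}

func main() {
	sampler := newRatioSampler(0.1)
	rng := rand.New(rand.NewSource(1)) // fixed seed for a repeatable demo

	const total = 10000
	kept := 0
	for i := 0; i < total; i++ {
		var id [16]byte
		binary.BigEndian.PutUint64(id[:8], rng.Uint64())
		binary.BigEndian.PutUint64(id[8:], rng.Uint64())
		if sampler.shouldSample(id) {
			kept++
		}
	}
	fmt.Printf("kept %d of %d traces\n", kept, total)
}
```

Because the decision depends only on the trace ID, a trace is either kept in full or dropped in full, which avoids partial traces.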
Monitoring Stack Example
Docker Compose Setup
docker-compose.yml
See Also
- OpenTelemetry Documentation
- Prometheus - Metrics collection
- Jaeger - Distributed tracing
- Grafana - Visualization