## Overview
OpenTelemetry is a vendor-neutral observability framework that provides:

- Unified Collection: Single endpoint for all telemetry data
- Data Processing: Transform, filter, and enrich observability data
- Multiple Exporters: Send data to various backends simultaneously
- Standards-Based: Industry-standard OTLP protocol support
- Extensible: Rich ecosystem of receivers, processors, and exporters
## Architecture
The OpenTelemetry Collector acts as the central data pipeline:
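The exact wiring depends on your deployment; based on the flows described under Telemetry Types below, the pipeline looks roughly like this:

```
PentAGI ─────────┐                    ┌──▶ Jaeger           (traces)
Node Exporter ───┼──▶ OTel Collector ─┼──▶ VictoriaMetrics  (metrics)
cAdvisor ────────┘                    └──▶ Loki             (logs)
```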
## Setup

### Configure OpenTelemetry Endpoint
Enable OTel in your `.env` file:
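A minimal sketch; the hostname and port are assumptions for a Docker Compose deployment (4317 is the standard OTLP gRPC port), so adjust them to match your stack:

```env
# Hostname (and optional port) of the OTel Collector; value is deployment-specific
OTEL_HOST=otelcol:4317
```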
PentAGI will automatically send telemetry to the OTel collector when `OTEL_HOST` is set.

## Configuration
The OTel Collector is configured via `/observability/otel/config.yml`.
### Receivers

Data collection endpoints:
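A representative sketch, not PentAGI's shipped file: an OTLP receiver for application telemetry plus a Prometheus receiver scraping the Node Exporter and cAdvisor targets mentioned below (hostnames are assumptions):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP over gRPC
      http:
        endpoint: 0.0.0.0:4318   # OTLP over HTTP
  prometheus:
    config:
      scrape_configs:
        - job_name: node-exporter
          scrape_interval: 15s
          static_configs:
            - targets: [node-exporter:9100]
        - job_name: cadvisor
          scrape_interval: 15s
          static_configs:
            - targets: [cadvisor:8080]
```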
### Processors

Data transformation and filtering:
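For example, the two processors referenced throughout this page, a memory limiter and a batcher (values illustrative):

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512        # hard memory cap for the collector
    spike_limit_mib: 128  # headroom for short bursts
  batch:
    send_batch_size: 512  # flush once this many items are buffered
    timeout: 5s           # ...or after this long, whichever comes first
```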
### Exporters

Data output destinations:
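A sketch matching the flows in Telemetry Types below; exporter names and endpoints are assumptions for a Compose network (the `loki` and `prometheusremotewrite` exporters ship with the contrib distribution):

```yaml
exporters:
  otlp/jaeger:                     # Jaeger accepts OTLP natively
    endpoint: jaeger:4317
    tls:
      insecure: true               # only acceptable inside a trusted network
  prometheusremotewrite:           # VictoriaMetrics speaks Prometheus remote-write
    endpoint: http://victoriametrics:8428/api/v1/write
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
```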
### Pipelines

Data flow configuration:
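Tying the sketches above together (the names must match the receiver, processor, and exporter keys you actually define):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp, prometheus]
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loki]
```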
## Telemetry Types

### Traces

Distributed tracing data:

- Source: PentAGI application spans
- Flow: PentAGI → OTel → Jaeger
- Usage: Track request flow through the system
### Metrics

Numerical measurements over time:

- Sources:
  - PentAGI application metrics (OTLP)
  - Node Exporter (system metrics)
  - cAdvisor (container metrics)
  - Component health checks
- Flow: Sources → OTel → VictoriaMetrics
- Usage: Monitor performance and resource usage
### Logs

Structured log events:

- Source: PentAGI application logs
- Flow: PentAGI → OTel → Loki
- Usage: Debug issues and audit operations
## Integration

### PentAGI Integration

PentAGI automatically sends telemetry once `OTEL_HOST` is configured in `.env` (see Setup above). When enabled, it will:
- Create spans for agent operations
- Export application metrics
- Send structured logs
- Include trace context in all operations
### Langfuse Integration

Connect Langfuse to OTel for unified observability (a hedged `.env` sketch follows this list). This provides:
- LLM traces in Jaeger
- Langfuse metrics in Grafana
- Unified log aggregation
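Langfuse's own variable names depend on its version; if it uses a standard OTel SDK, the spec-defined exporter variables below apply (treat the endpoint value as an assumption for a Compose network):

```env
# Standard OTel SDK exporter settings (defined by the OpenTelemetry spec)
OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4318
OTEL_SERVICE_NAME=langfuse
```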
## Monitoring

### Collector Health

Built-in health endpoints:
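With the `health_check` extension enabled, the collector answers on port 13133 by default, and its own Prometheus metrics are served on port 8888; a quick check (host/port are assumptions for a local deployment):

```bash
# Liveness: returns 200 when the collector is up (health_check extension)
curl -sf http://localhost:13133/ && echo "collector healthy"

# The collector's internal metrics, including those in the table below
curl -s http://localhost:8888/metrics | head
```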
### Performance Metrics

Key metrics to monitor:

| Metric | Description |
|---|---|
| `otelcol_receiver_accepted_spans` | Spans received |
| `otelcol_receiver_refused_spans` | Spans rejected |
| `otelcol_exporter_sent_spans` | Spans exported |
| `otelcol_processor_batch_batch_send_size` | Batch sizes |
| `otelcol_processor_batch_timeout_trigger` | Batch timeouts |
### Resource Usage

Monitor collector resource consumption:
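For a Docker deployment, `docker stats` gives a point-in-time view (the container name is an assumption):

```bash
# One-shot CPU/memory snapshot for the collector container
docker stats --no-stream otelcol
```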
## Troubleshooting

### No Data Flowing

Verify the collector is receiving data:
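Two quick checks, assuming a container named `otelcol`: the collector's logs, and its receiver counters (if `otelcol_receiver_accepted_spans` stays at zero, nothing is arriving):

```bash
# Any receiver/exporter errors show up in the collector's own logs
docker logs --tail 50 otelcol

# Receiver counters should be increasing while PentAGI is active
curl -s http://localhost:8888/metrics | grep otelcol_receiver_accepted
```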
### Connection Refused

Check network connectivity:
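Verify the OTLP ports are reachable from where PentAGI runs (hostname and ports are assumptions):

```bash
# Can we reach the OTLP gRPC port from the application's network?
nc -zv otelcol 4317

# Is anything listening on the OTLP ports locally?
ss -tlnp | grep -E '4317|4318'
```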
### High Memory Usage

Optimize the collector configuration:
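The usual levers are a tighter memory limiter and smaller batches; values here are illustrative starting points:

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 400        # start refusing data before the container OOMs
    spike_limit_mib: 100
  batch:
    send_batch_size: 256  # smaller batches hold less in memory
    timeout: 2s
```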
### Export Failures

Debug exporter issues:
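Raising the collector's own log level usually surfaces the failing exporter; this uses the standard `service.telemetry` setting:

```yaml
service:
  telemetry:
    logs:
      level: debug   # per-export errors and retry attempts become visible
```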
## Advanced Configuration

### Sampling

Reduce trace volume:
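For example, the contrib `probabilistic_sampler` processor keeps a fixed fraction of traces; add it to the traces pipeline:

```yaml
processors:
  probabilistic_sampler:
    sampling_percentage: 10   # keep roughly 10% of traces
```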
### Filtering

Drop unwanted data:
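The contrib `filter` processor drops matching data; this sketch assumes health-check spans carry an `http.route` attribute, which may not match your instrumentation:

```yaml
processors:
  filter/noise:
    error_mode: ignore
    traces:
      span:
        - 'attributes["http.route"] == "/healthz"'   # drop health-check spans
```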
### Enrichment

Add context to telemetry:
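The `resource` processor stamps attributes onto everything passing through a pipeline; the key/value here are illustrative:

```yaml
processors:
  resource:
    attributes:
      - key: deployment.environment
        value: production        # illustrative value
        action: upsert           # add, or overwrite if already present
```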
### Multiple Backends

Export to multiple destinations:
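A pipeline fans out to every exporter it lists, so a second destination is one more entry; the `otlp/backup` name and endpoint below are hypothetical:

```yaml
exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  otlp/backup:                  # hypothetical second backend
    endpoint: backup-collector:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger, otlp/backup]   # fan out to both
```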
## Best Practices

### Configuration Management

- Version control your `config.yml`
- Use environment variables for secrets
- Document custom configuration changes
- Test changes in development first
- Keep backups of working configurations
### Performance Optimization
- Enable batching for all pipelines
- Use appropriate batch sizes (100-1000)
- Configure memory limiters
- Monitor collector resource usage
- Scale horizontally if needed
### Security
- Use TLS for production deployments
- Restrict network access to OTel ports
- Sanitize sensitive data in processors
- Implement authentication on receivers
- Audit configuration regularly
### Reliability

- Configure retry policies for exporters
- Use persistent queues for critical data (a hedged sketch of both follows this list)
- Monitor collector health continuously
- Set up redundant collectors
- Test failover scenarios
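Retries and persistent queues are exporter-level settings (`retry_on_failure`, `sending_queue`); the persistent queue additionally needs the `file_storage` extension. A hedged sketch with illustrative values:

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/queue   # path is an assumption

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 300s   # give up after 5 minutes
    sending_queue:
      enabled: true
      storage: file_storage    # queued data survives collector restarts

service:
  extensions: [file_storage]
```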
## Related Documentation
- Grafana - Visualization and dashboards
- Langfuse - LLM observability integration
- Observability Guide - Complete setup guide