Overview
PentAGI provides enterprise-grade observability through OpenTelemetry integration, Langfuse LLM analytics, and comprehensive monitoring dashboards. Track every aspect of your AI-powered penetration testing, from agent performance to system metrics.
- LLM Observability: Track AI agent performance with Langfuse
- System Metrics: Real-time infrastructure monitoring
- Distributed Tracing: End-to-end request tracing with Jaeger
- Log Aggregation: Centralized logging with Loki
Architecture
Monitoring Stack
PentAGI uses a comprehensive observability stack.
Components
OpenTelemetry Collector
Purpose: Unified telemetry data collection and processing
Capabilities:
- Metrics collection and aggregation
- Distributed trace processing
- Log collection and forwarding
- Data enrichment and filtering
Langfuse
Purpose: LLM observability and performance analytics
Capabilities:
- Trace LLM API calls
- Monitor token usage and costs
- Analyze prompt performance
- Track agent execution
- Score generation quality
Access: http://localhost:4000
VictoriaMetrics
Purpose: High-performance time-series metrics storage
Capabilities:
- Long-term metrics retention
- Efficient storage compression
- Fast queries and aggregations
- Prometheus-compatible
Metrics tracked:
- Request rates and latencies
- Resource utilization (CPU, memory)
- Agent execution times
- Tool invocation counts
- Error rates
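Because VictoriaMetrics is Prometheus-compatible, these metrics can be queried with PromQL. The metric names below are illustrative, not PentAGI's exact series names:

```promql
# Request rate over the last 5 minutes
rate(http_requests_total[5m])

# 95th-percentile agent execution time
histogram_quantile(0.95, rate(agent_execution_seconds_bucket[5m]))
```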
Jaeger
Purpose: Distributed tracing for debugging
Capabilities:
- End-to-end request tracing
- Service dependency visualization
- Performance bottleneck identification
- Error propagation tracking
Each trace includes:
- Spans for each operation
- Parent-child relationships
- Timing information
- Contextual attributes
Loki
Purpose: Scalable log aggregation
Capabilities:
- Centralized log collection
- Label-based log indexing
- Efficient log storage
- Powerful query language (LogQL)
Log sources:
- Application logs
- Agent execution logs
- System logs
- Docker container logs
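These logs become queryable from Grafana via LogQL. The label and field names below are assumptions about how PentAGI labels its log streams:

```logql
# All lines containing "error" from the application container
{container="pentagi"} |= "error"

# Parse JSON logs and filter by level
{container="pentagi"} | json | level="error"
```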
Grafana
Purpose: Unified monitoring dashboards
Capabilities:
- Custom dashboard creation
- Multi-source data visualization
- Alerting and notifications
- Correlation of metrics, traces, and logs
LLM Observability
Langfuse Integration
PentAGI automatically tracks all LLM interactions.
Observation Types
Different observation types are used for different components.
Observation Metadata
Rich metadata enables filtering and analysis.
LLM Metrics Tracked
Token Usage
- Input tokens per request
- Output tokens per request
- Total tokens per session
- Cost estimation
Latency
- First token latency (TTFT)
- Total generation time
- API call duration
- Queue wait time
Quality Metrics
- Generation scores
- Hallucination detection
- Output relevance
- Task completion rate
Error Tracking
- API failures
- Rate limit errors
- Timeout occurrences
- Invalid responses
Langfuse Dashboard Views
- Traces: Complete agent execution traces
- Sessions: User session analytics
- Generations: Individual LLM generations
- Scores: Quality scoring and evaluation
- Datasets: Prompt testing and validation
- Analytics: Aggregate metrics and trends
System Metrics
Metrics Collection
Metrics are collected automatically.
Process Metrics
OS-level process metrics:
- CPU Usage: Process CPU utilization percentage
- Memory Usage: RSS, VMS, heap allocations
- File Descriptors: Open file descriptor count
- Threads: Thread count and goroutine count
- Network: Bytes sent/received
Go Runtime Metrics
Go-specific runtime metrics:
- Goroutines: Number of active goroutines
- Memory: Heap usage, GC stats, allocations
- GC: Garbage collection frequency and pause times
- Scheduler: Goroutine scheduling latency
Custom Metrics
Create custom metrics for specific monitoring needs.
Agent Performance Metrics
These metrics are tracked automatically for all agents.
Distributed Tracing
Trace Context
Trace context is propagated automatically across components.
Span Hierarchy
Parent-child span relationships are recorded automatically.
Span Attributes
Spans carry rich metadata for filtering and search.
Error Tracking
Errors are recorded on spans automatically.
Log Aggregation
Structured Logging
Logs are enriched automatically with contextual fields.
Log Levels
Supported log levels:
- Trace: Very detailed debugging
- Debug: Detailed debugging information
- Info: General informational messages
- Warn: Warning messages
- Error: Error messages
- Fatal: Critical errors causing termination
Log Integration
Logs are automatically correlated with traces.
Query Logs
Common log queries are written in LogQL.
Configuration
Enable Monitoring
Configure the observability components via environment variables.
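A sketch of the relevant `.env` settings. The Langfuse variables follow Langfuse's standard SDK names and the endpoint uses the standard OpenTelemetry exporter variable; PentAGI's exact keys may differ, so treat these as assumptions:

```bash
# OpenTelemetry Collector endpoint (standard OTLP exporter variable)
OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317

# Langfuse LLM analytics (standard Langfuse SDK variable names)
LANGFUSE_HOST=http://localhost:4000
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
```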
Start Monitoring Stack
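A typical startup brings up the observability services alongside the core stack with Docker Compose; the compose file name below is an assumption about PentAGI's layout:

```bash
# Start the core stack together with the observability services
docker compose -f docker-compose.yml -f docker-compose-observability.yml up -d
```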
Observability Levels
Configure verbosity as needed.
Grafana Dashboards
Pre-built Dashboards
Included dashboard examples:
- Agent Performance
  - Agent execution times
  - Success/failure rates
  - Tool usage distribution
- System Health
  - CPU and memory usage
  - Request rates
  - Error rates
- LLM Analytics
  - Token usage trends
  - API latency
  - Cost tracking
- Container Metrics
  - Docker container stats
  - Network I/O
  - Resource limits
Custom Dashboards
Create custom visualizations:
- Navigate to Grafana (http://localhost:3000)
- Click “Create” → “Dashboard”
- Add panels with queries:
- VictoriaMetrics: Prometheus queries
- Loki: LogQL queries
- Jaeger: Trace queries
Best Practices
Performance Monitoring
- Monitor agent execution times
- Track tool invocation patterns
- Set alerts for slow operations
- Review token usage regularly
- Optimize expensive operations
Error Tracking
- Review error logs daily
- Set up alerts for critical errors
- Track error rate trends
- Investigate error spikes promptly
- Document common error resolutions
Cost Optimization
- Monitor LLM token usage
- Track API costs per agent
- Identify expensive prompts
- Optimize context sizes
- Use cheaper models where appropriate
Capacity Planning
- Monitor resource utilization trends
- Track concurrent flow counts
- Plan for peak usage periods
- Set resource limits appropriately
- Scale horizontally when needed
Related Resources
Autonomous Testing
Understand agent execution flow
Security Tools
Tool execution tracking
Reporting
Report generation metrics
Architecture
System architecture overview