Overview
The monitor node connects to remote servers via SSH and executes health check commands for each configured service. It maintains a real-time snapshot of service states and immediately detects failures.
The monitor node runs as the first step in the agent's workflow loop, implementing continuous monitoring until a failure is detected.
Core Functionality
SSH Connection Establishment
The node establishes an SSH connection using credentials from the configuration.
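The connection code itself is not reproduced here, and this page does not say which SSH library the node uses. As a minimal sketch under those caveats, the example below shells out to the OpenSSH `ssh` client through `subprocess`. The `SSHConfig` field names and both helper functions are hypothetical and only illustrate how credentials from a configuration object could be turned into a non-interactive connection.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SSHConfig:
    # Hypothetical field names; the real configuration schema is not shown.
    host: str
    username: str
    port: int = 22
    key_path: Optional[str] = None
    connect_timeout: int = 10


def build_ssh_argv(cfg: SSHConfig, remote_command: str) -> List[str]:
    """Build the argument list for a non-interactive OpenSSH invocation."""
    argv = [
        "ssh",
        "-p", str(cfg.port),
        "-o", "BatchMode=yes",  # never prompt for a password
        "-o", f"ConnectTimeout={cfg.connect_timeout}",
    ]
    if cfg.key_path:
        argv += ["-i", cfg.key_path]  # private key used as the credential
    argv += [f"{cfg.username}@{cfg.host}", remote_command]
    return argv


def run_remote(cfg: SSHConfig, remote_command: str) -> str:
    """Run one command on the remote host and return its stdout."""
    result = subprocess.run(
        build_ssh_argv(cfg, remote_command),
        capture_output=True,
        text=True,
        timeout=cfg.connect_timeout + 30,
    )
    # OpenSSH exits with status 255 when ssh itself fails (for example,
    # when the connection is refused), as opposed to the remote command.
    if result.returncode == 255:
        raise ConnectionError(result.stderr.strip() or "SSH connection failed")
    return result.stdout
```

Raising `ConnectionError` for exit status 255 keeps connection problems distinguishable from a health check command that ran but reported an unhealthy service.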
Service Health Checks
The node iterates through all configured services and executes each service's health check command.
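The loop can be sketched as follows. This is an illustration, not the node's actual code: `ServiceConfig`, its `name` and `check_command` fields, and the `run_command` callable are assumptions, while `running_indicator` is the property named in the Service Configuration section. The command runner is injected so the logic can be exercised without a live SSH session.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ServiceConfig:
    name: str               # hypothetical field name
    check_command: str      # hypothetical field name
    running_indicator: str  # property named in this document


def check_services(
    services: Iterable[ServiceConfig],
    run_command: Callable[[str], str],
) -> Dict[str, bool]:
    """Return a snapshot mapping each service name to True if it looks healthy."""
    snapshot: Dict[str, bool] = {}
    for service in services:
        output = run_command(service.check_command)
        # A service counts as running when its indicator appears in the output.
        snapshot[service.name] = service.running_indicator in output
    return snapshot


if __name__ == "__main__":
    # Canned outputs stand in for a real remote host.
    fake_outputs = {
        "systemctl status nginx": "Active: active (running) since Mon",
        "systemctl status redis": "Active: inactive (dead)",
    }
    services = [
        ServiceConfig("nginx", "systemctl status nginx", "active (running)"),
        ServiceConfig("redis", "systemctl status redis", "active (running)"),
    ]
    print(check_services(services, lambda cmd: fake_outputs[cmd]))
```

Because the check is a plain substring match, the indicator should be specific: an indicator of `active` would also match the output `inactive`.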
State Management
The monitor node updates the agent state with critical information:
Success State
When all services are healthy:
current_step: "monitor"
current_error: None
affected_service: None
Failure State
When a service failure is detected:
current_step: "monitor"
current_error: Error description
affected_service: Failed service name
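The two states above could be produced from a service snapshot roughly as shown below. The function name, the snapshot shape (service name mapped to a boolean), the choice to report the first failing service, and the wording of the error description are all assumptions; only the three state keys and their success values come from this page.

```python
from typing import Any, Dict


def build_state_update(snapshot: Dict[str, bool]) -> Dict[str, Any]:
    """Translate a health snapshot into the agent state fields."""
    # Pick the first unhealthy service, if any (dicts preserve insertion order).
    failed = next((name for name, healthy in snapshot.items() if not healthy), None)
    if failed is None:
        return {
            "current_step": "monitor",
            "current_error": None,
            "affected_service": None,
        }
    return {
        "current_step": "monitor",
        # Hypothetical wording for the error description.
        "current_error": f"Service '{failed}' is not running",
        "affected_service": failed,
    }
```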
Service Configuration
Each monitored service requires configuration with these properties:
The running_indicator string is used to determine whether the service is active: the service is considered running if the string appears in the command output.
Error Handling
The monitor node implements robust error handling for SSH connection failures.
Event Logging
The monitor node emits structured events for observability:
status_update
Emitted with a complete service snapshot for UI updates.
monitor
Logs operational status messages and failure detection.
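As a rough illustration of the two event types, the sketch below builds JSON-serializable events and hands them to a caller-supplied sink. The event names `status_update` and `monitor` come from this page; the envelope fields (`type`, `timestamp`, `payload`), the payload keys, the message wording, and the helper functions are hypothetical.

```python
import json
import time
from typing import Any, Callable, Dict


def make_event(event_type: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a payload in a simple structured envelope (hypothetical shape)."""
    return {"type": event_type, "timestamp": time.time(), "payload": payload}


def emit_monitor_events(
    snapshot: Dict[str, bool],
    sink: Callable[[Dict[str, Any]], None],
) -> None:
    """Emit one status_update event and one monitor log event."""
    # status_update carries the complete snapshot so a UI can redraw from it.
    sink(make_event("status_update", {"services": dict(snapshot)}))

    failed = sorted(name for name, healthy in snapshot.items() if not healthy)
    if failed:
        message = "Failure detected: " + ", ".join(failed)
    else:
        message = "All services healthy"
    sink(make_event("monitor", {"message": message}))


if __name__ == "__main__":
    # Print each event as one JSON line.
    emit_monitor_events(
        {"nginx": True, "redis": False},
        lambda event: print(json.dumps(event)),
    )
```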
Implementation Location
Source: src/agent/nodes/monitor.py:18