Services Configuration

Sentinel AI monitors and manages services defined in services.json. The configuration supports multiple service types including web servers, databases, and system services.

Overview

Services are configured in data/services.json with check commands and running indicators. Sentinel AI periodically executes these commands to verify service health.

Default Monitor Interval: 30 seconds (configured via MONITOR_INTERVAL in config.py)

Service Configuration File

The services file is located at data/services.json and is automatically loaded on startup:

import json
import os

DATA_DIR = "data"
SERVICES_FILE = os.path.join(DATA_DIR, "services.json")

def load_services(self):
    # Fall back to the built-in defaults when the file is missing or cannot be parsed.
    if os.path.exists(self.SERVICES_FILE):
        try:
            with open(self.SERVICES_FILE, "r") as f:
                return json.load(f)
        except Exception:
            return self.DEFAULT_SERVICES.copy()
    return self.DEFAULT_SERVICES.copy()

Service Schema

Each service requires three properties:

check_command

string

required

Shell command to check service status. Executed via SSH on the target server.

Examples:

service nginx status
systemctl status postgresql
docker ps | grep my-container
/usr/local/bin/check_custom_service.sh

running_indicator

string

required

String pattern to search for in command output that indicates the service is running.

Examples:

is running
active (running)
online
Status: healthy

The check is case-sensitive and uses substring matching. If this string is found in the command output, the service is considered healthy.
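The matching rule described above can be sketched in a few lines. This is a minimal illustration, not Sentinel AI's actual code: the function name is_healthy is an assumption, and the sample outputs are made up.

```python
def is_healthy(command_output: str, running_indicator: str) -> bool:
    """Return True when the indicator occurs verbatim in the command output."""
    # Plain substring test: case-sensitive, no regular expressions.
    return running_indicator in command_output


print(is_healthy("* nginx is running", "is running"))      # True
print(is_healthy("* nginx Is Running", "is running"))      # False: case differs
print(is_healthy("* nginx is not running", "is running"))  # False
```

Because the comparison is verbatim, the indicator has to be copied exactly as it appears in the command output, including capitalization.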

type

string

required

Service category for organization and reporting.

Common types:

web_server - Nginx, Apache, Caddy
database - PostgreSQL, MySQL, MongoDB
system - SSH, cron, systemd services
application - Custom applications
container - Docker containers
cache - Redis, Memcached
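Since all three properties are required strings, an entry can be checked before it is written to services.json. The validator below is a sketch under that assumption; validate_service and REQUIRED_KEYS are hypothetical names, not part of Sentinel AI.

```python
REQUIRED_KEYS = ("check_command", "running_indicator", "type")


def validate_service(name, entry):
    """Return a list of problems for one service entry; empty means it looks valid."""
    if not isinstance(entry, dict):
        return [f"{name}: entry must be an object"]
    problems = []
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        # Each required property must be present and be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name}: '{key}' must be a non-empty string")
    return problems


print(validate_service("redis", {"check_command": "service redis-server status"}))
```

An entry that defines only check_command is reported as missing running_indicator and type.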

Default Services

Sentinel AI includes three default services defined in src/core/config.py:

DEFAULT_SERVICES = {
    "nginx": {
        "check_command": "service nginx status",
        "running_indicator": "is running",
        "type": "web_server"
    },
    "postgresql": {
        "check_command": "service postgresql status",
        "running_indicator": "online",
        "type": "database"
    },
    "ssh": {
        "check_command": "service ssh status",
        "running_indicator": "is running",
        "type": "system"
    }
}

These defaults are used if data/services.json doesn’t exist or fails to load. They serve as a template for adding your own services.

Managing Services

Adding Services

Use the add_service() method to add new services:

from src.core.config import config

# Add a Redis service
config.add_service(
    name="redis",
    check_cmd="service redis-server status",
    indicator="is running",
    service_type="cache"
)

# Add a Docker container
config.add_service(
    name="api-container",
    check_cmd="docker inspect -f '{{.State.Status}}' api-container",
    indicator="running",
    service_type="container"
)

# Add a custom application
config.add_service(
    name="myapp",
    check_cmd="systemctl status myapp.service",
    indicator="active (running)",
    service_type="application"
)

The add_service() method automatically saves changes to data/services.json.
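To make the save-on-add behavior concrete, here is a rough sketch of how such a method could work. SketchConfig is a hypothetical stand-in written for this example; only the parameter names and the JSON keys are taken from this page, and the real implementation may differ.

```python
import json
import os
import tempfile


class SketchConfig:
    """Hypothetical stand-in for Sentinel AI's config object (an assumption)."""

    def __init__(self, services_file):
        self.SERVICES_FILE = services_file
        self.SERVICES = {}

    def save_services(self):
        # Persist the whole dictionary; create the parent directory if needed.
        parent = os.path.dirname(self.SERVICES_FILE)
        if parent:
            os.makedirs(parent, exist_ok=True)
        with open(self.SERVICES_FILE, "w") as f:
            json.dump(self.SERVICES, f, indent=2)

    def add_service(self, name, check_cmd, indicator, service_type):
        # Parameter names mirror the calls shown above; keys mirror the schema.
        self.SERVICES[name] = {
            "check_command": check_cmd,
            "running_indicator": indicator,
            "type": service_type,
        }
        self.save_services()


with tempfile.TemporaryDirectory() as tmp:
    cfg = SketchConfig(os.path.join(tmp, "data", "services.json"))
    cfg.add_service("redis", "service redis-server status", "is running", "cache")
    with open(cfg.SERVICES_FILE) as f:
        on_disk = json.load(f)
    print(on_disk["redis"]["type"])  # cache
```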

Removing Services

Remove services using the remove_service() method:

from src.core.config import config

# Remove a service
config.remove_service("redis")

# Verify removal
print("redis" in config.SERVICES)  # False

Updating Services

To update a service, remove and re-add it:

from src.core.config import config

# Update nginx configuration
config.remove_service("nginx")
config.add_service(
    name="nginx",
    check_cmd="systemctl status nginx",  # Changed to systemctl
    indicator="active (running)",        # Updated indicator
    service_type="web_server"
)

Service Examples

Web Servers

{
  "nginx": {
    "check_command": "service nginx status",
    "running_indicator": "is running",
    "type": "web_server"
  }
}

Databases

{
  "postgresql": {
    "check_command": "service postgresql status",
    "running_indicator": "online",
    "type": "database"
  }
}

Docker Containers

{
  "api-container": {
    "check_command": "docker inspect -f '{{.State.Status}}' api-container",
    "running_indicator": "running",
    "type": "container"
  }
}

Custom Applications

{
  "myapp": {
    "check_command": "systemctl status myapp.service",
    "running_indicator": "active (running)",
    "type": "application"
  }
}

Service Types

Organize services by type for better reporting and management:

Web Servers

Nginx, Apache, Caddy, Traefik

Databases

PostgreSQL, MySQL, MongoDB, Redis

System Services

SSH, cron, systemd, networking

Containers

Docker containers and services

Applications

Custom applications and APIs

Cache Services

Redis, Memcached, Varnish
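Grouping by type is straightforward once the services dictionary is loaded. The helper below is an illustrative sketch; group_by_type is a hypothetical name, and the sample data copies the three default services.

```python
from collections import defaultdict


def group_by_type(services):
    """Map each service type to the sorted names of services of that type."""
    groups = defaultdict(list)
    for name, entry in services.items():
        # Entries without a type are collected under "unknown".
        groups[entry.get("type", "unknown")].append(name)
    return {service_type: sorted(names) for service_type, names in sorted(groups.items())}


defaults = {
    "nginx": {"check_command": "service nginx status", "running_indicator": "is running", "type": "web_server"},
    "postgresql": {"check_command": "service postgresql status", "running_indicator": "online", "type": "database"},
    "ssh": {"check_command": "service ssh status", "running_indicator": "is running", "type": "system"},
}
print(group_by_type(defaults))
```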

Health Check Patterns

Service Command

Traditional service status check:

service nginx status
service postgresql status
service redis-server status

Look for: is running, running, active

Systemctl Command

Systemd service manager:

systemctl status nginx
systemctl is-active postgresql
systemctl show -p SubState nginx

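Look for: active (running), active, SubState=running. Be careful with the bare indicator active: it is also a substring of inactive, which systemctl is-active prints for a stopped unit, so substring matching would report a stopped service as healthy.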

Docker Commands

Container status checks:

docker ps | grep container-name
docker inspect -f '{{.State.Status}}' container-name
docker inspect -f '{{.State.Health.Status}}' container-name

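Look for: Up, running, healthy. Note that healthy is also a substring of unhealthy. One way to disambiguate is to prefix the template output, for example -f 'health={{.State.Health.Status}}' with the indicator health=healthy.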

Custom Scripts

Use custom health check scripts:

/usr/local/bin/check_app_health.sh
/opt/monitoring/service_status.py
curl -sf http://localhost:8080/health || exit 1

Look for: Custom output patterns

Best Practices

Use Specific Indicators

Choose running indicators that cannot appear in failure output. Matching is by substring, so active also matches inactive; use active (running) instead of just active.
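A quick way to vet an indicator is to test it against sample failure output before relying on it. The helper below is a sketch; false_matches is a hypothetical name and the sample strings are illustrative, not captured from a real host.

```python
def false_matches(indicator, failure_outputs):
    """Return the failure outputs that the indicator would wrongly match."""
    return [output for output in failure_outputs if indicator in output]


failure_samples = ["inactive", "Active: inactive (dead)", "unhealthy"]

print(false_matches("active", failure_samples))            # ['inactive', 'Active: inactive (dead)']
print(false_matches("healthy", failure_samples))           # ['unhealthy']
print(false_matches("active (running)", failure_samples))  # []
```

An empty result means the indicator does not collide with any of the sampled failure messages.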

Test Commands Manually

Test each check command via SSH before adding to services.json to ensure it works correctly.

Keep Commands Fast

Avoid slow commands that could delay monitoring. Aim for sub-second execution time.

Handle Edge Cases

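Account for services that are stopped intentionally, for example during maintenance. A failed check triggers the recovery workflow, so exclude or remove such entries before planned downtime. Use service types to group related checks.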

Document Dependencies

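Note service dependencies in separate docs rather than in services.json itself. The loader shown earlier parses the file with json.load, which rejects comments, so a commented (JSONC) file would fail to parse and the defaults would be used instead.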

Version Control

Keep services.json in version control to track configuration changes over time.

Advanced Configuration

Multi-Host Services

Monitor the same service across multiple hosts:

# Define services per host
HOST_SERVICES = {
    "prod-server-1": {
        "nginx": {...},
        "postgresql": {...}
    },
    "prod-server-2": {
        "nginx": {...},
        "api-app": {...}
    }
}
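A per-host mapping like this can be flattened into a list of checks to run. The sketch below assumes each elided entry follows the service schema described earlier; iter_host_checks is a hypothetical helper, and the sample entries simply reuse the nginx default.

```python
def iter_host_checks(host_services):
    """Yield (host, service, check_command, running_indicator) for every check."""
    for host, services in host_services.items():
        for name, entry in services.items():
            yield host, name, entry["check_command"], entry["running_indicator"]


# Illustrative data only: host names follow the example above, and both
# entries reuse the nginx default service.
NGINX = {
    "check_command": "service nginx status",
    "running_indicator": "is running",
    "type": "web_server",
}
example_hosts = {
    "prod-server-1": {"nginx": NGINX},
    "prod-server-2": {"nginx": NGINX},
}

for host, name, command, indicator in iter_host_checks(example_hosts):
    print(f"{host}: run {command!r}, expect {indicator!r}")
```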

Service Dependencies

Track service dependencies for intelligent recovery:

# Map each service to the services it depends on.
SERVICE_DEPENDENCIES = {
    "api-app": ["postgresql", "redis"],
    "nginx": ["api-app"],
}

# Start dependencies first during recovery.
# ensure_service_running() and start_service() are placeholders for your own helpers.
def recover_service(service_name):
    deps = SERVICE_DEPENDENCIES.get(service_name, [])
    for dep in deps:
        ensure_service_running(dep)
    start_service(service_name)
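The snippet above only walks direct dependencies. If dependencies can nest, one option is to compute the full start order first. The sketch below does this with a depth-first traversal and rejects cycles; recovery_order is a hypothetical helper, not part of Sentinel AI.

```python
def recovery_order(service, dependencies):
    """Return the services to start, dependencies first, ending with `service`."""
    order = []
    visiting = set()  # services on the current traversal path
    done = set()      # services already placed in the order

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name!r}")
        visiting.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(service)
    return order


deps = {"api-app": ["postgresql", "redis"], "nginx": ["api-app"]}
print(recovery_order("nginx", deps))  # ['postgresql', 'redis', 'api-app', 'nginx']
```

Each service appears once even when several others depend on it, so nothing is restarted twice.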

Custom Health Checks

Implement application-specific health checks:

import requests

def check_api_health():
    """Custom health check for API service."""
    try:
        response = requests.get("http://localhost:8080/health", timeout=5)
        if response.status_code == 200:
            data = response.json()
            return data.get("status") == "healthy"
    except Exception:
        return False
    return False

# Register custom check
CUSTOM_CHECKS = {
    "api-service": check_api_health
}

Monitoring and Alerts

Sentinel AI monitors services at regular intervals (default: 30 seconds):

MONITOR_INTERVAL = 30  # seconds
MAX_RETRIES = 5

Monitoring Behavior

Status Check: Execute check_command via SSH
Pattern Match: Search output for running_indicator
Health Decision: Service is healthy if indicator is found
Recovery: If unhealthy, attempt automated remediation
Retry Logic: Up to MAX_RETRIES attempts with exponential backoff
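The retry step can be pictured as follows. This is a sketch under stated assumptions: the page does not give the base delay or growth factor, so base_delay=1.0 and factor=2.0 are illustrative, and both function names are hypothetical.

```python
import time


def backoff_delays(max_retries=5, base_delay=1.0, factor=2.0):
    """Delays in seconds before each retry: base, base*factor, base*factor^2, ..."""
    return [base_delay * factor ** attempt for attempt in range(max_retries)]


def run_with_retries(action, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `action` up to max_retries times, sleeping between failed attempts."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt, delay in enumerate(delays, start=1):
        if action():
            return True
        if attempt < max_retries:
            sleep(delay)  # no sleep after the final failed attempt
    return False


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Passing sleep as a parameter keeps the loop testable without real waiting.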

Failed health checks trigger the agent’s diagnostic and recovery workflow. Ensure your check commands are reliable to avoid false positives.

Troubleshooting

Service Always Shows as Down

Problem: Service health check always fails even when service is running.

Solutions:

Test the check command manually via SSH: ssh user@host 'service nginx status'
Verify the running_indicator string appears in the output
Check that the SSH user has permission to run the check command
Look for case sensitivity issues in the indicator string

Services Not Loading

Problem: Services from services.json are not being monitored.

Solutions:

Verify data/services.json exists and is valid JSON
Check file permissions: ls -la data/services.json
Review startup logs for JSON parsing errors
Test JSON validity: python -m json.tool data/services.json
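The same checks can be scripted. Because the loader shown earlier falls back to the defaults on any exception, a separate diagnostic that reports why the file failed can be useful. The function below is a sketch; diagnose_services_file is a hypothetical name.

```python
import json


def diagnose_services_file(path):
    """Return a short status string describing whether the file can be loaded."""
    try:
        with open(path, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        return "missing"
    except json.JSONDecodeError as exc:
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    except OSError as exc:
        # Covers permission problems and other I/O failures.
        return f"unreadable: {exc}"
    if not isinstance(data, dict):
        return "top-level value must be an object"
    return f"ok: {len(data)} service(s)"


print(diagnose_services_file("data/services.json"))
```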

Permission Denied During Checks

Problem: Health checks fail with “Permission denied” errors.

Solutions:

Configure sudoers to allow check commands without password
Add SSH user to appropriate groups (docker, systemd-journal)
See SSH Setup for details

Slow Health Checks

Problem: Monitoring interval is delayed due to slow check commands.

Solutions:

Use faster status check methods (e.g., systemctl is-active instead of systemctl status)
Avoid commands that require DNS lookups or network calls
Consider increasing MONITOR_INTERVAL if checks can’t be optimized
Use timeout flags: timeout 5 service nginx status
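To find out which checks are slow, a command can be timed with a hard limit before it goes into services.json. The sketch below runs the command locally with subprocess rather than over SSH, which is a simplification; timed_check is a hypothetical helper, and the demo command is a stand-in that prints a fixed string.

```python
import subprocess
import sys
import time


def timed_check(argv, indicator, timeout_seconds=5.0):
    """Run argv locally and return (healthy, elapsed_seconds)."""
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
        healthy = indicator in result.stdout
    except subprocess.TimeoutExpired:
        # A check that exceeds the limit is treated as unhealthy.
        healthy = False
    return healthy, time.monotonic() - start


# Stand-in command so the example runs anywhere Python is installed.
demo_argv = [sys.executable, "-c", "print('is running')"]
healthy, elapsed = timed_check(demo_argv, "is running")
print(healthy, f"{elapsed:.3f}s")
```

A check that regularly takes a large share of MONITOR_INTERVAL is a candidate for a cheaper command.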

API Reference

Configuration Methods

from src.core.config import config

# Load services from file
services = config.load_services()

# Save services to file
config.save_services()

# Add a new service
config.add_service(
    name="myservice",
    check_cmd="systemctl status myservice",
    indicator="active (running)",
    service_type="application"
)

# Remove a service
config.remove_service("myservice")

# Access services dictionary
all_services = config.SERVICES
if "nginx" in config.SERVICES:
    nginx_config = config.SERVICES["nginx"]
    print(f"Check: {nginx_config['check_command']}")

Next Steps

SSH Setup

Configure SSH authentication and permissions

AI Models

Customize AI model settings and behavior

Deployment

Deploy Sentinel AI with Docker

Monitoring

Learn about service monitoring features
