Log Management with journalctl

All PROPPR services log to the systemd journal, each under a dedicated syslog identifier.

View Service Logs

# Follow logs for a specific service
journalctl -u team-bot -f

# Follow multiple services
journalctl -u team-bot -u player-bot -u ev-bot -f

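# Follow all PROPPR services (the glob matches unit names, so this works
# only if the units are installed with a proppr- prefix)
journalctl -u 'proppr-*' -f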
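
The services named in this guide (team-bot, player-bot, and so on) do not share a common prefix, so a glob such as 'proppr-*' matches them only if the units are installed under prefixed names. A minimal sketch of an alternative, assuming a hypothetical helper called follow_units that expands a list of names into repeated -u flags:

```shell
# Follow several units at once by expanding names into repeated -u flags.
# follow_units is a hypothetical helper, not part of the PROPPR scripts.
follow_units() {
    # Rebuild the positional parameters as: -u name1 -u name2 ...
    for unit in "$@"; do
        shift
        set -- "$@" -u "$unit"
    done
    journalctl "$@" -f
}
```

Usage: follow_units team-bot player-bot ev-bot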

Log Output Formats

# Short format (default)
journalctl -u team-bot -o short

# With ISO timestamps
journalctl -u player-bot -o short-iso

# JSON format for parsing (one line per entry)
journalctl -u ev-bot -o json

# JSON format, pretty-printed across multiple lines
journalctl -u arb-bot -o json-pretty

# Verbose (all fields)
journalctl -u websocket-updater -o verbose

Export Logs

# Export to file
journalctl -u team-bot --since today > team-bot-today.log

# Export multiple services
journalctl -u team-bot -u player-bot --since "2 days ago" > bots.log

# Export as JSON
journalctl -u ev-bot -o json > ev-bot.json
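
When several services need separate files, the export can be wrapped in a loop. The following is a sketch only: export_logs, its argument order, and the dated file naming are assumptions rather than part of the PROPPR tooling.

```shell
# Write one dated log file per service into an output directory.
# Usage: export_logs <outdir> <since> <service>...
# export_logs is a hypothetical helper name.
export_logs() {
    outdir="$1"; since="$2"; shift 2
    mkdir -p "$outdir" || return 1
    stamp=$(date +%F)   # e.g. 2024-01-31
    for service in "$@"; do
        journalctl -u "$service" --since "$since" --no-pager \
            > "$outdir/$service-$stamp.log"
    done
}
```

Usage: export_logs ./exports "2 days ago" team-bot player-bot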

Service Health Checks

Check Service Status

# Check if service is running
systemctl is-active team-bot

# Check if service is enabled
systemctl is-enabled team-bot

# Full status with recent logs
systemctl status team-bot

Automated Health Check Script

Create a health check script at /opt/proppr/scripts/health_check.sh:
health_check.sh
#!/bin/bash
# PROPPR Health Check Script

SERVICES=(
    "team-bot"
    "player-bot"
    "ev-bot"
    "arb-bot"
    "horse-bot"
    "overtime-bot"
    "websocket-updater"
    "unified-poller"
    "team-grader"
    "player-grader"
)

echo "========================================"
echo "PROPPR Service Health Check"
echo "Time: $(date)"
echo "========================================"
echo ""

for service in "${SERVICES[@]}"; do
    if systemctl is-active --quiet "$service"; then
        status="RUNNING"
        # ActiveEnterTimestamp is the time the unit became active, not a duration
        since=$(systemctl show "$service" --property=ActiveEnterTimestamp | cut -d'=' -f2)
        restarts=$(systemctl show "$service" --property=NRestarts | cut -d'=' -f2)
        echo "[✓] $service: $status (Active since: $since, Restarts: $restarts)"
    else
        echo "[✗] $service: STOPPED"
    fi
done

echo ""
echo "MongoDB Status:"
systemctl is-active --quiet mongod && echo "[✓] MongoDB: RUNNING" || echo "[✗] MongoDB: STOPPED"

echo ""
echo "Disk Usage:"
df -h /opt/proppr | tail -n 1

echo ""
echo "Memory Usage:"
free -h | grep Mem
Run the health check:
chmod +x /opt/proppr/scripts/health_check.sh
/opt/proppr/scripts/health_check.sh
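
For use from cron or another monitor, it can help if the check reports failure through its exit status as well as its output. A minimal sketch, assuming a hypothetical function named check_services that is not part of health_check.sh as written:

```shell
# Print each inactive service and return non-zero if any were found.
# check_services is a hypothetical helper name.
check_services() {
    failed=0
    for service in "$@"; do
        if ! systemctl is-active --quiet "$service"; then
            echo "DOWN: $service"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

Usage: check_services team-bot player-bot ev-bot || echo "at least one service is down"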

Performance Monitoring

CPU and Memory Usage

# Show all PROPPR processes
ps aux | grep python | grep PROPPR

# Show with memory usage (the [P] bracket keeps grep from matching itself)
ps aux --sort=-%mem | grep '[P]ROPPR' | head -10

# Show with CPU usage
ps aux --sort=-%cpu | grep '[P]ROPPR' | head -10
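
The ps commands report per-process figures. If memory accounting is enabled for the units, systemd also tracks usage per service in the MemoryCurrent property. A sketch under that assumption, with bytes_to_mib and unit_memory_mib as hypothetical helper names:

```shell
# Convert a byte count to whole MiB. Prints "n/a" for non-numeric input
# such as "[not set]", and for the all-ones value that some systemd
# versions report when accounting is off.
bytes_to_mib() {
    case "$1" in
        ''|*[!0-9]*) echo "n/a" ;;
        18446744073709551615) echo "n/a" ;;
        *) echo "$(( $1 / 1048576 ))" ;;
    esac
}

# Print the memory charged to a unit's cgroup, in MiB.
unit_memory_mib() {
    bytes_to_mib "$(systemctl show "$1" --property=MemoryCurrent --value)"
}
```

Usage: unit_memory_mib team-bot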

MongoDB Monitoring

# Check MongoDB logs
sudo tail -f /var/log/mongodb/mongod.log

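# MongoDB stats (on MongoDB 6.0+, where the legacy mongo shell was removed,
# use mongosh in place of mongo here and below)
mongo proppr --eval "db.stats()"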

# Check collection sizes
mongo proppr --eval "db.getCollectionNames().forEach(function(c) { 
    var stats = db.getCollection(c).stats(); 
    print(c + ': ' + stats.count + ' documents, ' + (stats.size/1024/1024).toFixed(2) + ' MB'); 
})"

Network Monitoring

# Monitor active connections
netstat -tupn | grep python

# Monitor websocket connections
netstat -tupn | grep :443 | grep ESTABLISHED

# Monitor API requests (requires tcpdump)
sudo tcpdump -i any -n host api.odds-api.io

Alert Detection

Monitor for Errors

Create a script at /opt/proppr/scripts/error_monitor.sh (the path the cron entries expect) to watch for errors and send alerts:
error_monitor.sh
#!/bin/bash
# Monitor PROPPR services for errors

SERVICE="$1"
ERROR_THRESHOLD=10
CHECK_MINUTES=5

if [ -z "$SERVICE" ]; then
    echo "Usage: $0 <service-name>"
    exit 1
fi

ERROR_COUNT=$(journalctl -u "$SERVICE" --since "${CHECK_MINUTES} minutes ago" | grep -ci error)

if [ "$ERROR_COUNT" -gt "$ERROR_THRESHOLD" ]; then
    echo "WARNING: $SERVICE has $ERROR_COUNT errors in last $CHECK_MINUTES minutes"
    echo "Recent errors:"
    journalctl -u "$SERVICE" --since "${CHECK_MINUTES} minutes ago" | grep -i error | tail -5
    
    # Send alert (integrate with your notification system)
    # curl -X POST "https://api.telegram.org/bot<token>/sendMessage" \
    #      -d "chat_id=<chat_id>&text=PROPPR Alert: $SERVICE has $ERROR_COUNT errors"
fi
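
The counting and threshold logic can be separated from the journalctl call, which makes it easier to try against sample input. This is a sketch, not a replacement for the script; count_errors and over_threshold are hypothetical names.

```shell
# Count stdin lines that mention "error", case-insensitively.
# grep -c prints 0 and exits 1 when nothing matches, so the status is
# normalised to success.
count_errors() {
    grep -ci 'error' || true
}

# Succeed when the count (arg 1) is strictly greater than the threshold (arg 2).
over_threshold() {
    [ "$1" -gt "$2" ]
}
```

Usage: n=$(journalctl -u team-bot --since "5 minutes ago" | count_errors); over_threshold "$n" 10 && echo "alert: $n errors"

Matching on the word "error" is a text heuristic. journalctl also accepts -p err to filter by syslog priority, but that helps only if the services log with priorities; plain stdout lines are recorded at info level.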

Set Up Cron Jobs for Monitoring

Add to /etc/crontab:
# Run health check every 5 minutes
*/5 * * * * root /opt/proppr/scripts/health_check.sh >> /var/log/proppr-health.log 2>&1

# Monitor for errors every 10 minutes
*/10 * * * * root /opt/proppr/scripts/error_monitor.sh team-bot
*/10 * * * * root /opt/proppr/scripts/error_monitor.sh player-bot
*/10 * * * * root /opt/proppr/scripts/error_monitor.sh websocket-updater

# Daily journal cleanup (remove archived journal files older than 30 days)
0 0 * * * root journalctl --vacuum-time=30d
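
As the list of monitored services grows, one crontab line per service becomes repetitive. A sketch of a small wrapper, assuming a hypothetical function run_for_each (for example, saved in a script of your own) that calls the monitor once per service:

```shell
# Run a monitor command once per service. Continues past failures and
# returns non-zero if any invocation failed.
# run_for_each is a hypothetical helper name.
run_for_each() {
    monitor="$1"; shift
    rc=0
    for service in "$@"; do
        "$monitor" "$service" || rc=1
    done
    return "$rc"
}
```

Usage: run_for_each /opt/proppr/scripts/error_monitor.sh team-bot player-bot websocket-updater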

Manual Pipeline Execution

For services that need manual triggering (like StatsUpdateFM):
# Run stats update pipeline
ssh [email protected] "cd /opt/proppr/StatsUpdateFM && \
    export PYTHONPATH=/opt && \
    python3.11 runners/run_stats_pipeline.py"
Or use the deployment script:
./scripts/deploy/run_stats_update_now_fm.sh

Log Retention

Configure Journal Size

Edit /etc/systemd/journald.conf:
/etc/systemd/journald.conf
[Journal]
SystemMaxUse=2G
SystemKeepFree=500M
SystemMaxFileSize=100M
MaxRetentionSec=30day
Restart journald:
sudo systemctl restart systemd-journald
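
Instead of editing the main file, the same settings can live in a drop-in under /etc/systemd/journald.conf.d/, which journald reads in addition to journald.conf. A sketch, where the function name write_journald_dropin and the file name proppr-retention.conf are assumptions:

```shell
# Write the retention settings to a drop-in file in the given directory
# (default: the standard journald drop-in location).
# The function and file names are hypothetical.
write_journald_dropin() {
    dir="${1:-/etc/systemd/journald.conf.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/proppr-retention.conf" <<'EOF'
[Journal]
SystemMaxUse=2G
SystemKeepFree=500M
SystemMaxFileSize=100M
MaxRetentionSec=30day
EOF
}
```

Run it as root, then restart systemd-journald as shown above for the settings to take effect.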

Manual Log Cleanup

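# Remove archived journal files whose entries are all older than 30 days
sudo journalctl --vacuum-time=30d

# Remove the oldest archived journal files until total usage is below 1GB
sudo journalctl --vacuum-size=1G

# Keep only the 100 most recent archived journal files (files, not entries)
sudo journalctl --vacuum-files=100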
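
To decide whether a manual cleanup is worth running, the on-disk size can be compared against a limit first. A sketch, assuming persistent journals under /var/log/journal (volatile setups use /run/log/journal instead); journal_size_mib and journal_over_limit are hypothetical names.

```shell
# Print the size of a journal directory in whole MiB
# (default: the persistent journal location).
journal_size_mib() {
    du -sm "${1:-/var/log/journal}" | cut -f1
}

# Succeed when the directory (arg 1) is larger than the limit in MiB (arg 2).
journal_over_limit() {
    [ "$(journal_size_mib "$1")" -gt "$2" ]
}
```

Usage: journal_over_limit /var/log/journal 1024 && sudo journalctl --vacuum-size=1G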

Troubleshooting Common Issues

If logs are not appearing in the journal, verify the service file has:
StandardOutput=journal
StandardError=journal
SyslogIdentifier=proppr-<service>
Reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart team-bot
Check if journald is running:
systemctl status systemd-journald
Check journal disk usage:
journalctl --disk-usage

If memory usage is high, check which service is consuming memory:
ps aux --sort=-%mem | grep '[P]ROPPR' | head -10
Restart the service to clear memory:
sudo systemctl restart <service-name>

If a service keeps restarting, check for crashes in logs:
journalctl -u team-bot | grep -i "restart\|crash\|killed"
Check system resources:
free -h
df -h
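
The logging directives in a unit file can be checked in one pass. A sketch, assuming a hypothetical helper check_unit_logging and a unit file path such as /etc/systemd/system/team-bot.service (the actual location may differ):

```shell
# Report which logging directives are missing from a unit file.
# Returns non-zero if any directive is absent.
# check_unit_logging is a hypothetical helper name.
check_unit_logging() {
    unit_file="$1"
    missing=0
    for key in StandardOutput=journal StandardError=journal SyslogIdentifier=; do
        if ! grep -q "^${key}" "$unit_file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Usage: check_unit_logging /etc/systemd/system/team-bot.service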

Remote Monitoring

SSH Monitoring Commands

# Check all services remotely
ssh [email protected] "systemctl status 'team-bot' 'player-bot' 'ev-bot'"

# View logs remotely
ssh [email protected] "journalctl -u team-bot -n 100"

# Run health check remotely
ssh [email protected] "/opt/proppr/scripts/health_check.sh"

Create Monitoring Aliases

Add to ~/.bashrc or ~/.zshrc:
alias proppr-status='ssh [email protected] "systemctl status team-bot player-bot ev-bot arb-bot"'
alias proppr-logs='ssh [email protected] "journalctl -u team-bot -f"'
alias proppr-health='ssh [email protected] "/opt/proppr/scripts/health_check.sh"'
alias proppr-restart='ssh [email protected] "systemctl restart team-bot player-bot ev-bot arb-bot"'
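
Aliases cannot take a service name as an argument, so a shell function may be more flexible. A sketch, assuming a hypothetical PROPPR_HOST variable that holds the SSH target (user@host) and a hypothetical function name proppr_logs:

```shell
# PROPPR_HOST is a hypothetical variable holding the SSH target (user@host).
# Tail the journal of one remote service; defaults to team-bot.
proppr_logs() {
    ssh "$PROPPR_HOST" "journalctl -u '${1:-team-bot}' -f"
}
```

Usage: proppr_logs ev-bot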

Next Steps

Service Configuration

Review systemd service setup

Production Setup

Review production deployment guide
