
Overview

Effective monitoring ensures your Ubu-Block nodes operate reliably and helps detect issues before they impact operations. This guide covers monitoring strategies for different aspects of node operations.

Blockchain Validation

Manual Validation

Regularly validate blockchain integrity:
cargo run -- validate
Expected output for valid blockchain:
INFO ubu_block] Blockchain is valid!
Expected output for corrupted blockchain:
thread 'main' panicked at 'Could not verify block, found 0e70cebe..., block: Block {...}'

Automated Validation

Schedule regular validation checks using cron:
# Edit crontab
crontab -e

# Add validation every hour
0 * * * * cd /opt/ubu-block && /usr/local/bin/ubu-block validate >> /var/log/ubu-block/validation.log 2>&1
Validation time increases with blockchain size. For large chains, consider running validation less frequently (e.g., daily).
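The cron entry above appends raw output to the log. The following is a minimal sketch of a wrapper that records a timestamped pass/fail line instead. It assumes `validate` exits non-zero when verification fails (a Rust panic on the main thread exits with status 101); the function name `run_validation` is hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run a validation command and log a timestamped result.
# Assumes the command exits non-zero on failure (e.g. 101 after a Rust panic).
run_validation() {
    local output status
    output=$("$@" 2>&1)
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "[$(date '+%F %T')] validation OK"
    else
        echo "[$(date '+%F %T')] validation FAILED (exit $status)"
        # Keep the last line of output (usually the panic message) for context
        echo "$output" | tail -n 1
    fi
    return "$status"
}

# Example, using the paths from the cron entry above:
# cd /opt/ubu-block && run_validation /usr/local/bin/ubu-block validate
```

Because the wrapper returns the original exit status, it can be chained with an alert command using `||`.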

Node Health Monitoring

Process Monitoring

Monitor that node processes are running:
# Check if submission node is running
pgrep -f submission || echo "Submission node is not running!"

# Check process details (the [s] bracket keeps grep from matching itself)
ps aux | grep '[s]ubmission'

Service Status (systemd)

If using systemd services:
# Check service status
sudo systemctl status ubu-submission

# Check if service is active
systemctl is-active ubu-submission

# Check service uptime
systemctl show ubu-submission --property=ActiveEnterTimestamp

Port Monitoring

Verify that nodes are listening on expected ports:
# Check P2P port (9090)
netstat -tuln | grep 9090

# Check HTTP API port (9091)
netstat -tuln | grep 9091

# Alternative using ss
ss -tuln | grep -E '9090|9091'
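A plain `grep 9090` also matches unrelated numbers such as port 19090. As a rough sketch, the helper below checks the local port column exactly. It assumes the column layout printed by `ss -tuln` (local address in the fifth field); `port_listening` is a hypothetical name.

```shell
#!/bin/bash
# Return 0 if port $1 appears as a local port in `ss -tuln`-style output on stdin.
# Assumes the fifth column is "Local Address:Port", as printed by `ss -tuln`.
port_listening() {
    awk -v port="$1" '
        NR > 1 {
            # The port is whatever follows the last colon (works for [::]:9090 too)
            n = split($5, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example:
# ss -tuln | port_listening 9090 || echo "P2P port 9090 is not listening!"
# ss -tuln | port_listening 9091 || echo "HTTP API port 9091 is not listening!"
```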

API Health Checks

HTTP Endpoint Monitoring

Submission nodes expose HTTP APIs that can be monitored:
# Basic health check
curl -f http://localhost:9091/api/v1/health || echo "API is down!"

# Check with timeout
curl --max-time 5 http://localhost:9091/api/v1/health

# Get response time
curl -o /dev/null -s -w "Response time: %{time_total}s\n" http://localhost:9091/api/v1/health

Automated Health Checks

Create a monitoring script:
/usr/local/bin/check-ubu-health.sh
#!/bin/bash

API_URL="http://localhost:9091/api/v1/health"
MAX_RESPONSE_TIME=5

response=$(curl --max-time "$MAX_RESPONSE_TIME" -s -o /dev/null -w "%{http_code}" "$API_URL")

if [ "$response" = "200" ]; then
    echo "[$(date)] API health check: OK"
    exit 0
else
    echo "[$(date)] API health check: FAILED (HTTP $response)"
    # Send alert (email, Slack, etc.)
    exit 1
fi
Run via cron:
*/5 * * * * /usr/local/bin/check-ubu-health.sh >> /var/log/ubu-block/health.log 2>&1
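A single failed request does not always mean the API is down. One possible refinement, sketched below, is to retry the check a few times before alerting. The `retry` helper and the attempt count and delay in the example are assumptions, not part of Ubu-Block.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 as soon as one attempt succeeds.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Do not sleep after the final attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example: alert only after three consecutive failures, 10 seconds apart
# retry 3 10 /usr/local/bin/check-ubu-health.sh \
#     || echo "API health check failed 3 times in a row"
```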

Database Monitoring

Database Size

Track database growth over time:
# Check current database sizes
du -h data/blockchain.db data/private.db

# Monitor growth
watch -n 60 'du -h data/*.db'
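`watch` only shows the current size. To track growth over time, the sizes can be appended to a log on a schedule. This is a minimal sketch; the function name `record_db_sizes` and the CSV layout are assumptions.

```shell
#!/bin/bash
# Append one "epoch_seconds,bytes,path" line per existing file to a CSV log.
# $1 is the log file; the remaining arguments are database files.
record_db_sizes() {
    local log_file="$1" db
    shift
    for db in "$@"; do
        # Skip files that do not exist (e.g. a node without a private database)
        [ -f "$db" ] || continue
        printf '%s,%s,%s\n' "$(date +%s)" "$(wc -c < "$db" | tr -d ' ')" "$db" >> "$log_file"
    done
}

# Example (could be run hourly from cron):
# record_db_sizes /var/log/ubu-block/db-size.csv data/blockchain.db data/private.db
```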

Database Integrity

Verify SQLite database integrity:
# Check blockchain database
sqlite3 data/blockchain.db "PRAGMA integrity_check;"

# Check private database
sqlite3 data/private.db "PRAGMA integrity_check;"
Expected output:
ok

Database Metrics

Query useful metrics from the blockchain:
# Count total blocks
sqlite3 data/blockchain.db "SELECT COUNT(*) FROM blocks;"

# Count total results
sqlite3 data/blockchain.db "SELECT COUNT(*) FROM results;"

# Check latest block timestamp
sqlite3 data/blockchain.db "SELECT MAX(timestamp) FROM blocks;"

Resource Monitoring

CPU and Memory Usage

Monitor resource consumption:
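# Real-time monitoring (-d, joins multiple PIDs with commas, as top expects)
top -p "$(pgrep -d, -f submission)"

# Memory usage (%MEM and resident set size)
ps aux | grep '[s]ubmission' | awk '{print $4"%", $6/1024"MB"}'

# CPU usage, sampled every 5 seconds
pidstat -p "$(pgrep -d, -f submission)" 5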

Disk Space

Ensure adequate disk space:
# Check available disk space
df -h /opt/ubu-block/data

# Warn when usage exceeds 80% (-P keeps each filesystem on a single line)
if [ "$(df -P /opt/ubu-block/data | awk 'NR==2 {print $5}' | tr -d '%')" -gt 80 ]; then
    echo "Warning: Disk usage above 80%"
fi

Network Monitoring

Monitor network connections and bandwidth:
# Active connections
netstat -an | grep :9090 | wc -l

# Connection details
ss -tn | grep :9090

# Network bandwidth (requires iftop)
sudo iftop -i eth0 -f "port 9090 or port 9091"

Log Monitoring

Application Logs

Monitor application logs for errors and warnings:
# Follow systemd logs
sudo journalctl -u ubu-submission -f

# Filter for errors
sudo journalctl -u ubu-submission -p err -f

# Show logs from last hour
sudo journalctl -u ubu-submission --since "1 hour ago"

Log Analysis

Analyze logs for patterns:
# Count error occurrences
sudo journalctl -u ubu-submission --since today | grep -c ERROR

# Find panic messages
sudo journalctl -u ubu-submission | grep -i panic

# Extract connection errors
sudo journalctl -u ubu-submission | grep "connection" | grep -i error
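For a quick overview, the log levels can be tallied in one pass. The sketch below assumes each log line carries an `ERROR`, `WARN`, or `INFO` token (optionally prefixed with `[`), as in the validation output shown earlier, and counts the first such token on each line. `count_log_levels` is a hypothetical name.

```shell
#!/bin/bash
# Count log lines by level, reading log text from stdin.
# The first ERROR/WARN/INFO token on a line (with or without a leading "[")
# decides how that line is counted.
count_log_levels() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                t = $i
                sub(/^\[/, "", t)
                if (t == "ERROR" || t == "WARN" || t == "INFO") { c[t]++; break }
            }
        }
        END { printf "ERROR=%d WARN=%d INFO=%d\n", c["ERROR"], c["WARN"], c["INFO"] }
    '
}

# Example:
# sudo journalctl -u ubu-submission --since today | count_log_levels
```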

Log Rotation

Ensure logs don’t fill disk space:
# Check systemd journal size
sudo journalctl --disk-usage

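# Shrink the journal to 500M now (one-time; set SystemMaxUse= in /etc/systemd/journald.conf for a lasting cap)
sudo journalctl --vacuum-size=500M

# Delete journal entries older than 30 days (also one-time)
sudo journalctl --vacuum-time=30d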

Peer Connection Monitoring

Active Peer Connections

Monitor P2P connections:
# List active P2P connections
ss -tn sport = :9090 or dport = :9090

# Count active peers
ss -tn sport = :9090 or dport = :9090 | grep -c ESTAB

Peer Synchronization

Check that nodes stay synchronized:
# Compare block counts between nodes
echo "Node 1:"
sqlite3 data/blockchain.db "SELECT COUNT(*) FROM blocks;"

echo "Node 2:"
sqlite3 data/blockchain2.db "SELECT COUNT(*) FROM blocks;"
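A persistent block count mismatch indicates a synchronization problem; a brief difference can occur while a new block propagates. Investigate immediately if the counts do not converge.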
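The comparison can be scripted so that it is usable from cron. This is a minimal sketch; `counts_in_sync` and the optional tolerance argument are assumptions, and the database paths are the ones used above.

```shell
#!/bin/bash
# Return 0 if two block counts differ by at most $3 (default 0).
counts_in_sync() {
    local a="$1" b="$2" tolerance="${3:-0}" diff
    diff=$((a - b))
    # Take the absolute value of the difference
    [ "$diff" -lt 0 ] && diff=$((-diff))
    [ "$diff" -le "$tolerance" ]
}

# Example:
# n1=$(sqlite3 data/blockchain.db "SELECT COUNT(*) FROM blocks;")
# n2=$(sqlite3 data/blockchain2.db "SELECT COUNT(*) FROM blocks;")
# counts_in_sync "$n1" "$n2" || echo "Nodes out of sync: $n1 vs $n2 blocks"
```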

Alerting

Email Alerts

Send email alerts on critical events:
/usr/local/bin/alert-email.sh
#!/bin/bash

SUBJECT="$1"
MESSAGE="$2"
RECIPIENT="admin@example.com"  # placeholder; replace with your alert recipient

echo "$MESSAGE" | mail -s "[Ubu-Block Alert] $SUBJECT" "$RECIPIENT"

Slack Notifications

Integrate with Slack for real-time alerts:
/usr/local/bin/alert-slack.sh
#!/bin/bash

WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
MESSAGE="$1"

curl -X POST -H 'Content-type: application/json' \
  --data "{\"text\":\"⚠️ Ubu-Block Alert: $MESSAGE\"}" \
  "$WEBHOOK_URL"
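The script above places `$MESSAGE` directly inside a JSON string, so a message containing a double quote or a backslash produces an invalid payload. A minimal sketch of an escaping helper follows. `json_escape` is a hypothetical name, and it handles only those two characters (not newlines or other control characters); a JSON-aware tool such as `jq` is more robust where available.

```shell
#!/bin/bash
# Escape backslashes and double quotes so text can sit inside a JSON string.
# Backslashes are escaped first so the quotes' own escapes are not doubled.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Example, replacing the --data line in the script above:
# curl -X POST -H 'Content-type: application/json' \
#   --data "{\"text\":\"Ubu-Block Alert: $(json_escape "$MESSAGE")\"}" \
#   "$WEBHOOK_URL"
```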

Monitoring Services Integration

Expose metrics for Prometheus:
# Example metrics endpoint (would need to be implemented)
curl http://localhost:9091/metrics
Add to prometheus.yml:
scrape_configs:
  - job_name: 'ubu-block'
    static_configs:
      - targets: ['localhost:9091']
Create dashboards to visualize:
  • Block production rate
  • API response times
  • Resource usage (CPU, memory, disk)
  • Peer connection count
  • Error rates
Use services like:
  • UptimeRobot
  • Pingdom
  • StatusCake
Configure to monitor:

Performance Metrics

Query Performance

Monitor query execution times:
# Time a query (-cmd runs ".timer on" before the SQL statement)
sqlite3 -cmd ".timer on" data/blockchain.db "SELECT COUNT(*) FROM results;"

Block Processing Time

Track time to add and validate blocks:
# Add a block and measure time (run cargo build first so compile time is excluded)
time cargo run -- insert --station 022113056303301 --candidate 1 --votes 66

# Validate and measure time
time cargo run -- validate

API Latency

Measure API endpoint latency:
# Measure response time
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:9091/api/v1/results
Create curl-format.txt:
time_namelookup: %{time_namelookup}s
time_connect: %{time_connect}s
time_starttransfer: %{time_starttransfer}s
time_total: %{time_total}s
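To turn a latency measurement into an alert, the measured time can be compared against a threshold. The sketch below uses awk for the comparison because shell arithmetic is integer-only. `exceeds_threshold` and the 0.5 second limit in the example are assumptions.

```shell
#!/bin/bash
# Return 0 if the measured time in seconds ($1) is greater than the threshold ($2).
exceeds_threshold() {
    awk -v t="$1" -v max="$2" 'BEGIN { exit (t + 0 > max + 0) ? 0 : 1 }'
}

# Example:
# t=$(curl -o /dev/null -s -w '%{time_total}' http://localhost:9091/api/v1/results)
# exceeds_threshold "$t" 0.5 && echo "Slow API response: ${t}s"
```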

Security Monitoring

Failed Authentication Attempts

Monitor logs for suspicious activity:
# Look for authentication failures
sudo journalctl -u ubu-submission | grep -i "auth" | grep -i "fail"

# Monitor unusual traffic patterns
sudo tcpdump -i eth0 -n port 9090 or port 9091

File Integrity Monitoring

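Detect unauthorized changes to critical files. A database checksum changes whenever the node writes to the file, so record and verify checksums while the node is stopped or against a backup copy: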
# Create checksums
sha256sum data/blockchain.db > blockchain.sha256
sha256sum data/private.db > private.sha256

# Verify later
sha256sum -c blockchain.sha256
sha256sum -c private.sha256

Monitoring Checklist

Create a monitoring checklist for regular reviews:
1. Daily checks

  • Verify node processes are running
  • Check API health endpoints
  • Review error logs
  • Verify peer connections

2. Weekly checks

  • Validate blockchain integrity
  • Review resource usage trends
  • Check disk space availability
  • Analyze performance metrics

3. Monthly checks

  • Full database integrity check
  • Review and rotate logs
  • Update monitoring scripts
  • Test alerting mechanisms

Next Steps

Maintenance

Perform routine maintenance tasks

Troubleshooting

Resolve common issues
