better-openclaw includes optional monitoring with Prometheus (metrics), Grafana (dashboards), and Loki (logs).

Enabling monitoring

Add monitoring services when generating your stack:
npx create-better-openclaw
# Select "Include monitoring stack" when prompted

Architecture

  • Prometheus: Scrapes metrics from services every 15 seconds
  • Grafana: Visualizes metrics and logs with dashboards
  • Loki: Aggregates logs from Docker containers

Prometheus configuration

Generated files

When monitoring is enabled, better-openclaw generates:
  • config/prometheus/prometheus.yml - Scrape configuration
  • docker-compose.yml - Prometheus service definition

Scrape configuration

better-openclaw automatically detects which services expose Prometheus metrics and adds them to prometheus.yml:
config/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

scrape_configs:
  # Prometheus self-monitoring
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          service: "prometheus"
          component: "monitoring"

  # 🔴 Redis
  - job_name: "redis"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["redis:9121"]
        labels:
          service: "redis"
          component: "database"

  # 🐘 PostgreSQL
  - job_name: "postgresql"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["postgresql:9187"]
        labels:
          service: "postgresql"
          component: "database"

  # 📊 Grafana
  - job_name: "grafana"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["grafana:3000"]
        labels:
          service: "grafana"
          component: "monitoring"

  # 🐾 OpenClaw Gateway
  - job_name: "openclaw-gateway"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["openclaw:18789"]
        labels:
          service: "openclaw"
          component: "core"

Supported metrics endpoints

Service       Metrics Port  Path                       Metrics
Redis         9121          /metrics                   Connections, memory, commands
PostgreSQL    9187          /metrics                   Queries, connections, replication
MinIO         9000          /minio/v2/metrics/cluster  Storage, bandwidth, operations
n8n           5678          /metrics                   Workflows, executions
Grafana       3000          /metrics                   Dashboards, users, queries
Caddy         2019          /metrics                   HTTP requests, response times
Traefik       8082          /metrics                   Routes, services, middleware
Qdrant        6333          /metrics                   Vectors, collections, searches
Uptime Kuma   3001          /metrics                   Monitors, uptime, response times
Ollama        11434         /metrics                   Model loads, inference, VRAM
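All of these endpoints speak the Prometheus text exposition format. As a rough illustration of what that format contains, here is a minimal Python parser (the sample lines are illustrative, not real exporter output; production code should use an official Prometheus client library instead):

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into
    {metric_name: [(label_string, value), ...]}.
    A sketch only: ignores HELP/TYPE metadata and timestamps."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip comments and metadata
            continue
        name_part, _, value = line.rpartition(" ")
        if "{" in name_part:
            name, labels = name_part.split("{", 1)
            labels = labels.rstrip("}")
        else:
            name, labels = name_part, ""
        metrics.setdefault(name, []).append((labels, float(value)))
    return metrics

sample = """\
# HELP redis_connected_clients Number of client connections
# TYPE redis_connected_clients gauge
redis_connected_clients 4
redis_commands_total{cmd="get"} 1027
"""
print(parse_metrics(sample)["redis_commands_total"])  # [('cmd="get"', 1027.0)]
```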

Docker Compose

docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:v2.48.0
    volumes:
      - ./config/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    ports:
      - "${PROMETHEUS_EXTERNAL_PORT:-9090}:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=15d'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    networks:
      - openclaw-network
    restart: unless-stopped

volumes:
  prometheus-data:

Accessing Prometheus

Open http://localhost:9090
  • Status → Targets: View all scraped services
  • Graph: Query metrics with PromQL
  • Alerts: Configure alerting rules

Grafana configuration

Generated files

  • config/grafana/provisioning/datasources/prometheus.yml - Auto-configure Prometheus datasource
  • config/grafana/provisioning/dashboards/default.yml - Dashboard provider
  • config/grafana/grafana.ini - Grafana server config
  • config/grafana/dashboards/openclaw-overview.json - Default dashboard

Datasource provisioning

config/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true
    jsonData:
      httpMethod: POST
      timeInterval: "15s"
    version: 1
Grafana automatically connects to Prometheus on startup—no manual configuration required.

Dashboard provisioning

better-openclaw includes a default OpenClaw Stack Overview dashboard:
  • Service Health: Count of healthy services
  • Memory Usage: Container memory utilization
  • Request Rate: HTTP requests per second
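The dashboard provider file listed under Generated files typically looks like the following sketch (the generated file may differ in detail; the path matches the ./config/grafana/dashboards mount in the compose file):

```yaml
# config/grafana/provisioning/dashboards/default.yml (sketch)
apiVersion: 1

providers:
  - name: "default"
    orgId: 1
    folder: ""
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30
    options:
      path: /var/lib/grafana/dashboards
```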

Docker Compose

docker-compose.yml
services:
  grafana:
    image: grafana/grafana:10.2.2
    volumes:
      - grafana-data:/var/lib/grafana
      - ./config/grafana/provisioning:/etc/grafana/provisioning:ro
      - ./config/grafana/grafana.ini:/etc/grafana/grafana.ini:ro
      - ./config/grafana/dashboards:/var/lib/grafana/dashboards:ro
    environment:
      GF_SECURITY_ADMIN_USER: ${GF_ADMIN_USER:-admin}
      GF_SECURITY_ADMIN_PASSWORD: ${GF_ADMIN_PASSWORD:-admin}
      GF_USERS_ALLOW_SIGN_UP: "false"
    ports:
      - "${GRAFANA_EXTERNAL_PORT:-3000}:3000"
    networks:
      - openclaw-network
    depends_on:
      prometheus:
        condition: service_started
    restart: unless-stopped

volumes:
  grafana-data:

Accessing Grafana

Open http://localhost:3000

Default credentials:
  • Username: admin
  • Password: admin (change on first login)
Override in .env:
.env
GF_ADMIN_USER=admin
GF_ADMIN_PASSWORD=your_secure_password

Environment variables

.env
# Grafana admin credentials
# Service: Grafana | Required: Yes | Secret: Yes
GF_ADMIN_USER=admin

# Service: Grafana | Required: Yes | Secret: Yes
GF_ADMIN_PASSWORD=

# Allow users to sign up
# Service: Grafana | Required: No | Secret: No
GF_USERS_ALLOW_SIGN_UP=false

# Grafana external port
# Service: Grafana | Required: No | Secret: No
GRAFANA_EXTERNAL_PORT=3000

Loki configuration

Loki aggregates logs from all Docker containers.

Docker Compose

docker-compose.yml
services:
  loki:
    image: grafana/loki:2.9.3
    ports:
      - "${LOKI_EXTERNAL_PORT:-3100}:3100"
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - openclaw-network
    restart: unless-stopped

  promtail:
    image: grafana/promtail:2.9.3
    volumes:
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./config/promtail/config.yml:/etc/promtail/config.yml:ro
    command: -config.file=/etc/promtail/config.yml
    networks:
      - openclaw-network
    depends_on:
      - loki
    restart: unless-stopped
Promtail ships Docker container logs to Loki.
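The mounted config/promtail/config.yml is not shown above. A minimal sketch that discovers containers via the mounted Docker socket (the generated file may differ):

```yaml
# config/promtail/config.yml (sketch)
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # expose the container name as a queryable label
      - source_labels: ["__meta_docker_container_name"]
        regex: "/(.*)"
        target_label: container_name
```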

Add Loki datasource to Grafana

  1. Open Grafana → Configuration → Data Sources
  2. Click “Add data source”
  3. Select “Loki”
  4. URL: http://loki:3100
  5. Click “Save & Test”
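To skip the manual steps, you can also provision Loki the same way the Prometheus datasource is provisioned (a sketch; the file name is an assumption, not something better-openclaw generates by default):

```yaml
# config/grafana/provisioning/datasources/loki.yml (sketch)
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    editable: true
```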

Query logs

In Grafana, go to Explore:
# All logs from openclaw-gateway
{container_name="openclaw-gateway"}

# Errors from all services
{job="docker"} |~ "(?i)error"

# PostgreSQL slow queries
{container_name="postgresql"} |~ "duration: [0-9]{4,} ms"

Custom dashboards

Creating dashboards

  1. Open Grafana
  2. Click “+” → “Dashboard”
  3. Add panels with PromQL queries:
# Service uptime
up{job="openclaw-gateway"}

# Memory usage
container_memory_usage_bytes{name="openclaw-gateway"}

# Request rate
rate(http_requests_total[5m])

# Error rate
rate(http_requests_total{status=~"5.."}[5m])
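rate() computes the per-second increase of a counter over the query window. Conceptually it reduces to the following sketch (real rate() also corrects for counter resets and extrapolates to the window boundaries; this does neither):

```python
def simple_rate(samples):
    """Approximate PromQL rate(): per-second increase between the
    first and last counter sample in a window.
    samples: [(unix_timestamp, counter_value), ...]"""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (v1 - v0) / (t1 - t0)

# 5-minute window: counter went from 1200 to 1500 requests
print(simple_rate([(0, 1200), (300, 1500)]))  # 1.0 (req/s)
```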

Importing dashboards

Grafana has thousands of community dashboards:
  1. Browse at grafana.com/grafana/dashboards
  2. Find a dashboard (e.g., “Docker and system monitoring”)
  3. Copy the dashboard ID
  4. In Grafana: Dashboards → Import → Paste ID
Recommended dashboards:
  • 1860: Node Exporter Full (system metrics)
  • 893: Docker monitoring
  • 6417: PostgreSQL Database
  • 7362: Redis Dashboard
  • 11074: Traefik 2.0

Alerting

Prometheus alerting rules

Create config/prometheus/alerts.yml:
groups:
  - name: openclaw
    interval: 30s
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.job }} is down"
          description: "{{ $labels.job }} has been down for more than 1 minute"

      - alert: HighMemoryUsage
        expr: (container_memory_usage_bytes / container_spec_memory_limit_bytes) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.name }}"
          description: "{{ $labels.name }} memory usage is above 90%"

      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate on {{ $labels.service }}"
          description: "{{ $labels.service }} has >5% error rate"
Mount it in docker-compose.yml. Note that command replaces the entire flag list, so repeat any existing flags you still need:
services:
  prometheus:
    volumes:
      - ./config/prometheus/alerts.yml:/etc/prometheus/alerts.yml:ro
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=15d'
      - '--web.enable-lifecycle'
Update prometheus.yml:
rule_files:
  - 'alerts.yml'

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
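The alertmanager:9093 target assumes an Alertmanager service, which the snippets above do not define. A sketch of adding one (the image tag and config path are assumptions):

```yaml
services:
  alertmanager:
    image: prom/alertmanager:v0.26.0
    volumes:
      - ./config/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    ports:
      - "9093:9093"
    networks:
      - openclaw-network
    restart: unless-stopped
```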

Grafana alerts

Grafana 9+ uses unified alerting, so the flow differs from the legacy per-panel Alert tab:
  1. Open a dashboard panel → Edit → “Alert” tab → “New alert rule”
  2. Define the condition, e.g. average of query A over the last 5m is above 90
  3. Set the evaluation interval (e.g. every 1m)
  4. Route notifications to a contact point (Slack, Email, PagerDuty)

Uptime monitoring

Add Uptime Kuma for HTTP/ping monitoring:
npx create-better-openclaw --services uptime-kuma --yes
Uptime Kuma provides:
  • HTTP(s) monitoring
  • TCP port monitoring
  • Ping/DNS monitoring
  • Status pages
  • Multi-channel notifications (Slack, Discord, Email, etc.)
Access at http://localhost:3001
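The generated service definition is roughly equivalent to this sketch (image tag, volume name, and port mapping are assumptions):

```yaml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    volumes:
      - uptime-kuma-data:/app/data
    ports:
      - "3001:3001"
    networks:
      - openclaw-network
    restart: unless-stopped

volumes:
  uptime-kuma-data:
```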

Performance tuning

Prometheus retention

Adjust data retention in docker-compose.yml:
services:
  prometheus:
    command:
      - '--storage.tsdb.retention.time=30d'  # Keep 30 days
      - '--storage.tsdb.retention.size=10GB'  # Or 10GB max
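To pick sensible retention values, estimate disk usage from your active series count and scrape interval. Prometheus averages roughly 1-2 bytes per sample after compression; the figure used below is a ballpark assumption, not a guarantee:

```python
def prometheus_disk_estimate_gb(active_series, scrape_interval_s,
                                retention_days, bytes_per_sample=2):
    """Ballpark disk usage: samples ingested over the retention
    window times average compressed bytes per sample."""
    samples_per_day = active_series * 86400 / scrape_interval_s
    total_bytes = samples_per_day * retention_days * bytes_per_sample
    return total_bytes / 1e9

# e.g. 10k active series scraped every 15s, kept for 30 days
print(round(prometheus_disk_estimate_gb(10_000, 15, 30), 1))  # 3.5 (GB)
```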

Scrape interval

Reduce scrape frequency for less critical services:
prometheus.yml
scrape_configs:
  - job_name: "high-priority"
    scrape_interval: 10s
    static_configs:
      - targets: ["openclaw:18789"]

  - job_name: "low-priority"
    scrape_interval: 60s
    static_configs:
      - targets: ["uptime-kuma:3001"]

Grafana performance

config/grafana/grafana.ini
[database]
# Use PostgreSQL instead of SQLite for better performance
type = postgres
host = postgresql:5432
name = grafana
user = grafana
password = ${GF_DATABASE_PASSWORD}

[dashboards]
# Minimum refresh interval users can set on dashboards
min_refresh_interval = 5s

Troubleshooting

Prometheus not scraping targets

  1. Check Prometheus logs:
    docker compose logs prometheus
    
  2. Verify the target is exposing metrics from inside the Docker network (the Prometheus image includes BusyBox wget):
    docker compose exec prometheus wget -qO- http://redis:9121/metrics
    
  3. Check network connectivity:
    docker compose exec prometheus ping redis
    

Grafana can’t connect to Prometheus

HTTP Error Bad Gateway
Verify datasource URL in Grafana settings:
  • http://prometheus:9090 (Docker network)
  • http://localhost:9090 (won’t work inside container)

High disk usage

Prometheus data grows over time. Clean up old data:
# Check volume size
docker system df -v

# Remove old Prometheus data (the volume name is prefixed with the
# Compose project name, e.g. myproject_prometheus-data)
docker compose rm -sf prometheus
docker volume rm $(docker volume ls -q | grep prometheus-data)
docker compose up -d prometheus
Or configure retention (see Performance tuning above).
