Draft Thinker ships a pre-built Grafana dashboard that covers all Prometheus metrics exposed by the gateway. The dashboard and its Prometheus datasource are provisioned automatically when you start the stack with Docker Compose — no manual import or configuration is required.

Accessing the dashboard

1. Start the stack

Run Docker Compose from the repository root. This starts the gateway, Redis, Qdrant, Prometheus, and Grafana together.
docker compose up
2. Open Grafana

Navigate to http://localhost:3000 in your browser.
3. Sign in

Log in with the default credentials:
  • Username: admin
  • Password: admin
4. Open the dashboard

The Draft-Thinker Gateway dashboard is pre-provisioned and available immediately under Dashboards in the left sidebar. The dashboard auto-refreshes every 10 seconds and defaults to a 15-minute time window.
The default admin password is set via GF_SECURITY_ADMIN_PASSWORD=admin in docker-compose.yml. Change this value before deploying to any shared or production environment.
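One way to avoid committing a real password is Compose variable interpolation. The fragment below is a sketch, not the repository's configuration: GRAFANA_ADMIN_PASSWORD is a hypothetical variable name that you would supply from your shell or an .env file. The ${VAR:?message} form makes docker compose stop with the given message when the variable is unset or empty.

```yaml
# Sketch only: GRAFANA_ADMIN_PASSWORD is a hypothetical variable name,
# not something defined by the repository.
grafana:
  environment:
    # Compose aborts with this message if the variable is unset or empty.
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:?set GRAFANA_ADMIN_PASSWORD}
```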

How provisioning works

Grafana's datasource and dashboard are provisioned entirely through volume mounts defined in docker-compose.yml. No manual setup steps are needed.
docker-compose.yml (grafana service)
grafana:
  image: grafana/grafana:11.5.2
  ports:
    - "3000:3000"
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=admin
  volumes:
    - ./grafana/provisioning:/etc/grafana/provisioning:ro
    - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
  depends_on:
    - prometheus
Two provisioning directories are mounted:
  • grafana/provisioning/datasources/ — Configures the Prometheus datasource pointing at http://prometheus:9090, set as the default datasource.
  • grafana/provisioning/dashboards/ — Tells Grafana to load dashboard JSON files from /var/lib/grafana/dashboards, which maps to grafana/dashboards/ in the repository.
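The dashboard JSON lives at grafana/dashboards/draft-thinker.json. To add or modify panels, edit that file directly and restart the Grafana container, or use the Grafana UI (the dashboard is marked editable: true). Because the dashboards directory is mounted read-only, changes made in the UI are not written back to that file; export the dashboard JSON and save it to the repository to keep them.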
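For orientation, the two provisioning directories usually hold small YAML files like the ones sketched here. The layout follows Grafana's standard file-provisioning format, but the file names, the datasource name, and the provider name are assumptions; the repository's actual files may differ. The two documents are separate files, shown together for brevity.

```yaml
# grafana/provisioning/datasources/prometheus.yml (file name assumed)
apiVersion: 1
datasources:
  - name: Prometheus            # display name, assumed
    type: prometheus
    access: proxy               # Grafana's backend sends the queries
    url: http://prometheus:9090 # Compose service name and port from this page
    isDefault: true
---
# grafana/provisioning/dashboards/dashboards.yml (file name assumed)
apiVersion: 1
providers:
  - name: draft-thinker         # provider name, assumed
    type: file
    options:
      # Container path that maps to grafana/dashboards/ in the repository
      path: /var/lib/grafana/dashboards
```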

Prometheus scrape configuration

Prometheus is configured to scrape the gateway’s /metrics endpoint every 15 seconds.
prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: draft-thinker
    static_configs:
      - targets:
          - gateway:8080
    metrics_path: /metrics
The target gateway:8080 resolves to the gateway container over the Docker Compose internal network.

Dashboard panels

The dashboard is organized into four rows, each grouping related panels.

Overview

Four summary stats shown at the top of the dashboard:
  • Request rate: sum(rate(draftthinker_requests_total[5m])), in requests per second
  • Draft acceptance rate: fraction of routing decisions where the drafter's response was accepted, shown as a gauge with green/yellow/red thresholds
  • Cost reduction: calibrated value of 91.6% at threshold T=2.0, measured on 518 prompts
  • Cache hit rate: fraction of requests served from the semantic cache, shown as a gauge

Latency

Three time-series panels showing latency percentiles (P50, P95, and P99, except where noted):
  • Upstream latency by provider: draftthinker_upstream_latency_seconds, broken out by drafter and heavyweight
  • Cache lookup latency: draftthinker_cache_lookup_latency_seconds, measured end to end including embedding and vector search
  • Speculative latency saved: draftthinker_speculative_latency_saved_seconds, P50 and P95 only

Routing

Two time-series panels showing decision flow and errors:
  • Routing decisions over time: draftthinker_routing_decisions_total, stacked by accept, escalate, and cache_hit
  • Error rate by type: draftthinker_errors_total, broken out by the type label (invalid_request, routing_error, upstream_error, upstream_timeout, stream_error, internal_error)

Entropy and speculative

Three panels covering the routing engine internals:
  • Entropy distribution: draftthinker_entropy_distribution, rendered as a heatmap over time that shows the spread of per-token Shannon entropy in bits
  • Speculative trigger rate: rate(draftthinker_speculative_triggers_total[5m]), in triggers per second
  • Speculative cancellation ratio: draftthinker_speculative_cancellations_total / draftthinker_speculative_triggers_total, shown as a gauge that is green below 10%, yellow from 10% to 30%, and red above 30%
