Overview
Showdown Trivia uses Prometheus for metrics collection and Grafana for visualization. The monitoring stack is fully containerized and can be started with Docker Compose.
Metrics Architecture
Metrics are implemented using the Prometheus Go client library and exposed via the /metrics endpoint.
Metrics Endpoint
URL: http://localhost:8080/metrics
Implementation: internal/web/routes.go:22
Available Metrics
The application exposes two custom metrics defined in internal/web/metrics/metrics.go.
1. WebSocket Connections (Gauge)
Metric Name: app_websocket_connections
Type: Gauge
Description: Total number of active WebSocket connections
Use Cases:
- Monitor concurrent players
- Detect connection spikes
- Capacity planning
- Alert on unusual connection patterns
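A gauge differs from a counter in that it moves in both directions, which is why it suits a count of open connections. The sketch below illustrates that behavior with the standard library only. The type `connGauge` is a hypothetical stand-in and not the project's code, which uses the Prometheus Go client.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// connGauge is a stdlib stand-in for a Prometheus gauge: a value that can
// go up and down. It is hypothetical and only illustrates the semantics.
type connGauge struct {
	n atomic.Int64
}

// Inc records a newly opened WebSocket connection.
func (g *connGauge) Inc() { g.n.Add(1) }

// Dec records a closed WebSocket connection.
func (g *connGauge) Dec() { g.n.Add(-1) }

// Value returns the number of connections currently open.
func (g *connGauge) Value() int64 { return g.n.Load() }

func main() {
	var g connGauge
	g.Inc() // player 1 connects
	g.Inc() // player 2 connects
	g.Dec() // player 1 disconnects
	fmt.Println("active connections:", g.Value())
}
```

The same pattern applies with a real gauge: increment when a connection is accepted and decrement when it closes, so the exported value always reflects the current count.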
2. Request Duration (Histogram)
Metric Name: app_request_game_duration
Type: Histogram
Description: Request duration when creating new game and requesting form
Labels:
- method - HTTP method (GET, POST)
Buckets:
- 0.05s, 0.10s, 0.15s, …, 1.00s (20 buckets)
Use Cases:
- Track game creation performance
- Identify slow requests
- SLA monitoring
- Detect performance regressions
Implementation: internal/web/middleware.go:49
Tracked Endpoints:
- GET /create - Display game creation form
- POST /create - Process game creation
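The bucket layout described above (20 upper bounds from 0.05s to 1.00s in 0.05s steps) can be generated rather than typed out. The following is a minimal sketch using only the standard library. The helper `linearBuckets` is hypothetical, and it is an assumption that the project builds its buckets this way; the Prometheus Go client offers `prometheus.LinearBuckets(start, width, count)` for the same purpose.

```go
package main

import "fmt"

// linearBuckets returns count upper bounds starting at start and spaced
// width apart. It produces the same kind of evenly spaced boundaries as
// prometheus.LinearBuckets, without any third-party dependency.
func linearBuckets(start, width float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start + float64(i)*width
	}
	return buckets
}

func main() {
	// 20 buckets: 0.05, 0.10, ..., 1.00 (seconds).
	for _, b := range linearBuckets(0.05, 0.05, 20) {
		fmt.Printf("%.2f ", b)
	}
	fmt.Println()
}
```

Floating-point arithmetic means some boundaries may not be exactly representable, so the `le` label values exposed on /metrics can differ slightly from the rounded figures printed here.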
Metrics Initialization
Metrics are initialized in the application bootstrap (cmd/web/main.go:44); see also internal/web/app.go:37.
Prometheus Setup
Configuration
File: deployments/prometheus/prometheus.yml
- Job Name: app
- Target: app:8080 (container name in Docker network)
- Scrape Interval: 5 seconds
- Evaluation Interval: 5 seconds
- Metrics Path: /metrics (default)
Docker Compose Configuration
File: compose.yaml
Prometheus endpoints:
- URL: http://localhost:9090
- Targets: http://localhost:9090/targets
- Config: http://localhost:9090/config
Grafana Setup
Configuration
Datasource File: deployments/grafana/datasources.yaml
- Prometheus datasource pre-configured
- Automatic provisioning on startup
- No manual datasource setup required
Docker Compose Configuration
- URL: http://localhost:3000
- Username: admin
- Password: devops123
Starting the Monitoring Stack
Start All Services
Docker Compose starts the following services:
- Application (port 8080)
- MongoDB (port 27017)
- Prometheus (port 9090)
- Grafana (port 3000)
Verify Services
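One way to check that the HTTP services are up is a small probe program. This is a sketch under stated assumptions: the ports come from the service list above, `/-/healthy` and `/api/health` are the standard health endpoints of Prometheus and Grafana, and the application is probed through its /metrics endpoint. The helper names `healthURLs` and `probe` are hypothetical. MongoDB is not an HTTP service and is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// healthURLs lists one probe URL per HTTP service in the stack.
func healthURLs() map[string]string {
	return map[string]string{
		"application": "http://localhost:8080/metrics",
		"prometheus":  "http://localhost:9090/-/healthy",
		"grafana":     "http://localhost:3000/api/health",
	}
}

// probe reports an error unless the URL answers with HTTP 200.
func probe(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	client := &http.Client{Timeout: 3 * time.Second}
	for name, url := range healthURLs() {
		if err := probe(client, url); err != nil {
			fmt.Printf("%-12s DOWN (%v)\n", name, err)
			continue
		}
		fmt.Printf("%-12s OK\n", name)
	}
}
```

A service reported as DOWN right after startup may simply still be initializing, so it is worth retrying after a few seconds before investigating further.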
Creating Grafana Dashboards
Access Dashboard Editor
- Navigate to http://localhost:3000
- Log in with admin / devops123
- Click Dashboards → New Dashboard
- Click Add visualization
- Select Main datasource (Prometheus)
Example Queries
Active WebSocket Connections
- Current value
- Line chart over time
- Gauge with thresholds
Request Duration - Average
Request Duration - Percentiles
95th Percentile:
Request Rate
Requests in SLA (< 200ms)
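The headings above can be matched with PromQL expressions along the following lines. This is a sketch, not the project's own queries: the `_sum`, `_count`, and `_bucket` series are the ones every Prometheus histogram exposes, while the 5-minute window and the `le="0.2"` selector are assumptions. The Go wrapper builds URLs for the Prometheus HTTP API (`/api/v1/query`) so the expressions can be tried outside Grafana; `exampleQueries` and `queryURL` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
)

// exampleQueries pairs each dashboard heading with a PromQL expression.
// The le value must match a bucket boundary exactly as shown on /metrics.
var exampleQueries = []struct{ Name, PromQL string }{
	{"Active WebSocket Connections",
		`app_websocket_connections`},
	{"Request Duration - Average",
		`sum(rate(app_request_game_duration_sum[5m])) / sum(rate(app_request_game_duration_count[5m]))`},
	{"Request Duration - 95th Percentile",
		`histogram_quantile(0.95, sum by (le) (rate(app_request_game_duration_bucket[5m])))`},
	{"Request Rate",
		`sum(rate(app_request_game_duration_count[5m]))`},
	{"Requests in SLA (< 200ms)",
		`sum(rate(app_request_game_duration_bucket{le="0.2"}[5m])) / sum(rate(app_request_game_duration_count[5m]))`},
}

// queryURL builds an instant-query URL for the Prometheus HTTP API.
func queryURL(base, promql string) string {
	return base + "/api/v1/query?" + url.Values{"query": {promql}}.Encode()
}

func main() {
	for _, q := range exampleQueries {
		fmt.Println(q.Name)
		fmt.Println("  ", queryURL("http://localhost:9090", q.PromQL))
	}
}
```

In Grafana, only the PromQL string is needed: paste it into the query editor of a panel that uses the Prometheus datasource. Changing 0.95 to 0.50 or 0.99 gives the other percentiles.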
Sample Dashboard Layout
Row 1: Overview
- Panel 1: Active WebSocket Connections (Stat)
- Panel 2: Request Rate (Stat)
- Panel 3: Average Response Time (Stat)
Row 2: Request Performance
- Panel 4: Request Duration Over Time (Time series)
- Panel 5: Request Duration by Method (Time series)
- Panel 6: Request Duration Heatmap (Heatmap)
Row 3: Latency Breakdown
- Panel 7: P50, P95, P99 Latency (Time series)
- Panel 8: Requests by Duration Bucket (Bar gauge)
Alerting
Prometheus Alert Rules
Create deployments/prometheus/alerts.yml, then reference it from prometheus.yml.
Grafana Alerts
- Create panel with query
- Click Alert tab
- Configure alert condition
- Set notification channel
- Save dashboard
Adding Custom Metrics
Step 1: Define Metric
Edit internal/web/metrics/metrics.go:
Step 2: Instrument Code
In your handler:
Step 3: Verify Metric
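To verify a new metric, confirm that it appears in the text served at http://localhost:8080/metrics. The sketch below shows one way to check this programmatically by scanning the Prometheus text exposition format for a sample. The metric name `app_games_created_total` and the helper `findSample` are hypothetical and used for illustration only.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// findSample scans Prometheus text exposition output and returns the value
// of the first label-free sample with the given name.
func findSample(body, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Sample /metrics output for a hypothetical counter.
	body := `# HELP app_games_created_total Total number of games created
# TYPE app_games_created_total counter
app_games_created_total 3
`
	v, ok := findSample(body, "app_games_created_total")
	fmt.Println(v, ok)
}
```

In practice the body would come from an HTTP GET against the metrics endpoint. Samples that carry labels have the form `name{label="value"}` and would need a slightly broader match than the exact-name comparison used here.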
Best Practices
- Use Appropriate Metric Types
  - Counter: Monotonically increasing (requests, errors)
  - Gauge: Can go up or down (connections, memory)
  - Histogram: Distributions (latency, response size)
  - Summary: Similar to histogram, calculated client-side
- Label Cardinality
  - Keep labels low cardinality
  - Avoid user IDs, session IDs as labels
  - Use method, status, endpoint as labels
- Naming Conventions
  - Use <namespace>_<name>_<unit> format
  - Counters should end with _total
  - Use base units (seconds, bytes, not milliseconds)
- Dashboard Organization
  - Group related metrics
  - Use consistent time ranges
  - Add descriptions to panels
  - Use variables for filtering
- Alert Tuning
  - Set appropriate thresholds
  - Use for clauses to avoid flapping
  - Test alerts in non-production
  - Document alert runbooks
Troubleshooting
Metrics Not Showing
Grafana Can’t Connect to Prometheus
High Cardinality Warning
If Prometheus shows cardinality warnings:
- Review metric labels
- Remove high-cardinality labels
- Use recording rules to pre-aggregate
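When reviewing labels, a rough upper bound on the number of time series per metric is the product of the distinct values each label can take. The sketch below makes that arithmetic concrete. The helper `seriesCount` and all label counts are hypothetical figures chosen for illustration.

```go
package main

import "fmt"

// seriesCount returns the worst-case number of time series for one metric:
// the product of the number of distinct values each label can take.
func seriesCount(labelValues map[string]int) int {
	total := 1
	for _, n := range labelValues {
		total *= n
	}
	return total
}

func main() {
	// Bounded labels keep the series count small.
	low := map[string]int{"method": 2, "status": 5, "endpoint": 10}
	// A per-user label multiplies it by the number of users (hypothetical).
	high := map[string]int{"method": 2, "user_id": 50000}

	fmt.Println("bounded labels:", seriesCount(low))
	fmt.Println("with user_id:  ", seriesCount(high))
}
```

For a histogram, each of these series is further multiplied by the number of buckets, which is why unbounded labels on histograms are especially costly.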