Overview
Firedancer maintains many internal performance counters for use by developers and monitoring tools, and exposes them via a Prometheus HTTP endpoint.
Configuration
Configure the Prometheus metrics endpoint in your config.toml:
config.toml
Accessing Metrics
Once configured, you can query the metrics endpoint using curl or any Prometheus-compatible scraper.
Metrics are currently only provided for developer and diagnostic use. The endpoint or data provided may break or change in incompatible ways at any time.
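As a rough sketch of querying the endpoint from a script rather than from curl, the Python below fetches the raw text and filters sample lines by name prefix. The host, port 7999, and the /metrics path are assumptions for illustration and are not taken from this page; use whatever your own configuration specifies. The helper names are hypothetical.

```python
import urllib.request

# Assumed defaults: replace host, port, and path with the values from your
# own configuration. None of them are confirmed by this page.
def metrics_url(host="127.0.0.1", port=7999, path="/metrics"):
    """Build the URL of the metrics endpoint."""
    return f"http://{host}:{port}{path}"

def fetch_metrics(url, timeout=5.0):
    """Download the raw Prometheus text exposition from the endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def samples_with_prefix(text, prefix):
    """Return sample lines (not '#' comment lines) that start with prefix."""
    return [
        line
        for line in text.splitlines()
        if line.startswith(prefix) and not line.startswith("#")
    ]
```

A typical use would be `samples_with_prefix(fetch_metrics(metrics_url()), "some_prefix")`, where the prefix is whichever metric family you want to inspect. Given the stability warning above, a script like this is best treated as a diagnostic aid rather than something to build alerting on.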
Metric Types
Firedancer reports three metric types following the Prometheus data model:
- counter: A cumulative metric representing a monotonically increasing counter.
- gauge: A single numerical value that can go arbitrarily up or down.
- histogram: Samples observations like packet sizes and counts them in buckets.
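In the Prometheus text format, each metric's type is announced on a `# TYPE` comment line ahead of its samples. The sketch below reads those lines to build a name-to-type map. The metric names in `SAMPLE` are made up for illustration and are not real Firedancer metric names.

```python
# Hypothetical exposition text: the demo_* names are invented for this example.
SAMPLE = """\
# HELP demo_frags_total Hypothetical counter.
# TYPE demo_frags_total counter
demo_frags_total{kind="net"} 1027
# TYPE demo_backpressured gauge
demo_backpressured 0
# TYPE demo_packet_size histogram
demo_packet_size_bucket{le="64"} 3
demo_packet_size_bucket{le="+Inf"} 5
demo_packet_size_sum 900
demo_packet_size_count 5
"""

def metric_types(text):
    """Map each metric name to the type declared on its '# TYPE' line."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        # A type declaration has exactly four tokens: '#', 'TYPE', name, type.
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types
```

Note how the histogram expands into several series (`_bucket`, `_sum`, `_count`) under a single declared name, while counters and gauges are a single series each.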
Available Metrics
Link Metrics
Metrics for all inter-tile communication links:
The number of times the link reader has consumed a fragment.
The total number of bytes read by the link consumer.
The number of fragments that were filtered and not consumed.
The total number of bytes read by the link consumer that were filtered.
The number of times the link has been overrun while polling.
The number of input overruns detected while reading metadata by the consumer.
The number of times the consumer was detected as rate limiting by the producer.
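Because the link metrics above are cumulative counts, a single reading says little on its own; throughput comes from the difference between two scrapes. A minimal sketch, assuming you have already extracted two readings of the same counter and know the time between them (the function name is hypothetical):

```python
def counter_rate(prev, curr, elapsed_s):
    """Per-second rate of a cumulative counter between two scrapes.

    If the counter went backwards, assume it was reset (for example by a
    restart) and count only what accumulated since the reset.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    increase = curr - prev if curr >= prev else curr
    return increase / elapsed_s
```

For example, a consumed-bytes counter that moves from 1000 to 4000 over 10 seconds corresponds to 300 bytes per second. Prometheus's own `rate()` function performs the same kind of reset-aware calculation server-side, so this is mainly useful for ad hoc scripts.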
Tile Metrics
Metrics available for all tiles:
The process ID of the tile.
The thread ID of the tile. Always the same as the PID in production, but might be different in development.
Index of the CPU last executed on.
The number of involuntary context switches.
The number of voluntary context switches.
The number of major page faults.
The number of minor page faults.
The current status of the tile: 0 is booting, 1 is running, 2 is shutdown.
The UNIX timestamp, in nanoseconds, of the tile's most recent heartbeat.
Whether the tile is currently backpressured: 1 if it is, 0 if it is not.
Number of times the tile has had to wait for one or more consumers to catch up before it could resume publishing.
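The status and heartbeat values described above can be combined into a simple liveness check. The sketch below decodes the status code (0 booting, 1 running, 2 shutdown) and measures heartbeat age from the nanosecond timestamp. The 5 second staleness threshold and the function name are assumptions chosen for illustration, not values recommended by Firedancer.

```python
import time

# Status codes as documented above.
STATUS_NAMES = {0: "booting", 1: "running", 2: "shutdown"}

def tile_health(status, heartbeat_ns, now_ns=None, max_age_s=5.0):
    """Return (status_name, heartbeat_age_seconds, healthy).

    max_age_s is an arbitrary example threshold, not a Firedancer default.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    name = STATUS_NAMES.get(status, "unknown")
    age_s = (now_ns - heartbeat_ns) / 1e9  # nanoseconds to seconds
    healthy = name == "running" and age_s <= max_age_s
    return name, age_s, healthy
```

A tile is reported healthy here only when it is running and its last heartbeat is recent; a running status with a stale heartbeat is treated as unhealthy.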
Tile-Specific Metrics
IPEcho Tile
The current shred version used by the validator.
The number of active connections to the ipecho service.
The number of connections that have been made and closed normally.
The number of connections that have been made and closed abnormally.
Snapshot Control Tile
State of the snapshot control tile.
Number of bytes read so far from the full snapshot. Might decrease if snapshot load is aborted and restarted.
Total size of the full snapshot file.
Number of bytes read so far from the incremental snapshot.
The predicted slot from which replay starts after snapshot loading finishes.
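Snapshot loading progress can be derived from the bytes-read and total-size values described above. The sketch below computes a completion fraction and flags the case, noted above, where the bytes-read value decreases because a load was aborted and restarted. Function names are hypothetical.

```python
def snapshot_progress(bytes_read, total_bytes):
    """Fraction of the snapshot read so far, or None if the size is unknown."""
    if total_bytes <= 0:
        return None  # total size not reported yet
    return min(bytes_read / total_bytes, 1.0)

def load_restarted(prev_bytes_read, curr_bytes_read):
    """True if bytes read went backwards between two scrapes."""
    return curr_bytes_read < prev_bytes_read
```

A dashboard or script that tracks progress across scrapes should check `load_restarted` before computing a download rate, since a negative delta indicates a restart rather than negative throughput.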
Integration with Monitoring Tools
Prometheus
Add Firedancer as a scrape target in your prometheus.yml:
prometheus.yml
Grafana
Once metrics are being scraped by Prometheus, you can create Grafana dashboards to visualize:
- Tile health and backpressure
- Link throughput and overruns
- Context switches and page faults
- Snapshot loading progress
For a complete list of all available metrics including tile-specific counters for net, quic, verify, dedup, pack, bank, poh, shred, store, sign, and other tiles, refer to the full metrics output from your running validator.