The Metrics struct provides thread-safe operational telemetry for monitoring OneClaw agent performance. All counters use AtomicU64 for lock-free increments.

Struct Definition

pub struct Metrics {
    pub messages_total: AtomicU64,
    pub messages_secured: AtomicU64,
    pub messages_denied: AtomicU64,
    pub messages_rate_limited: AtomicU64,
    pub llm_calls_total: AtomicU64,
    pub llm_calls_failed: AtomicU64,
    pub llm_tokens_total: AtomicU64,
    pub llm_latency_total_ms: AtomicU64,
    pub memory_stores: AtomicU64,
    pub memory_searches: AtomicU64,
    pub tool_calls_total: AtomicU64,
    pub tool_calls_failed: AtomicU64,
    pub events_published: AtomicU64,
    pub events_processed: AtomicU64,
    pub alerts_triggered: AtomicU64,
    pub chains_executed: AtomicU64,
    pub chain_steps_total: AtomicU64,
    pub errors_total: AtomicU64,
}

Available Metrics

Message Metrics

messages_total (AtomicU64): Total messages received across all channels
messages_secured (AtomicU64): Messages that passed security authorization
messages_denied (AtomicU64): Messages denied by security checks
messages_rate_limited (AtomicU64): Messages rejected by the rate limiter (default: 60/min)

LLM Metrics

llm_calls_total (AtomicU64): Total LLM API calls made to providers
llm_calls_failed (AtomicU64): LLM API calls that failed or timed out
llm_tokens_total (AtomicU64): Total tokens consumed across all LLM calls (input + output)
llm_latency_total_ms (AtomicU64): Cumulative LLM latency in milliseconds (divide by llm_calls_total to get the average)

Memory Metrics

memory_stores (AtomicU64): Total memory store operations (“remember” command)
memory_searches (AtomicU64): Total memory search operations (“recall” command, context retrieval)

Tool Metrics

tool_calls_total (AtomicU64): Total tool execution calls
tool_calls_failed (AtomicU64): Tool executions that failed or returned errors

Event Metrics

events_published (AtomicU64): Total events published to the event bus
events_processed (AtomicU64): Total events processed (drained) from the event bus
alerts_triggered (AtomicU64): Total alerts triggered by event handlers

Chain Metrics

chains_executed (AtomicU64): Total chains executed
chain_steps_total (AtomicU64): Total chain steps executed across all chains

Error Metrics

errors_total (AtomicU64): Total errors encountered across all operations

Methods

new

Create a new metrics instance with all counters at zero.
pub fn new() -> Self
Returns (Self): A new Metrics instance with all counters initialized to 0
Example:
use std::sync::atomic::Ordering;

use oneclaw_core::metrics::Metrics;

let metrics = Metrics::new();
assert_eq!(metrics.messages_total.load(Ordering::Relaxed), 0);

inc

Increment a counter by 1 (thread-safe).
pub fn inc(counter: &AtomicU64)
counter (&AtomicU64, required): The counter to increment
Example:
Metrics::inc(&runtime.metrics.messages_total);
Metrics::inc(&runtime.metrics.llm_calls_total);

add

Add a value to a counter (thread-safe).
pub fn add(counter: &AtomicU64, value: u64)
counter (&AtomicU64, required): The counter to add to
value (u64, required): The value to add
Example:
Metrics::add(&runtime.metrics.llm_tokens_total, 150);
Metrics::add(&runtime.metrics.llm_latency_total_ms, 250);
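The inc and add calls above are typically paired around a single LLM request. The following is a minimal sketch of that pattern using only the standard library. LlmCounters, record_llm_call, and the free functions inc and add are hypothetical stand-ins for illustration, not part of oneclaw_core. It assumes that Metrics::inc and Metrics::add behave like a relaxed fetch_add, and that failed calls still contribute to the latency sum.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

/// Hypothetical stand-in holding only the LLM counters from the Metrics struct.
#[derive(Default)]
struct LlmCounters {
    llm_calls_total: AtomicU64,
    llm_calls_failed: AtomicU64,
    llm_tokens_total: AtomicU64,
    llm_latency_total_ms: AtomicU64,
}

/// Assumed equivalent of `Metrics::inc`: one relaxed atomic add of 1.
fn inc(counter: &AtomicU64) {
    counter.fetch_add(1, Ordering::Relaxed);
}

/// Assumed equivalent of `Metrics::add`: one relaxed atomic add of `value`.
fn add(counter: &AtomicU64, value: u64) {
    counter.fetch_add(value, Ordering::Relaxed);
}

/// Runs `call` (which returns the token count on success) and records the
/// outcome. Every call counts toward the total and the latency sum; tokens
/// are added on success, and the failure counter is bumped on error.
fn record_llm_call<F>(m: &LlmCounters, call: F) -> Result<u64, String>
where
    F: FnOnce() -> Result<u64, String>,
{
    let started = Instant::now();
    let result = call();
    let elapsed_ms = started.elapsed().as_millis() as u64;

    inc(&m.llm_calls_total);
    add(&m.llm_latency_total_ms, elapsed_ms);
    match &result {
        Ok(tokens) => add(&m.llm_tokens_total, *tokens),
        Err(_) => inc(&m.llm_calls_failed),
    }
    result
}
```

Keeping the bookkeeping in one wrapper means llm_calls_total and llm_latency_total_ms are always updated together, which keeps the derived average meaningful.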

uptime_secs

Get uptime in seconds since boot.
pub fn uptime_secs(&self) -> u64
Returns (u64): Uptime in seconds
Example:
let uptime = runtime.metrics.uptime_secs();
println!("Agent uptime: {} seconds", uptime);

uptime_display

Get formatted uptime string (e.g., “2h 15m 30s”).
pub fn uptime_display(&self) -> String
Returns (String): Formatted uptime string
Example:
println!("Uptime: {}", runtime.metrics.uptime_display());
// Output: "Uptime: 2h 15m 30s"
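For readers who want the same format elsewhere, the conversion from seconds to an "Xh Ym Zs" string can be sketched as below. format_uptime is a hypothetical helper, not the library's implementation. Always printing all three units is an assumption, since the real uptime_display may omit leading zero units.

```rust
/// Formats a duration in whole seconds as "<h>h <m>m <s>s".
/// Hypothetical helper: all three units are always printed, which may
/// differ from what `uptime_display` does for short uptimes.
fn format_uptime(total_secs: u64) -> String {
    let hours = total_secs / 3600;
    let minutes = (total_secs % 3600) / 60;
    let seconds = total_secs % 60;
    format!("{}h {}m {}s", hours, minutes, seconds)
}
```

With this sketch, 8130 seconds formats as "2h 15m 30s", matching the example output above.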

avg_llm_latency_ms

Calculate average LLM latency in milliseconds.
pub fn avg_llm_latency_ms(&self) -> u64
Returns (u64): Average latency in ms, or 0 if no calls made
Example:
let avg_latency = runtime.metrics.avg_llm_latency_ms();
println!("Average LLM latency: {}ms", avg_latency);
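The average is derived from two counters, llm_latency_total_ms and llm_calls_total. A minimal sketch of that calculation, assuming plain integer division with a zero-call guard (avg_latency_ms is a hypothetical free function, not the library method):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Integer average of cumulative latency over the call count.
/// Returns 0 when no calls have been recorded, avoiding division by zero.
fn avg_latency_ms(latency_total_ms: &AtomicU64, calls_total: &AtomicU64) -> u64 {
    let calls = calls_total.load(Ordering::Relaxed);
    if calls == 0 {
        return 0;
    }
    // Integer division truncates any fractional millisecond.
    latency_total_ms.load(Ordering::Relaxed) / calls
}
```

The two loads are independent, so under concurrent updates the result is an approximation rather than an exact snapshot, which is usually acceptable for telemetry.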

report

Generate a formatted report of all metrics.
pub fn report(&self) -> String
Returns (String): Multi-line formatted report with all metric categories
Example:
println!("{}", runtime.metrics.report());
Output:
OneClaw Metrics:

  Uptime: 2h 15m 30s

  Messages:
    Total: 145 | Secured: 142 | Denied: 3 | Rate-limited: 0

  LLM:
    Calls: 89 | Failed: 2 | Tokens: 15420 | Avg latency: 245ms

  Memory:
    Stores: 12 | Searches: 45

  Tools:
    Calls: 23 | Failed: 1

  Events:
    Published: 67 | Processed: 67 | Alerts: 5

  Chains:
    Executed: 8 | Steps: 24

  Errors: 3

Accessing Metrics

Metrics are accessible through the Runtime struct:
use std::sync::atomic::Ordering;

let runtime = Runtime::from_config(config, workspace)?;

// Access metrics
println!("Total messages: {}", 
    runtime.metrics.messages_total.load(Ordering::Relaxed));

// Increment metrics
Metrics::inc(&runtime.metrics.messages_total);

// Generate report
println!("{}", runtime.metrics.report());

Performance Monitoring

Real-time Monitoring

Spawn a background task to periodically report metrics:
use std::sync::Arc;
use std::time::Duration;
use tokio::time::interval;

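// Assumes `runtime.metrics` is an `Arc<Metrics>` so it can be moved into the task,
// and that this code runs inside a Tokio runtime.
let metrics = Arc::clone(&runtime.metrics);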

tokio::spawn(async move {
    let mut ticker = interval(Duration::from_secs(60));
    loop {
        ticker.tick().await;
        println!("--- Metrics Report ---");
        println!("{}", metrics.report());
    }
});

Alerting on Thresholds

Monitor failure rates and trigger alerts:
use std::sync::atomic::Ordering;

let total_calls = runtime.metrics.llm_calls_total.load(Ordering::Relaxed);
let failed_calls = runtime.metrics.llm_calls_failed.load(Ordering::Relaxed);

if total_calls > 0 {
    let failure_rate = (failed_calls as f64 / total_calls as f64) * 100.0;
    if failure_rate > 10.0 {
        eprintln!("WARNING: LLM failure rate is {:.1}%", failure_rate);
    }
}
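Because the counters are cumulative, the check above measures the failure rate since startup, so a recent burst of failures can be diluted by a long healthy history. One possible refinement, sketched here under that observation, is to compare two snapshots and compute the rate for the interval between them. LlmSnapshot, snapshot, and window_failure_rate are hypothetical names introduced for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Point-in-time copy of the two LLM call counters (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct LlmSnapshot {
    calls: u64,
    failed: u64,
}

/// Reads both counters with relaxed ordering.
fn snapshot(calls_total: &AtomicU64, calls_failed: &AtomicU64) -> LlmSnapshot {
    LlmSnapshot {
        calls: calls_total.load(Ordering::Relaxed),
        failed: calls_failed.load(Ordering::Relaxed),
    }
}

/// Failure rate in percent for calls made between two snapshots.
/// Returns None when no calls happened in the window.
fn window_failure_rate(prev: LlmSnapshot, curr: LlmSnapshot) -> Option<f64> {
    // saturating_sub guards against snapshots passed in the wrong order.
    let calls = curr.calls.saturating_sub(prev.calls);
    if calls == 0 {
        return None;
    }
    let failed = curr.failed.saturating_sub(prev.failed);
    Some(failed as f64 / calls as f64 * 100.0)
}
```

A periodic task could keep the previous snapshot and warn when the windowed rate exceeds the same 10% threshold.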

Prometheus Export (Custom)

Export metrics in Prometheus format:
use std::sync::atomic::Ordering;

use oneclaw_core::metrics::Metrics;

fn export_prometheus(metrics: &Metrics) -> String {
    let o = Ordering::Relaxed;
    format!(
        "# HELP oneclaw_messages_total Total messages received\n\
         # TYPE oneclaw_messages_total counter\n\
         oneclaw_messages_total {}\n\
         \n\
         # HELP oneclaw_llm_calls_total Total LLM calls\n\
         # TYPE oneclaw_llm_calls_total counter\n\
         oneclaw_llm_calls_total {}\n\
         \n\
         # HELP oneclaw_llm_tokens_total Total LLM tokens\n\
         # TYPE oneclaw_llm_tokens_total counter\n\
         oneclaw_llm_tokens_total {}\n",
        metrics.messages_total.load(o),
        metrics.llm_calls_total.load(o),
        metrics.llm_tokens_total.load(o),
    )
}
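Extending the single format! call to all counters gets unwieldy. A table-driven variant is one way to keep it manageable. This is a sketch only: write_counter and render_counters are hypothetical helpers, and the caller is assumed to load each counter value before passing it in.

```rust
use std::fmt::Write;

/// Appends one counter in the Prometheus text exposition format.
fn write_counter(out: &mut String, name: &str, help: &str, value: u64) {
    // Writing into a String cannot fail, so the Result is ignored.
    let _ = writeln!(out, "# HELP {} {}", name, help);
    let _ = writeln!(out, "# TYPE {} counter", name);
    let _ = writeln!(out, "{} {}", name, value);
}

/// Renders a list of (name, help, value) triples, one counter after another.
fn render_counters(counters: &[(&str, &str, u64)]) -> String {
    let mut out = String::new();
    for &(name, help, value) in counters {
        write_counter(&mut out, name, help, value);
    }
    out
}
```

Each triple would pair a metric name such as oneclaw_messages_total with the result of the corresponding load(Ordering::Relaxed) call.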

Thread Safety

All metrics use AtomicU64 with Ordering::Relaxed for lock-free increments. This provides:
  • Thread-safe updates from any thread
  • No mutex contention or blocking
  • Low overhead: each increment is a single atomic read-modify-write operation, with no lock acquisition
  • Eventual consistency (suitable for monitoring, not strict ordering)
Metrics use Ordering::Relaxed because strict ordering is not required for telemetry. All increments are atomic, but the order in which different threads’ updates become visible is not guaranteed.
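The guarantee that relaxed increments are never lost can be demonstrated with a small standard-library sketch. concurrent_increments is a hypothetical function written for this illustration and does not use the Metrics type itself.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` workers that each increment a shared counter
/// `per_thread` times, then returns the final value once all have finished.
fn concurrent_increments(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Relaxed is enough: only atomicity matters, not ordering.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    // join() synchronizes with each worker, so every increment is visible here.
    counter.load(Ordering::Relaxed)
}
```

With 8 threads and 10,000 increments each, the result is exactly 80,000. What Relaxed does not promise is that a reader sees updates to different counters in the order they were made.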

Command-Line Access

Users can view metrics via the built-in metrics command:
> metrics
OneClaw Metrics:

  Uptime: 2h 15m 30s

  Messages:
    Total: 145 | Secured: 142 | Denied: 3 | Rate-limited: 0
  ...
Or view a subset via the status command:
> status
OneClaw Agent v1.5.0

  Uptime: 2h 15m 30s

  Security: enforced
    Memory: 234 entries (sqlite)
    ...
    Messages: 145 total (3 denied)
    LLM: 89 calls (avg 245ms)

  Type 'health' for detailed check
  Type 'metrics' for full telemetry
