The Metrics API provides real-time monitoring of system resource utilization within the sandbox environment. Track CPU usage and memory consumption either as one-time snapshots or as a continuous stream.

Get System Metrics

curl -X GET http://localhost:44772/metrics \
  -H "X-EXECD-ACCESS-TOKEN: your-token"
Retrieves current system resource metrics including CPU usage percentage, CPU core count, total memory, used memory, and timestamp. Provides a snapshot of system resource utilization at the time of request.
Response fields:
  • cpu_count (number): Number of CPU cores available
  • cpu_used_pct (number): CPU usage percentage (0-100)
  • mem_total_mib (number): Total memory in MiB (mebibytes)
  • mem_used_mib (number): Used memory in MiB (mebibytes)
  • timestamp (integer): Timestamp when metrics were collected (Unix milliseconds)
{
  "cpu_count": 4.0,
  "cpu_used_pct": 45.5,
  "mem_total_mib": 8192.0,
  "mem_used_mib": 4096.0,
  "timestamp": 1700000000000
}
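The same snapshot can be fetched from Python. This is a minimal sketch using the requests library, assuming the endpoint and header from the curl example above; memory_pct is an illustrative helper, not part of the API.

```python
import requests

def get_metrics(url="http://localhost:44772/metrics", token="your-token"):
    """Fetch a one-time snapshot of system metrics and parse the JSON body."""
    response = requests.get(
        url, headers={"X-EXECD-ACCESS-TOKEN": token}, timeout=5
    )
    response.raise_for_status()
    return response.json()

def memory_pct(metrics):
    """Used memory as a 0-100 percentage of the container limit."""
    return metrics["mem_used_mib"] / metrics["mem_total_mib"] * 100

# Against a running sandbox:
#   m = get_metrics()
#   print(f"CPU: {m['cpu_used_pct']:.1f}%  Memory: {memory_pct(m):.1f}%")
```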

Watch System Metrics

curl -X GET http://localhost:44772/metrics/watch \
  -H "X-EXECD-ACCESS-TOKEN: your-token"
Streams system resource metrics in real time using Server-Sent Events (SSE). An update is sent every second, providing continuous monitoring of CPU and memory usage. The connection is persistent and keeps streaming until the client closes it.
Event fields (identical to the snapshot response):
  • cpu_count (number): Number of CPU cores available
  • cpu_used_pct (number): CPU usage percentage (0-100)
  • mem_total_mib (number): Total memory in MiB (mebibytes)
  • mem_used_mib (number): Used memory in MiB (mebibytes)
  • timestamp (integer): Timestamp when metrics were collected (Unix milliseconds)
data: {"cpu_count":4.0,"cpu_used_pct":45.5,"mem_total_mib":8192.0,"mem_used_mib":4096.0,"timestamp":1700000000000}

data: {"cpu_count":4.0,"cpu_used_pct":46.2,"mem_total_mib":8192.0,"mem_used_mib":4102.0,"timestamp":1700000001000}

data: {"cpu_count":4.0,"cpu_used_pct":44.8,"mem_total_mib":8192.0,"mem_used_mib":4098.0,"timestamp":1700000002000}

Understanding the Metrics

CPU Metrics

cpu_count
  • Number of CPU cores available to the sandbox
  • This is typically the physical core count, but may be limited by container resource constraints
  • Does not change during runtime
cpu_used_pct
  • Percentage of CPU resources currently in use (0-100)
  • Averaged across all cores
  • Example: 50% on a 4-core system means roughly 2 cores' worth of work

Memory Metrics

mem_total_mib
  • Total memory available to the sandbox in Mebibytes (MiB)
  • 1 MiB = 1,048,576 bytes (1024²)
  • This is the memory limit set for the container
mem_used_mib
  • Currently used memory in MiB
  • Includes active processes, cached data, and buffers
  • When this approaches mem_total_mib, the system may start swapping or the kernel may kill processes (OOM)

Timestamp

timestamp
  • Unix timestamp in milliseconds when the metrics were collected
  • Can be converted to datetime: datetime.fromtimestamp(timestamp / 1000)
  • Useful for correlating metrics with events
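The relationships above can be checked with a few lines of Python. The values here come from the sample response earlier on this page, not from a live sandbox:

```python
from datetime import datetime, timezone

sample = {
    "cpu_count": 4.0,
    "cpu_used_pct": 45.5,
    "mem_total_mib": 8192.0,
    "mem_used_mib": 4096.0,
    "timestamp": 1700000000000,
}

# 45.5% averaged over 4 cores is about 1.82 cores' worth of work
cores_busy = sample["cpu_used_pct"] / 100 * sample["cpu_count"]

# Memory utilization as a percentage of the container limit
mem_pct = sample["mem_used_mib"] / sample["mem_total_mib"] * 100

# Unix milliseconds -> datetime (shown here in UTC)
collected_at = datetime.fromtimestamp(sample["timestamp"] / 1000, tz=timezone.utc)

print(f"{cores_busy:.2f} cores busy, {mem_pct:.1f}% memory, at {collected_at.isoformat()}")
```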

Use Cases

Resource Monitoring Dashboard

Build a real-time dashboard to monitor sandbox health:
import requests
import json
from datetime import datetime

def monitor_resources(url, token, alert_cpu=80, alert_mem=90):
    headers = {"X-EXECD-ACCESS-TOKEN": token}
    response = requests.get(url, headers=headers, stream=True)
    
    for line in response.iter_lines():
        if line and line.decode('utf-8').startswith('data: '):
            metrics = json.loads(line.decode('utf-8')[6:])
            
            mem_pct = (metrics['mem_used_mib'] / metrics['mem_total_mib']) * 100
            timestamp = datetime.fromtimestamp(metrics['timestamp'] / 1000)
            
            # Check for alerts
            if metrics['cpu_used_pct'] > alert_cpu:
                print(f"[{timestamp}] ⚠️  HIGH CPU: {metrics['cpu_used_pct']:.1f}%")
            
            if mem_pct > alert_mem:
                print(f"[{timestamp}] ⚠️  HIGH MEMORY: {mem_pct:.1f}%")

monitor_resources(
    "http://localhost:44772/metrics/watch",
    "your-token",
    alert_cpu=80,
    alert_mem=90
)

Performance Testing

Collect metrics during load tests to understand resource consumption:
import requests
import json
import threading
import time
from statistics import mean, stdev

class MetricsCollector:
    def __init__(self, url, token):
        self.url = url
        self.headers = {"X-EXECD-ACCESS-TOKEN": token}
        self.cpu_samples = []
        self.mem_samples = []
        self.running = False
    
    def start(self):
        self.running = True
        thread = threading.Thread(target=self._collect)
        thread.daemon = True
        thread.start()
        return self
    
    def stop(self):
        self.running = False
        return self.get_stats()
    
    def _collect(self):
        response = requests.get(self.url, headers=self.headers, stream=True)
        for line in response.iter_lines():
            if not self.running:
                break
            if line and line.decode('utf-8').startswith('data: '):
                metrics = json.loads(line.decode('utf-8')[6:])
                self.cpu_samples.append(metrics['cpu_used_pct'])
                mem_pct = (metrics['mem_used_mib'] / metrics['mem_total_mib']) * 100
                self.mem_samples.append(mem_pct)
    
    def get_stats(self):
        return {
            "cpu": {
                "mean": mean(self.cpu_samples),
                "max": max(self.cpu_samples),
                "min": min(self.cpu_samples),
                "stdev": stdev(self.cpu_samples) if len(self.cpu_samples) > 1 else 0
            },
            "memory": {
                "mean": mean(self.mem_samples),
                "max": max(self.mem_samples),
                "min": min(self.mem_samples),
                "stdev": stdev(self.mem_samples) if len(self.mem_samples) > 1 else 0
            }
        }

# Usage
collector = MetricsCollector(
    "http://localhost:44772/metrics/watch",
    "your-token"
).start()

# Run your load test here
time.sleep(60)

# Stop collecting and get stats
stats = collector.stop()
print(f"CPU - Mean: {stats['cpu']['mean']:.1f}%, Max: {stats['cpu']['max']:.1f}%")
print(f"Memory - Mean: {stats['memory']['mean']:.1f}%, Max: {stats['memory']['max']:.1f}%")

Auto-scaling Trigger

Use metrics to trigger auto-scaling decisions:
import requests
import json
from collections import deque

class AutoScaler:
    def __init__(self, url, token, window_size=10):
        self.url = url
        self.headers = {"X-EXECD-ACCESS-TOKEN": token}
        self.cpu_window = deque(maxlen=window_size)
        self.mem_window = deque(maxlen=window_size)
    
    def monitor(self, scale_up_cpu=70, scale_down_cpu=30,
                scale_up_mem=80, scale_down_mem=40):
        response = requests.get(self.url, headers=self.headers, stream=True)
        
        for line in response.iter_lines():
            if line and line.decode('utf-8').startswith('data: '):
                metrics = json.loads(line.decode('utf-8')[6:])
                
                self.cpu_window.append(metrics['cpu_used_pct'])
                mem_pct = (metrics['mem_used_mib'] / metrics['mem_total_mib']) * 100
                self.mem_window.append(mem_pct)
                
                if len(self.cpu_window) == self.cpu_window.maxlen:
                    avg_cpu = sum(self.cpu_window) / len(self.cpu_window)
                    avg_mem = sum(self.mem_window) / len(self.mem_window)
                    
                    if avg_cpu > scale_up_cpu or avg_mem > scale_up_mem:
                        self.scale_up(avg_cpu, avg_mem)
                    elif avg_cpu < scale_down_cpu and avg_mem < scale_down_mem:
                        self.scale_down(avg_cpu, avg_mem)
    
    def scale_up(self, cpu, mem):
        print(f"🔼 Scaling UP - CPU: {cpu:.1f}%, Memory: {mem:.1f}%")
        # Implement scale-up logic
    
    def scale_down(self, cpu, mem):
        print(f"🔽 Scaling DOWN - CPU: {cpu:.1f}%, Memory: {mem:.1f}%")
        # Implement scale-down logic

scaler = AutoScaler(
    "http://localhost:44772/metrics/watch",
    "your-token",
    window_size=10  # Average over 10 seconds
)
scaler.monitor()

Capacity Planning

Collect metrics over time to understand usage patterns:
import requests
import json
import time
from datetime import datetime

class MetricsLogger:
    def __init__(self, url, token, log_file="metrics.jsonl"):
        self.url = url
        self.headers = {"X-EXECD-ACCESS-TOKEN": token}
        self.log_file = log_file
    
    def log_metrics(self, duration_seconds=3600):
        """Log metrics for a specified duration"""
        start_time = time.time()
        response = requests.get(self.url, headers=self.headers, stream=True)
        
        with open(self.log_file, 'a') as f:
            for line in response.iter_lines():
                if time.time() - start_time > duration_seconds:
                    break
                
                if line and line.decode('utf-8').startswith('data: '):
                    metrics = json.loads(line.decode('utf-8')[6:])
                    log_entry = {
                        "timestamp_iso": datetime.fromtimestamp(
                            metrics['timestamp'] / 1000
                        ).isoformat(),
                        **metrics
                    }
                    f.write(json.dumps(log_entry) + '\n')
                    f.flush()

logger = MetricsLogger(
    "http://localhost:44772/metrics/watch",
    "your-token",
    log_file="sandbox_metrics.jsonl"
)
logger.log_metrics(duration_seconds=3600)  # Log for 1 hour

Memory Units

The API returns memory in MiB (Mebibytes), not MB (Megabytes):
Unit    Bytes              Comparison
1 MiB   1,048,576 bytes    2²⁰ bytes
1 MB    1,000,000 bytes    10⁶ bytes
Conversion:
  • MiB to MB: mib * 1.048576
  • MB to MiB: mb / 1.048576
  • MiB to GiB: mib / 1024
  • GiB to MiB: gib * 1024
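These conversions are easy to get wrong by a factor of roughly 5%. A small sketch of helper functions (the names are illustrative, not part of any library):

```python
MIB_IN_MB = 1.048576  # 1,048,576 bytes / 1,000,000 bytes

def mib_to_mb(mib):
    return mib * MIB_IN_MB

def mb_to_mib(mb):
    return mb / MIB_IN_MB

def mib_to_gib(mib):
    return mib / 1024

def gib_to_mib(gib):
    return gib * 1024

# An 8192 MiB container limit is about 8590 MB, or exactly 8 GiB
print(mib_to_mb(8192))   # 8589.934592
print(mib_to_gib(8192))  # 8.0
```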

Best Practices

Polling vs Streaming

Use GET /metrics when:
  • You need a one-time snapshot
  • Building health checks or periodic monitoring
  • Implementing simple alerting systems
Use GET /metrics/watch when:
  • Building real-time dashboards
  • Continuous monitoring and alerting
  • Performance profiling during tests
  • Auto-scaling based on live metrics
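As a sketch of the polling approach, a periodic health check against GET /metrics might look like this. The evaluate and poll_health names and the thresholds are illustrative, not part of the API:

```python
import requests
import time

def evaluate(metrics, cpu_limit=90.0, mem_limit=95.0):
    """Decide health from one metrics snapshot."""
    mem_pct = metrics["mem_used_mib"] / metrics["mem_total_mib"] * 100
    return metrics["cpu_used_pct"] < cpu_limit and mem_pct < mem_limit

def poll_health(url, token, interval=30):
    """Poll the snapshot endpoint on an interval instead of holding an SSE stream open."""
    while True:
        r = requests.get(url, headers={"X-EXECD-ACCESS-TOKEN": token}, timeout=5)
        r.raise_for_status()
        print("OK" if evaluate(r.json()) else "DEGRADED")
        time.sleep(interval)

# Against a running sandbox:
#   poll_health("http://localhost:44772/metrics", "your-token")
```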

Connection Management

When using the watch endpoint:
  1. Handle reconnection: Network issues can close SSE connections
  2. Set timeouts: Prevent hanging connections
  3. Graceful shutdown: Close connections properly when done
  4. Error handling: Handle parsing errors and missing data
import requests
import json
import time

def watch_metrics_with_reconnect(url, token, max_retries=3):
    retries = 0
    
    while retries < max_retries:
        try:
            response = requests.get(
                url,
                headers={"X-EXECD-ACCESS-TOKEN": token},
                stream=True,
                timeout=30
            )
            response.raise_for_status()
            
            for line in response.iter_lines():
                if line:
                    decoded = line.decode('utf-8')
                    if decoded.startswith('data: '):
                        try:
                            metrics = json.loads(decoded[6:])
                            yield metrics
                            retries = 0  # Reset on success
                        except json.JSONDecodeError:
                            print("Error parsing metrics")
        
        except requests.exceptions.RequestException as e:
            retries += 1
            print(f"Connection error (retry {retries}/{max_retries}): {e}")
            if retries < max_retries:
                time.sleep(5)  # Wait before retry
            else:
                raise

# Usage
for metrics in watch_metrics_with_reconnect(
    "http://localhost:44772/metrics/watch",
    "your-token"
):
    print(f"CPU: {metrics['cpu_used_pct']:.1f}%")
