The telemetry module provides a standardized schema for memory profiling events and utilities for converting legacy formats.

Constants

SCHEMA_VERSION_V2
Literal[2]
default:"2"
Current telemetry schema version
UNKNOWN_PID
int
default:"-1"
Sentinel value for unknown process ID
UNKNOWN_HOST
str
default:"'unknown'"
Sentinel value for unknown hostname
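The sentinels are useful when process identity cannot be determined at collection time. A minimal sketch of the fallback pattern; the constant values below mirror the documented defaults (in real code, import them from gpumemprof.telemetry), and resolve_identity is a hypothetical helper, not part of the library:

```python
# Values mirror the documented defaults; import from gpumemprof.telemetry
# in real code rather than redefining them.
UNKNOWN_PID = -1
UNKNOWN_HOST = "unknown"

def resolve_identity(pid=None, host=None):
    """Fall back to the sentinel values when identity is unavailable."""
    return (pid if pid is not None else UNKNOWN_PID,
            host if host else UNKNOWN_HOST)

print(resolve_identity())                        # (-1, 'unknown')
print(resolve_identity(12345, "gpu-server-01"))  # (12345, 'gpu-server-01')
```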

Classes

TelemetryEventV2

Canonical telemetry event payload used by tracker exports.
from gpumemprof.telemetry import TelemetryEventV2

event = TelemetryEventV2(
    schema_version=2,
    timestamp_ns=1709481600000000000,
    event_type="allocation",
    collector="gpumemprof.cuda_tracker",
    sampling_interval_ms=100,
    pid=12345,
    host="gpu-server-01",
    device_id=0,
    allocator_allocated_bytes=5368709120,
    allocator_reserved_bytes=6442450944,
    allocator_active_bytes=5100273664,
    allocator_inactive_bytes=268435456,
    allocator_change_bytes=134217728,
    device_used_bytes=5500000000,
    device_free_bytes=11000000000,
    device_total_bytes=16500000000,
    context="training_step",
    metadata={"batch_size": 32, "epoch": 5}
)

Attributes

schema_version
Literal[2]
Schema version (always 2)
timestamp_ns
int
Unix timestamp in nanoseconds
event_type
str
Event type (e.g., “allocation”, “deallocation”, “peak”, “warning”)
collector
str
Collector identifier (e.g., “gpumemprof.cuda_tracker”)
sampling_interval_ms
int
Sampling interval in milliseconds
pid
int
Process ID
host
str
Hostname
device_id
int
Device ID (-1 for CPU)
allocator_allocated_bytes
int
Bytes allocated by the memory allocator
allocator_reserved_bytes
int
Bytes reserved by the memory allocator
allocator_active_bytes
Optional[int]
Active bytes in the allocator
allocator_inactive_bytes
Optional[int]
Inactive bytes in the allocator
allocator_change_bytes
int
Memory change in bytes since last event
device_used_bytes
int
Total device memory used
device_free_bytes
Optional[int]
Free device memory
device_total_bytes
Optional[int]
Total device memory
context
Optional[str]
Contextual information
metadata
Dict[str, Any]
default:"{}"
Additional metadata
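The allocator fields can be combined into derived metrics. A sketch using the byte values from the class example above; interpreting reserved-minus-allocated as allocator cache overhead follows PyTorch's caching-allocator terminology and is an assumption here:

```python
# Byte values taken from the TelemetryEventV2 example above.
allocator_allocated_bytes = 5368709120  # 5 GiB handed out to tensors
allocator_reserved_bytes = 6442450944   # 6 GiB held by the allocator

# Memory reserved by the allocator but not currently handed out.
cached_bytes = allocator_reserved_bytes - allocator_allocated_bytes
print(f"cached: {cached_bytes / 1024**3:.2f} GiB")  # cached: 1.00 GiB
```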

Functions

telemetry_event_to_dict()

Serialize a telemetry event to a plain dictionary.
from gpumemprof.telemetry import telemetry_event_to_dict

event_dict = telemetry_event_to_dict(event)
# Returns dict with all event fields
event
TelemetryEventV2
Event to serialize
return
Dict[str, Any]
Dictionary representation of the event

validate_telemetry_record()

Validate a v2 telemetry record.
from gpumemprof.telemetry import validate_telemetry_record

record = {
    "schema_version": 2,
    "timestamp_ns": 1709481600000000000,
    "event_type": "allocation",
    # ... all required fields
}

try:
    validate_telemetry_record(record)
    print("Record is valid")
except ValueError as e:
    print(f"Validation error: {e}")
record
Dict[str, Any]
Record to validate
Raises ValueError if the record is invalid or has missing fields.

telemetry_event_from_record()

Create a v2 telemetry event from v2 or legacy tracker records.
from gpumemprof.telemetry import telemetry_event_from_record

# From v2 record
v2_record = {
    "schema_version": 2,
    "timestamp_ns": 1709481600000000000,
    "event_type": "allocation",
    # ... other fields
}
event = telemetry_event_from_record(v2_record)

# From legacy record (auto-converted)
legacy_record = {
    "timestamp": 1709481600.0,
    "memory_allocated": 5368709120,
    "memory_reserved": 6442450944,
    "device": "cuda:0"
}
event = telemetry_event_from_record(
    legacy_record,
    permissive_legacy=True,
    default_collector="gpumemprof.cuda_tracker"
)
record
Dict[str, Any]
Record to convert (v2 or legacy format)
permissive_legacy
bool
default:"True"
Whether to allow legacy format conversion
default_collector
str
default:"'legacy.unknown'"
Default collector name for legacy records
default_sampling_interval_ms
int
default:"0"
Default sampling interval for legacy records
return
TelemetryEventV2
Normalized telemetry event

load_telemetry_events()

Load telemetry events from JSON and normalize to v2 payloads.
from gpumemprof.telemetry import load_telemetry_events

# Load from file
events = load_telemetry_events("memory_events.json")

# With custom events key
events = load_telemetry_events(
    "export.json",
    events_key="profiling_data"
)

# Strict mode (no legacy conversion)
events = load_telemetry_events(
    "v2_events.json",
    permissive_legacy=False
)

print(f"Loaded {len(events)} events")
for event in events[:5]:
    print(f"  {event.event_type}: {event.allocator_allocated_bytes} bytes")
path
Union[str, Path]
Path to JSON file containing telemetry events
permissive_legacy
bool
default:"True"
Whether to allow legacy format conversion
events_key
Optional[str]
default:"None"
JSON key containing the events array (auto-detects if None)
return
List[TelemetryEventV2]
List of normalized telemetry events
Supported JSON formats:
  • Array of events: [{event1}, {event2}, ...]
  • Object with events key: {"events": [{event1}, {event2}]}
  • Single event object: {event}
  • Custom key: {"custom_key": [{event1}, {event2}]}
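A small stdlib-only sketch of the first two shapes; it illustrates the file layouts only and does not call the library — the event bodies are truncated placeholders:

```python
import json

# The same two events serialized as a bare array and as an object keyed
# by "events"; load_telemetry_events accepts both shapes.
events = [{"event_type": "allocation"}, {"event_type": "peak"}]

as_array = json.dumps(events)               # [{...}, {...}]
as_object = json.dumps({"events": events})  # {"events": [{...}, {...}]}

# Once the wrapper key is unwrapped, both shapes yield the same event list.
loaded = json.loads(as_object)["events"]
assert loaded == json.loads(as_array)
```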

Required Fields

All v2 telemetry records must include these fields:
  • schema_version: Must be 2
  • timestamp_ns: Unix timestamp in nanoseconds (>= 0)
  • event_type: Non-empty string
  • collector: Non-empty string
  • sampling_interval_ms: Integer >= 0
  • pid: Process ID (>= -1)
  • host: Non-empty hostname
  • device_id: Device identifier
  • allocator_allocated_bytes: Allocated bytes (>= 0)
  • allocator_reserved_bytes: Reserved bytes (>= 0)
  • allocator_active_bytes: Active bytes (>= 0 or null)
  • allocator_inactive_bytes: Inactive bytes (>= 0 or null)
  • allocator_change_bytes: Memory change
  • device_used_bytes: Device memory used (>= 0)
  • device_free_bytes: Device memory free (>= 0 or null)
  • device_total_bytes: Total device memory (>= 0 or null)
  • context: Context string (or null)
  • metadata: Metadata dictionary
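For reference, a minimal record carrying every required field listed above; all values are illustrative placeholders, with None used where the schema allows null:

```python
# Minimal v2 record: one entry per required field, placeholder values.
minimal_record = {
    "schema_version": 2,
    "timestamp_ns": 1709481600000000000,
    "event_type": "allocation",
    "collector": "gpumemprof.cuda_tracker",
    "sampling_interval_ms": 0,
    "pid": -1,             # UNKNOWN_PID sentinel
    "host": "unknown",     # UNKNOWN_HOST sentinel
    "device_id": 0,
    "allocator_allocated_bytes": 0,
    "allocator_reserved_bytes": 0,
    "allocator_active_bytes": None,    # nullable
    "allocator_inactive_bytes": None,  # nullable
    "allocator_change_bytes": 0,
    "device_used_bytes": 0,
    "device_free_bytes": None,         # nullable
    "device_total_bytes": None,        # nullable
    "context": None,                   # nullable
    "metadata": {},
}
assert len(minimal_record) == 18  # all 18 required fields present
```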

Legacy Format Conversion

The module automatically converts legacy formats from:
  • PyTorch GPU profiler
  • TensorFlow memory profiler
  • Custom tracking events
Legacy field mappings:
  • timestamp (seconds) → timestamp_ns (nanoseconds)
  • memory_allocated → allocator_allocated_bytes
  • memory_reserved → allocator_reserved_bytes
  • memory_change → allocator_change_bytes
  • device string → device_id integer
  • backend → collector inference
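The two non-trivial mappings can be reproduced by hand, as sketched below; the library performs this conversion internally via telemetry_event_from_record, and parsing "cuda:N" device strings this way is an assumption about the convention:

```python
# Legacy record shape from the examples above.
legacy = {
    "timestamp": 1709481600.0,       # seconds since epoch
    "memory_allocated": 5368709120,
    "memory_reserved": 6442450944,
    "device": "cuda:0",
}

# timestamp (seconds) -> timestamp_ns (nanoseconds)
timestamp_ns = int(legacy["timestamp"] * 1_000_000_000)

# device string -> device_id integer ("cuda:0" -> 0; no index -> -1)
device = legacy["device"]
device_id = int(device.split(":")[1]) if ":" in device else -1

print(timestamp_ns)  # 1709481600000000000
print(device_id)     # 0
```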

Example Usage

import json
import os
import socket
import time
from gpumemprof.telemetry import (
    TelemetryEventV2,
    telemetry_event_to_dict,
    telemetry_event_from_record,
    load_telemetry_events,
    validate_telemetry_record
)

# Create event from scratch
event = TelemetryEventV2(
    schema_version=2,
    timestamp_ns=int(time.time() * 1e9),
    event_type="allocation",
    collector="gpumemprof.cuda_tracker",
    sampling_interval_ms=100,
    pid=os.getpid(),
    host=socket.gethostname(),
    device_id=0,
    allocator_allocated_bytes=5 * 1024**3,
    allocator_reserved_bytes=6 * 1024**3,
    allocator_active_bytes=int(4.8 * 1024**3),
    allocator_inactive_bytes=200 * 1024**2,
    allocator_change_bytes=512 * 1024**2,
    device_used_bytes=int(5.5 * 1024**3),
    device_free_bytes=int(10.5 * 1024**3),
    device_total_bytes=16 * 1024**3,
    context="training_batch_47",
    metadata={"batch_size": 32, "learning_rate": 0.001}
)

# Convert to dict
event_dict = telemetry_event_to_dict(event)

# Validate
validate_telemetry_record(event_dict)

# Save to file
with open("events.json", "w") as f:
    json.dump([event_dict], f, indent=2)

# Load and convert
events = load_telemetry_events("events.json")
for event in events:
    gb_allocated = event.allocator_allocated_bytes / 1024**3
    print(f"{event.event_type}: {gb_allocated:.2f} GB allocated")

# Convert legacy format
legacy_data = {
    "timestamp": 1709481600.0,
    "memory_allocated": 5368709120,
    "memory_reserved": 6442450944,
    "device": "cuda:0",
    "backend": "cuda"
}

event = telemetry_event_from_record(
    legacy_data,
    default_collector="gpumemprof.cuda_tracker",
    default_sampling_interval_ms=100
)

print(f"Converted event: {event.timestamp_ns} ns")
print(f"Allocated: {event.allocator_allocated_bytes / 1024**3:.2f} GB")

Integration with Tracker

The telemetry schema is used by MemoryTracker.export_events():
from gpumemprof import MemoryTracker

tracker = MemoryTracker(device="cuda:0")
tracker.start_tracking()

# ... run code ...

tracker.stop_tracking()

# Export uses TelemetryEventV2 format
tracker.export_events("memory_events.json", format="json")

# Load and analyze
from gpumemprof.telemetry import load_telemetry_events

events = load_telemetry_events("memory_events.json")
for event in events:
    if event.event_type == "warning":
        print(f"Warning at {event.timestamp_ns}: {event.context}")
