Installation
After installing the package, two CLI tools are available: gpumemprof for PyTorch and tfmemprof for TensorFlow.
Canonical workflow
Use this sequence for reproducible diagnostics:
Inspect environment
Check your system configuration and GPU availability:
For detailed GPU information:
PyTorch CLI (gpumemprof)
The gpumemprof command provides PyTorch-specific memory profiling.
Info command
Display system and GPU information:
Monitor command
Monitor memory usage for a specified duration. Results can be written as:
- CSV output
- JSON output
The output includes:
- Timestamp
- Allocated memory
- Reserved memory
- Device ID
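The fields listed above lend themselves to simple post-processing. The following is a minimal sketch, not the tool's own code, of summarizing monitor-style CSV rows per device. The column names (`timestamp`, `allocated_mb`, `reserved_mb`, `device_id`) and the sample values are assumptions for illustration; check the header of a real output file before relying on them.

```python
import csv
import io

# Hypothetical column names and sample values; the real CSV header may differ.
SAMPLE_CSV = """timestamp,allocated_mb,reserved_mb,device_id
0.0,512.0,1024.0,0
1.0,768.0,1024.0,0
2.0,640.0,1536.0,0
"""


def summarize(csv_text):
    """Return peak allocated and reserved memory per device."""
    peaks = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        device = int(row["device_id"])
        current = peaks.setdefault(
            device, {"allocated_mb": 0.0, "reserved_mb": 0.0}
        )
        # Track the maximum seen so far for each metric.
        current["allocated_mb"] = max(
            current["allocated_mb"], float(row["allocated_mb"])
        )
        current["reserved_mb"] = max(
            current["reserved_mb"], float(row["reserved_mb"])
        )
    return peaks


summary = summarize(SAMPLE_CSV)
print(summary)
```

A persistent gap between reserved and allocated memory in such a summary usually points at allocator caching or fragmentation rather than live tensors.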
Track command
Real-time memory tracking with alerts and thresholds:
Basic tracking
Track memory without a time limit:
Press Ctrl+C to stop and save results.
With memory watchdog
Enable automatic cleanup on high memory:
The watchdog triggers cleanup when thresholds are exceeded.
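To make the threshold behaviour concrete, here is a minimal sketch of a watchdog that fires a cleanup callback once usage crosses a configured fraction. The `MemoryWatchdog` class, its field names, and the 0.9 threshold are hypothetical and not taken from gpumemprof; in a PyTorch process the callback might plausibly call `gc.collect()` followed by `torch.cuda.empty_cache()`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryWatchdog:
    """Hypothetical watchdog: runs `cleanup` when usage crosses a threshold."""

    threshold_fraction: float  # e.g. 0.9 means 90% of total memory
    cleanup: Callable[[], None]
    triggered_at: List[float] = field(default_factory=list)

    def check(self, used_bytes: int, total_bytes: int) -> bool:
        """Return True if the threshold was exceeded and cleanup ran."""
        usage = used_bytes / total_bytes
        if usage >= self.threshold_fraction:
            self.triggered_at.append(usage)
            self.cleanup()
            return True
        return False


# Stand-in cleanup that only records that it was called.
cleanup_calls = []
watchdog = MemoryWatchdog(0.9, lambda: cleanup_calls.append("cleanup"))

below = watchdog.check(used_bytes=50, total_bytes=100)  # under threshold
above = watchdog.check(used_bytes=95, total_bytes=100)  # over threshold
print(below, above, cleanup_calls)
```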
OOM flight recorder
Capture detailed state when an out-of-memory error occurs:
Configuration:
- --oom-dump-dir: Directory for OOM dump bundles
- --oom-buffer-size: Ring buffer size (default: max events)
- --oom-max-dumps: Maximum dumps to retain (default: 5)
- --oom-max-total-mb: Max storage in MB (default: 256)
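The options above combine a bounded event buffer with a retention policy for dumps. The sketch below illustrates that idea under stated assumptions: the `FlightRecorder` class and its methods are hypothetical, and the pruning order (oldest first) is a guess at reasonable behaviour rather than a description of the tool's implementation. Only the defaults of 5 dumps and 256 MB come from the option list.

```python
from collections import deque


class FlightRecorder:
    """Hypothetical ring buffer plus dump-retention policy."""

    def __init__(self, buffer_size, max_dumps=5, max_total_mb=256):
        # deque with maxlen silently drops the oldest event when full.
        self.events = deque(maxlen=buffer_size)
        self.max_dumps = max_dumps
        self.max_total_bytes = max_total_mb * 1024 * 1024
        self.dumps = []  # (name, size_bytes), oldest first

    def record(self, event):
        self.events.append(event)

    def _total_bytes(self):
        return sum(size for _, size in self.dumps)

    def dump(self, name, size_bytes):
        """Register a dump, then prune oldest dumps past either limit."""
        self.dumps.append((name, size_bytes))
        while len(self.dumps) > self.max_dumps or (
            len(self.dumps) > 1 and self._total_bytes() > self.max_total_bytes
        ):
            self.dumps.pop(0)
        return [dump_name for dump_name, _ in self.dumps]


recorder = FlightRecorder(buffer_size=3, max_dumps=2, max_total_mb=1)
for event_id in range(5):
    recorder.record(event_id)
print(list(recorder.events))  # only the three most recent events remain

recorder.dump("a", 100)
recorder.dump("b", 100)
kept_by_count = recorder.dump("c", 100)          # count limit evicts "a"
kept_by_size = recorder.dump("d", 1024 * 1024)   # size limit evicts "b", "c"
print(kept_by_count, kept_by_size)
```

A small buffer keeps dump bundles compact, while a larger one preserves more history leading up to the failure.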
Analyze command
Analyze profiling results with optional visualization:
Diagnose command
Produce a portable diagnostic bundle for debugging:
- manifest.json: Metadata about the diagnostic run
- system_info.json: Complete system configuration
- diagnostic_summary.json: Analysis summary
- telemetry_timeline.json: Memory timeline (if duration > 0)
- requirements.txt: Python package versions
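Before sharing a bundle, it can be useful to confirm that the files listed above are present. This is a minimal sketch that assumes the bundle is a flat directory containing those file names; the `check_bundle` helper is hypothetical, and the demo uses a temporary directory in place of a real bundle.

```python
import json
import tempfile
from pathlib import Path

# File names from the bundle listing; telemetry_timeline.json is only
# expected when the run used a duration greater than zero.
REQUIRED = [
    "manifest.json",
    "system_info.json",
    "diagnostic_summary.json",
    "requirements.txt",
]
OPTIONAL = ["telemetry_timeline.json"]


def check_bundle(bundle_dir):
    """Return (missing_required, present_optional) for a bundle directory."""
    root = Path(bundle_dir)
    missing = [name for name in REQUIRED if not (root / name).is_file()]
    optional = [name for name in OPTIONAL if (root / name).is_file()]
    return missing, optional


# Demo: a throwaway directory with only two of the required files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["manifest.json", "system_info.json"]:
        (Path(tmp) / name).write_text(json.dumps({}))
    demo_result = check_bundle(tmp)

print(demo_result)
```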
TensorFlow CLI (tfmemprof)
The tfmemprof command provides TensorFlow-specific memory profiling.
Info command
Display TensorFlow configuration:
Monitor command
Monitor TensorFlow GPU memory:
Track command
Background memory tracking:
Analyze command
Analyze TensorFlow profiling results:
- Basic analysis
- Leak detection
- Optimization
- Full report
Diagnose command
Produce a TensorFlow diagnostic bundle:
Common workflows
Debug OOM errors
Diagnose out-of-memory failures:
Compare PyTorch vs TensorFlow
Profile both frameworks:
Continuous monitoring
Set up automated profiling:
Next steps
PyTorch guide
Learn PyTorch-specific profiling APIs
TensorFlow guide
Learn TensorFlow-specific profiling APIs
TUI dashboard
Use the interactive terminal interface
Visualization
Generate plots and dashboards