The `gpumemprof` command provides PyTorch GPU memory profiling and analysis tools.
## Installation

Install the package to access the CLI.

## Global usage
## Commands

### info

Display GPU and system information.

- `--device DEVICE` - GPU device ID (default: current device)
- `--detailed` - Show detailed information, including a memory summary
### monitor

Monitor memory usage for a specified duration.

- `--device DEVICE` - GPU device ID (default: current device)
- `--duration DURATION` - Monitoring duration in seconds (default: 10)
- `--interval INTERVAL` - Sampling interval in seconds (default: 0.1)
- `--output OUTPUT` - Output file for monitoring data
- `--format {csv,json}` - Output format (default: csv)
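With `--format csv`, a monitoring session can be post-processed with a few lines of Python. The sketch below summarizes a hypothetical sample of monitor output; the column names (`timestamp`, `allocated_mb`) are assumptions for illustration, not the tool's documented schema.

```python
import csv
import io

# Hypothetical sample of `gpumemprof monitor --format csv` output;
# the column names are assumptions, not a documented schema.
SAMPLE = """timestamp,allocated_mb
0.0,512.0
0.1,768.5
0.2,640.2
"""

def peak_allocated(csv_text: str) -> float:
    """Return the peak allocated memory (MB) seen in a monitoring CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return max(float(row["allocated_mb"]) for row in reader)

print(peak_allocated(SAMPLE))  # 768.5
```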
### track

Real-time memory tracking with alerts and automatic cleanup options.

- `--device DEVICE` - GPU device ID (default: current device)
- `--duration DURATION` - Tracking duration in seconds (default: indefinite)
- `--interval INTERVAL` - Sampling interval in seconds (default: 0.1)
- `--output OUTPUT` - Output file for tracking events
- `--format {csv,json}` - Output format (default: csv)
- `--watchdog` - Enable automatic memory cleanup
- `--warning-threshold WARNING` - Memory warning threshold percentage (default: 80)
- `--critical-threshold CRITICAL` - Memory critical threshold percentage (default: 95)
- `--oom-flight-recorder` - Enable automatic OOM flight-recorder dump artifacts
- `--oom-dump-dir DIR` - Directory for OOM dump bundles (default: oom_dumps)
- `--oom-buffer-size SIZE` - Ring-buffer size for OOM event dumps (default: max tracker events)
- `--oom-max-dumps N` - Maximum number of retained OOM dump bundles (default: 5)
- `--oom-max-total-mb MB` - Maximum retained OOM dump storage in MB (default: 256)
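The threshold and ring-buffer semantics above can be sketched in plain Python. This is not the tool's implementation, only an illustration of the behavior the flags describe: usage above `--warning-threshold` raises a warning, above `--critical-threshold` a critical alert, and a bounded buffer like `--oom-buffer-size` silently drops the oldest event once full.

```python
from collections import deque

def classify_usage(used_pct: float, warning: float = 80.0, critical: float = 95.0) -> str:
    """Map a memory-usage percentage onto the tracker's alert levels."""
    if used_pct >= critical:
        return "critical"
    if used_pct >= warning:
        return "warning"
    return "ok"

# A deque with maxlen behaves like a flight-recorder ring buffer:
# once full, appending a new event drops the oldest one.
events = deque(maxlen=3)  # stands in for --oom-buffer-size 3
for step in range(5):
    events.append({"step": step, "level": classify_usage(70 + step * 10)})

print([e["step"] for e in events])  # [2, 3, 4]
print(classify_usage(96.0))         # critical
```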
### analyze

Analyze profiling results from previous monitoring or tracking sessions.

- `input_file` - Input file with profiling results (required)
- `--output OUTPUT` - Output file for the analysis report
- `--format {json,txt}` - Output format (default: json)
- `--visualization` - Generate visualization plots
- `--plot-dir DIR` - Directory for visualization plots (default: plots)
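As a mental model of what an analysis step does, the sketch below condenses raw samples into a small JSON-serializable report. The sample field names and report keys are assumptions for illustration, not the tool's actual report format.

```python
import json

# Hypothetical tracker samples; field names are assumptions for illustration.
samples = [
    {"t": 0.0, "allocated_mb": 512.0},
    {"t": 0.1, "allocated_mb": 900.0},
    {"t": 0.2, "allocated_mb": 700.0},
]

def analyze(samples):
    """Condense raw samples into a small, JSON-serializable summary."""
    values = [s["allocated_mb"] for s in samples]
    return {
        "samples": len(values),
        "peak_mb": max(values),
        "mean_mb": round(sum(values) / len(values), 1),
    }

report = analyze(samples)
print(json.dumps(report))  # {"samples": 3, "peak_mb": 900.0, "mean_mb": 704.0}
```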
### diagnose

Produce a portable diagnostic bundle for debugging memory failures.

- `--output OUTPUT` - Output directory for the artifact bundle (default: current working directory)
- `--device DEVICE` - GPU device ID (default: current device)
- `--duration DURATION` - Seconds to run the tracker for telemetry (default: 5; use 0 to skip)
- `--interval INTERVAL` - Sampling interval for the timeline (default: 0.5)
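A "portable bundle" here means a self-contained directory of artifacts that can be attached to a bug report. The sketch below shows one plausible layout (a metadata file plus a telemetry file); the file names and contents are assumptions, not the bundle format `diagnose` actually emits.

```python
import json
import tempfile
from pathlib import Path

def write_bundle(output_dir: Path, telemetry: list) -> Path:
    """Write a minimal self-contained bundle: metadata plus telemetry.

    The layout (metadata.json + telemetry.json) is a hypothetical example."""
    bundle = output_dir / "diagnostic_bundle"
    bundle.mkdir(parents=True, exist_ok=True)
    (bundle / "metadata.json").write_text(
        json.dumps({"tool": "gpumemprof", "samples": len(telemetry)})
    )
    (bundle / "telemetry.json").write_text(json.dumps(telemetry))
    return bundle

with tempfile.TemporaryDirectory() as tmp:
    path = write_bundle(Path(tmp), [{"t": 0.0, "allocated_mb": 0.0}])
    print(sorted(p.name for p in path.iterdir()))  # ['metadata.json', 'telemetry.json']
```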
## Exit codes

- `0` - Success, no memory risk detected
- `1` - Runtime or argument failure
- `2` - Success with memory risk detected
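Because exit code 2 signals success *with* risk, scripts and CI jobs should branch on the return code rather than treating any nonzero status as failure. The sketch below simulates the exit status with a short Python subprocess; in a real pipeline you would run the `gpumemprof` command there instead.

```python
import subprocess
import sys

MEANINGS = {
    0: "no memory risk",
    1: "runtime or argument failure",
    2: "memory risk detected",
}

# Stand-in for the real CLI: a subprocess that exits with status 2.
result = subprocess.run([sys.executable, "-c", "raise SystemExit(2)"])
print(MEANINGS.get(result.returncode, "unknown"))  # memory risk detected
```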
## Backend support

The `gpumemprof` CLI automatically detects the available backend:
- CUDA - NVIDIA GPUs with CUDA support
- ROCm - AMD GPUs with ROCm support
- MPS - Apple Silicon with Metal Performance Shaders
- CPU - Fallback for systems without GPU support
On backends with only a single logical device, the `--device` flag is ignored.
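For intuition, backend detection in PyTorch typically probes in the order listed above. This sketch mirrors that logic but is not the tool's own implementation; it falls back to `cpu` when `torch` is not installed. (On ROCm builds of PyTorch, `torch.cuda` is the HIP interface and `torch.version.hip` is set, which is how CUDA and ROCm are told apart.)

```python
def detect_backend() -> str:
    """Best-effort backend probe; falls back to 'cpu' when torch is absent."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        # torch.version.hip is a version string on ROCm builds, None on CUDA builds.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    # torch.backends.mps only exists on newer PyTorch releases, hence the guard.
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return "mps"
    return "cpu"

print(detect_backend())
```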