The tfmemprof command provides TensorFlow GPU memory profiling and analysis tools.
Installation
Install the package with TensorFlow support:
Global usage
- -v, --verbose - Enable verbose logging
Commands
info
Display system and GPU information for TensorFlow.
monitor
Monitor GPU memory usage in real time.
- --interval INTERVAL - Sampling interval in seconds (default: 1.0)
- --duration DURATION - Monitoring duration in seconds (default: indefinite)
- --threshold THRESHOLD - Memory alert threshold in MB
- --device DEVICE - TensorFlow device to monitor (default: /GPU:0)
- --output OUTPUT - Output file for results
- -v, --verbose - Enable verbose logging
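To make the interplay of the interval, duration, and threshold options concrete, here is a minimal Python sketch of a polling loop with the same three knobs. It is not tfmemprof's implementation: `monitor`, `read_memory_mb`, and the sample dictionary keys are hypothetical names chosen for illustration, and the memory reader is injected so the loop can run without a GPU.

```python
import time
from typing import Callable, List, Optional


def monitor(
    read_memory_mb: Callable[[], float],
    interval: float = 1.0,
    duration: Optional[float] = None,
    threshold: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[dict]:
    """Poll read_memory_mb every `interval` seconds (hypothetical helper).

    Stops once `duration` seconds have elapsed; duration=None mirrors the
    "indefinite" default and loops until interrupted.
    """
    samples = []
    start = clock()
    while duration is None or clock() - start < duration:
        used = read_memory_mb()
        samples.append({
            "elapsed_s": clock() - start,
            "used_mb": used,
            # Flag the sample only when a threshold (in MB) was supplied.
            "alert": threshold is not None and used > threshold,
        })
        sleep(interval)
    return samples
```

Passing the clock and sleep functions as parameters is a design choice for this sketch only: it lets the loop be exercised with fake time instead of waiting in real time.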
track
Start background memory tracking with alert callbacks.
- --output OUTPUT - Output file for tracking results (required)
- --interval INTERVAL - Sampling interval in seconds (default: 1.0)
- --threshold THRESHOLD - Memory alert threshold in MB (default: 4000)
- --device DEVICE - TensorFlow device to monitor (default: /GPU:0)
- -v, --verbose - Enable verbose logging
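When track is launched from a Python script, the options above can be assembled into an argument list. The sketch below uses only the flags and defaults listed for track; `build_track_command` is a hypothetical helper, not part of tfmemprof, and the output file name in the usage note is a placeholder.

```python
from typing import List


def build_track_command(
    output: str,
    interval: float = 1.0,
    threshold: float = 4000,
    device: str = "/GPU:0",
    verbose: bool = False,
) -> List[str]:
    """Assemble an argument list for `tfmemprof track` (hypothetical helper)."""
    if not output:
        # --output is the one option documented as required for track.
        raise ValueError("--output is required for track")
    cmd = [
        "tfmemprof", "track",
        "--output", output,
        "--interval", str(interval),
        "--threshold", str(threshold),
        "--device", device,
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The resulting list can be handed to `subprocess.run` or `subprocess.Popen`, for example `subprocess.run(build_track_command("track_results.json"))`.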
analyze
Analyze profiling results from previous sessions.
- --input INPUT - Input file with profiling results (required)
- --detect-leaks - Detect memory leaks
- --optimize - Generate optimization recommendations
- --visualize - Generate visualization plots
- --report REPORT - Generate comprehensive report file
- -v, --verbose - Enable verbose logging
diagnose
Produce a portable diagnostic bundle for debugging memory failures.
- --output OUTPUT - Output directory for the artifact bundle (default: current working directory)
- --device DEVICE - TensorFlow device to monitor (default: /GPU:0)
- --duration DURATION - Seconds to run tracker for telemetry (default: 5, use 0 to skip)
- --interval INTERVAL - Sampling interval for timeline (default: 0.5)
- -v, --verbose - Enable verbose logging
Exit codes:
- 0 - Success, no memory risk detected
- 1 - Runtime or argument failure
- 2 - Success with memory risk detected
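Because exit code 2 signals a successful run that found a memory risk, a calling script should not treat every non-zero status as a failure. The following sketch shows one way to interpret the three codes listed above from Python. `interpret_exit_code` and `run_diagnose` are hypothetical wrappers, and the sketch assumes tfmemprof is available on the PATH.

```python
import subprocess
from typing import List, Optional

# Exit codes as listed above for the diagnose command.
EXIT_MEANINGS = {
    0: "success, no memory risk detected",
    1: "runtime or argument failure",
    2: "success with memory risk detected",
}


def interpret_exit_code(code: int) -> str:
    """Map an exit status to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_diagnose(extra_args: Optional[List[str]] = None) -> str:
    """Run `tfmemprof diagnose` and describe the outcome (hypothetical wrapper)."""
    cmd = ["tfmemprof", "diagnose"] + list(extra_args or [])
    # check=True is deliberately omitted: it would raise on exit code 2,
    # which is a successful run that found a memory risk.
    result = subprocess.run(cmd)
    return interpret_exit_code(result.returncode)
```

A CI job could, for instance, fail the build on 1, emit a warning on 2, and pass on 0.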
TensorFlow device notation
TensorFlow uses a specific device notation:
- /GPU:0 - First GPU device (default)
- /GPU:1 - Second GPU device
- /CPU:0 - CPU device
The --device flag accepts this notation.
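A script that passes device strings to the CLI may want to validate them first. The sketch below splits the short form shown above into a device type and an index. `parse_device` is a hypothetical helper, and it deliberately covers only the /GPU:N and /CPU:N forms listed here. TensorFlow itself also understands longer, fully qualified device names, but whether the --device flag accepts those is not stated in this reference.

```python
import re
from typing import Tuple

# Matches only the short forms listed above, e.g. "/GPU:0" or "/CPU:0".
_DEVICE_RE = re.compile(r"/(GPU|CPU):(\d+)")


def parse_device(device: str = "/GPU:0") -> Tuple[str, int]:
    """Split a short device string into (type, index) (hypothetical helper)."""
    match = _DEVICE_RE.fullmatch(device)
    if match is None:
        raise ValueError(f"not a short device string: {device!r}")
    return match.group(1), int(match.group(2))
```

For example, `parse_device("/GPU:1")` yields the type `"GPU"` and index `1`, while a string without the leading slash raises `ValueError`.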
Backend support
The tfmemprof CLI supports multiple TensorFlow backends:
- CUDA - NVIDIA GPUs with CUDA support
- ROCm - AMD GPUs with ROCm support
- Metal - Apple Silicon with tensorflow-metal
- CPU - Fallback for systems without GPU support
On Apple Silicon, install tensorflow-metal to enable GPU acceleration: