This matrix reflects the current behavior of the repository and is versioned for the v0.2 documentation refresh.
Matrix version: v0.2
Last verified: 2026-02-16

Source of truth

  • pyproject.toml (requires-python >=3.10, framework dependency floors)
  • CLI entry points in pyproject.toml
  • Runtime backend detection in gpumemprof/device_collectors.py and tfmemprof/utils.py
  • CI smoke checks in .github/workflows/ci.yml

Runtime and version support

| Surface | Supported | Notes |
|---|---|---|
| Python | 3.10, 3.11, 3.12 | Python 3.8/3.9 are no longer supported for v0.2 |
| PyTorch package floor | torch>=1.8.0 | Runtime backend can be CUDA, ROCm, MPS, or CPU |
| TensorFlow package floor | tensorflow>=2.4.0 | Runtime backend can be CUDA, ROCm, Metal, or CPU |
| OS | Linux, macOS, Windows | Backend availability depends on installed framework/runtime |
The package floor is torch>=1.8.0, but newer releases are recommended for full feature support; in particular, the MPS backend and torch.mps.recommended_max_memory() only exist in later versions.
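The backend names used throughout this page can be determined at runtime. The package's own detection lives in gpumemprof/device_collectors.py (detect_torch_runtime_backend()); the sketch below is an illustrative stand-in using only public PyTorch APIs, and the helper name torch_backend_summary is not part of the package:

```python
def torch_backend_summary():
    """Best-effort guess at the active PyTorch runtime backend."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # ROCm builds expose HIP through the torch.cuda namespace;
        # torch.version.hip is set on those builds and None on CUDA builds.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on torch < 1.12
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(torch_backend_summary())
```

On a CPU-only host this prints `cpu`, matching the fallback row in the tables below.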

CLI feature availability by environment

CUDA (NVIDIA)

All CLI commands fully supported:
  • gpumemprof info/monitor/track/analyze/diagnose
  • tfmemprof info/monitor/track/analyze/diagnose

ROCm (AMD Linux)

All CLI commands fully supported:
  • gpumemprof info/monitor/track/analyze/diagnose
  • tfmemprof info/monitor/track/analyze/diagnose

Apple Metal / MPS

All CLI commands fully supported:
  • gpumemprof info/monitor/track/analyze/diagnose
  • tfmemprof info/monitor/track/analyze/diagnose

CPU-only host

All CLI commands with CPU fallback:
  • gpumemprof info/monitor/track/analyze/diagnose
  • tfmemprof info/monitor/track/analyze/diagnose

Complete CLI availability table

| Environment | gpumemprof info | gpumemprof monitor | gpumemprof track | gpumemprof analyze | gpumemprof diagnose | tfmemprof info | tfmemprof monitor | tfmemprof track | tfmemprof analyze | tfmemprof diagnose |
|---|---|---|---|---|---|---|---|---|---|---|
| CUDA (NVIDIA) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ROCm (AMD Linux) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Apple Metal / MPS | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| CPU-only host | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ (CPU fallback) |
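To confirm that the CLI entry points from pyproject.toml are actually installed on a given host, a stdlib check is enough (purely illustrative; it only inspects PATH):

```python
import shutil

# Check whether the profiler CLIs are on PATH (e.g. after pip install)
cli_status = {cmd: shutil.which(cmd) for cmd in ("gpumemprof", "tfmemprof")}
for cmd, path in cli_status.items():
    print(f"{cmd}: {path or 'not on PATH'}")
```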

Backend capability details

PyTorch (gpumemprof)

| Runtime backend | Typical platform | Telemetry collector | device_total/free support | Notes |
|---|---|---|---|---|
| cuda | NVIDIA + CUDA | gpumemprof.cuda_tracker | Full | Uses torch.cuda.memory_* |
| rocm | AMD + ROCm (Linux) | gpumemprof.rocm_tracker | Full | Uses HIP-backed torch.cuda.memory_* |
| mps | Apple Silicon (macOS) | gpumemprof.mps_tracker | Partial | Depends on torch.mps.recommended_max_memory() availability |
| cpu | Any host | gpumemprof.cpu_tracker | N/A | CPUMemoryProfiler / CPUMemoryTracker fallback |
CUDA (NVIDIA)

Full support for device memory queries, including:
  • Total device memory
  • Free device memory
  • Allocated memory
  • Reserved memory
  • Active/inactive segments
Implementation: Uses torch.cuda.memory_* APIs
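A guarded sketch of reading those counters directly through public PyTorch APIs (the helper name is illustrative, and it returns None when no CUDA/HIP device is present):

```python
def cuda_memory_snapshot():
    """Allocated/reserved bytes on the current CUDA (or HIP) device, if any."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return {
        "allocated": torch.cuda.memory_allocated(),  # bytes held by tensors
        "reserved": torch.cuda.memory_reserved(),    # bytes held by the caching allocator
    }

print(cuda_memory_snapshot())
```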
ROCm (AMD Linux)

Full support through HIP-backed PyTorch CUDA APIs:
  • Same API surface as the CUDA backend
  • Automatic detection via detect_torch_runtime_backend()
  • AMD GPU memory management through ROCm
Implementation: Uses HIP-backed torch.cuda.memory_* APIs
Apple Metal / MPS

Partial support for Apple Silicon:
  • Memory allocation tracking
  • Limited device total/free memory
  • Depends on torch.mps.recommended_max_memory() availability
Implementation: Uses torch.mps APIs when available
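Because torch.mps.recommended_max_memory() is absent from older PyTorch releases, the query has to be probed defensively. A minimal sketch (helper name illustrative):

```python
def mps_capacity_bytes():
    """Best-effort MPS capacity query; returns None when unavailable."""
    try:
        import torch
    except ImportError:
        return None
    mps = getattr(torch, "mps", None)
    if mps is None or not hasattr(mps, "recommended_max_memory"):
        return None  # old PyTorch, or a build without MPS
    return mps.recommended_max_memory()

print(mps_capacity_bytes())
```

This mirrors why the table marks device_total/free support as "Partial" for the mps backend.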
CPU-only host

CPU-only mode when no GPU is available:
  • Tracks system memory usage
  • No device memory queries
  • Full profiling and analysis features still available
Implementation: Uses psutil for memory monitoring
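The psutil-based CPU sampling can be approximated as follows. This is a sketch, not the package's collector; the resource fallback and the helper name are assumptions for illustration:

```python
def sample_process_memory_mb():
    """Return process memory in MiB, preferring psutil as the collectors do."""
    try:
        import psutil
        return psutil.Process().memory_info().rss / 2**20  # current RSS
    except ImportError:
        import resource
        # Stdlib fallback: ru_maxrss is KiB on Linux (bytes on macOS),
        # and it reports peak usage rather than the current value.
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print(f"{sample_process_memory_mb():.1f} MiB")
```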

TensorFlow (tfmemprof)

| Runtime backend | Typical platform | Telemetry collector | Notes |
|---|---|---|---|
| cuda | NVIDIA + CUDA | tfmemprof.memory_tracker | Build/runtime diagnostics shown in tfmemprof info |
| rocm | AMD + ROCm (Linux) | tfmemprof.memory_tracker | Build/runtime diagnostics shown in tfmemprof info |
| metal | Apple Silicon | tfmemprof.memory_tracker | Counters can be runtime-dependent on Metal stack |
| cpu | Any host | tfmemprof.memory_tracker | Full CLI surface remains available |
CUDA (NVIDIA)

Full support for CUDA devices:
  • GPU memory profiling
  • Session-based tracking
  • Keras model profiling
Check build info: run tfmemprof info to see the CUDA build configuration
ROCm (AMD Linux)

Full support for AMD GPUs:
  • ROCm-enabled TensorFlow builds
  • Same profiling capabilities as CUDA
Check build info: run tfmemprof info to see the ROCm build configuration
Apple Metal

Apple Silicon support:
  • Native Metal GPU acceleration
  • Memory tracking dependent on the Metal stack
  • Use tensorflow-metal for GPU acceleration
Installation:

```bash
pip install tensorflow tensorflow-metal
```
CPU-only host

CPU-only mode:
  • Full CLI surface available
  • Memory profiling for CPU operations
  • Session and graph execution monitoring
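Before profiling, it can help to confirm which physical devices TensorFlow itself reports; a guarded sketch using the public tf.config API (helper name illustrative):

```python
def tf_visible_devices():
    """List the device types TensorFlow can see ('CPU', 'GPU', ...)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [d.device_type for d in tf.config.list_physical_devices()]

print(tf_visible_devices())
```

An empty GPU list on a CUDA host usually points at a build/runtime mismatch, which is what the tfmemprof info diagnostics surface.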

Backend capability metadata

Tracker exports may include backend capability hints under metadata:
```json
{
  "backend": "cuda",
  "supports_device_total": true,
  "supports_device_free": true,
  "sampling_source": "torch.cuda.memory_stats"
}
```
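Because the hints are plain JSON, downstream tooling can branch on them directly; for example:

```python
import json

# The metadata block from a tracker export, as shown above
metadata = json.loads("""{
  "backend": "cuda",
  "supports_device_total": true,
  "supports_device_free": true,
  "sampling_source": "torch.cuda.memory_stats"
}""")

if metadata["supports_device_total"]:
    print(f"{metadata['backend']}: device totals via {metadata['sampling_source']}")
```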

Metadata fields

| Field | Type | Description |
|---|---|---|
| backend | string | Backend type: "cuda", "rocm", "mps", "cpu", "metal" |
| supports_device_total | boolean | Whether the total device memory query is supported |
| supports_device_free | boolean | Whether the free device memory query is supported |
| sampling_source | string | Source of memory samples (e.g., torch.cuda.memory_stats) |
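A small validator against the field table above can catch malformed exports early. The helper and its rules are illustrative, not part of the package API:

```python
EXPECTED_FIELDS = {
    "backend": str,
    "supports_device_total": bool,
    "supports_device_free": bool,
    "sampling_source": str,
}
VALID_BACKENDS = {"cuda", "rocm", "mps", "cpu", "metal"}

def check_metadata(meta):
    """Return a list of problems; an empty list means the metadata is well-formed."""
    problems = []
    for field, typ in EXPECTED_FIELDS.items():
        if field not in meta:
            problems.append(f"missing {field}")
        elif not isinstance(meta[field], typ):
            problems.append(f"{field} should be {typ.__name__}")
    if meta.get("backend") not in VALID_BACKENDS:
        problems.append(f"unknown backend: {meta.get('backend')!r}")
    return problems
```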

Validation notes

1. Documentation linkage: the compatibility matrix is linked from README.md and docs/index.md.
2. CLI examples validation: CLI example smoke validation is part of CI (examples.cli.quickstart) in .github/workflows/ci.yml.
3. Backend metadata: backend capability metadata is emitted in tracker exports for analysis and debugging.

Platform-specific considerations

Linux

  • Best support for CUDA and ROCm backends
  • Full feature availability
  • Recommended for production use

macOS

  • MPS support on Apple Silicon
  • Metal support for TensorFlow
  • CUDA is not available; use the MPS or CPU fallback

Windows

  • CUDA support available
  • Full PyTorch and TensorFlow support
  • ROCm support is limited on Windows

Installation examples

```bash
# Install PyTorch with CUDA
pip install torch --index-url https://download.pytorch.org/whl/cu118

# Install TensorFlow (includes GPU support)
pip install tensorflow

# Install GPU Memory Profiler
pip install gpu-memory-profiler
```
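After installing, a quick stdlib check confirms which optional frameworks the profilers will find, without importing the heavy packages themselves (illustrative):

```python
import importlib.util

# find_spec returns None when a package is not installed
frameworks = {name: importlib.util.find_spec(name) is not None
              for name in ("torch", "tensorflow", "psutil")}
for name, installed in frameworks.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```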
