Matrix version: v0.2
Last verified: 2026-02-16
Source of truth
- `pyproject.toml` (requires-python >=3.10, framework dependency floors)
- CLI entry points in `pyproject.toml`
- Runtime backend detection in `gpumemprof/device_collectors.py` and `tfmemprof/utils.py`
- CI smoke checks in `.github/workflows/ci.yml`
Runtime and version support
| Surface | Supported | Notes |
|---|---|---|
| Python | 3.10, 3.11, 3.12 | Python 3.8/3.9 are no longer supported for v0.2 |
| PyTorch package floor | torch>=1.8.0 | Runtime backend can be CUDA, ROCm, MPS, or CPU |
| TensorFlow package floor | tensorflow>=2.4.0 | Runtime backend can be CUDA, ROCm, Metal, or CPU |
| OS | Linux, macOS, Windows | Backend availability depends on installed framework/runtime |
The minimum PyTorch version is 1.8.0, but newer versions are recommended for full feature support.
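Tooling that embeds the profilers can fail fast when run on an interpreter older than the v0.2 floor. A minimal sketch (the function name and message are illustrative, not part of the package API):

```python
import sys

def check_python_floor(version_info=sys.version_info):
    """Raise if the interpreter is older than the v0.2 floor (Python 3.10)."""
    if version_info < (3, 10):
        raise RuntimeError("gpumemprof/tfmemprof v0.2 requires Python >= 3.10")
    return True
```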
CLI feature availability by environment
CUDA (NVIDIA)
All CLI commands fully supported:
- `gpumemprof info/monitor/track/analyze/diagnose`
- `tfmemprof info/monitor/track/analyze/diagnose`
ROCm (AMD Linux)
All CLI commands fully supported:
- `gpumemprof info/monitor/track/analyze/diagnose`
- `tfmemprof info/monitor/track/analyze/diagnose`
Apple Metal / MPS
All CLI commands fully supported:
- `gpumemprof info/monitor/track/analyze/diagnose`
- `tfmemprof info/monitor/track/analyze/diagnose`
CPU-only host
All CLI commands with CPU fallback:
- `gpumemprof info/monitor/track/analyze/diagnose`
- `tfmemprof info/monitor/track/analyze/diagnose`
Complete CLI availability table
| Environment | gpumemprof info | gpumemprof monitor | gpumemprof track | gpumemprof analyze | gpumemprof diagnose | tfmemprof info | tfmemprof monitor | tfmemprof track | tfmemprof analyze | tfmemprof diagnose |
|---|---|---|---|---|---|---|---|---|---|---|
| CUDA (NVIDIA) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ROCm (AMD Linux) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Apple Metal / MPS | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| CPU-only host | ✅ | ✅ (CPU fallback) | ✅ (CPU fallback) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Backend capability details
PyTorch (gpumemprof)
| Runtime backend | Typical platform | Telemetry collector | device_total/free support | Notes |
|---|---|---|---|---|
| `cuda` | NVIDIA + CUDA | `gpumemprof.cuda_tracker` | ✅ | Uses `torch.cuda.memory_*` |
| `rocm` | AMD + ROCm (Linux) | `gpumemprof.rocm_tracker` | ✅ | Uses HIP-backed `torch.cuda.memory_*` |
| `mps` | Apple Silicon (macOS) | `gpumemprof.mps_tracker` | Partial | Depends on `torch.mps.recommended_max_memory()` availability |
| `cpu` | Any host | `gpumemprof.cpu_tracker` | N/A | `CPUMemoryProfiler` / `CPUMemoryTracker` fallback |
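The table above maps each runtime backend to a collector. `detect_torch_runtime_backend()` (in `gpumemprof/device_collectors.py`) selects among them at runtime; the actual implementation is not shown here, but the selection logic might be sketched as follows. Note that ROCm builds of PyTorch reuse the `torch.cuda.*` API surface, so they must be distinguished via the build metadata:

```python
def detect_torch_runtime_backend():
    """Best-effort guess of the active PyTorch runtime backend.

    Sketch only -- the real detect_torch_runtime_backend() in
    gpumemprof/device_collectors.py may differ.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        # ROCm builds expose HIP via torch.version.hip but present
        # the same torch.cuda.* API surface as CUDA builds.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

On a host without any GPU framework installed, the sketch degrades gracefully to `"cpu"`, matching the CPU fallback row in the table.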
CUDA backend details
Full support for device memory queries including:
- Total device memory
- Free device memory
- Allocated memory
- Reserved memory
- Active/inactive segments
Uses `torch.cuda.memory_*` APIs

ROCm backend details
Full support through HIP-backed PyTorch CUDA APIs:
- Same API surface as CUDA backend
- Automatic detection via `detect_torch_runtime_backend()`
- AMD GPU memory management through ROCm
Uses `torch.cuda.memory_*` APIs

MPS backend details
Partial support for Apple Silicon:
- Memory allocation tracking
- Limited device total/free memory
- Depends on `torch.mps.recommended_max_memory()` availability

Uses `torch.mps` APIs when available

CPU fallback details
CPU-only mode when no GPU is available:
- Tracks system memory usage
- No device memory queries
- Full profiling and analysis features still available
Uses `psutil` for memory monitoring

TensorFlow (tfmemprof)
| Runtime backend | Typical platform | Telemetry collector | Notes |
|---|---|---|---|
| `cuda` | NVIDIA + CUDA | `tfmemprof.memory_tracker` | Build/runtime diagnostics shown in `tfmemprof info` |
| `rocm` | AMD + ROCm (Linux) | `tfmemprof.memory_tracker` | Build/runtime diagnostics shown in `tfmemprof info` |
| `metal` | Apple Silicon | `tfmemprof.memory_tracker` | Counters can be runtime-dependent on Metal stack |
| `cpu` | Any host | `tfmemprof.memory_tracker` | Full CLI surface remains available |
CUDA backend details
Full support for CUDA devices:
- GPU memory profiling
- Session-based tracking
- Keras model profiling
Use `tfmemprof info` to see the CUDA build configuration

ROCm backend details
Full support for AMD GPUs:
- ROCm-enabled TensorFlow builds
- Same profiling capabilities as CUDA
Use `tfmemprof info` to see the ROCm build configuration

Metal backend details
Apple Silicon support:
- Native Metal GPU acceleration
- Memory tracking dependent on Metal stack
- Use `tensorflow-metal` for GPU acceleration
CPU fallback details
CPU-only mode:
- Full CLI surface available
- Memory profiling for CPU operations
- Session and graph execution monitoring
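Both CPU fallbacks sample process memory on the host rather than querying a device. A minimal sketch of such a sampler, preferring `psutil` (which the PyTorch CPU fallback uses, per the notes above) with a POSIX stdlib fallback; the function name is illustrative:

```python
import sys

def sample_rss_bytes():
    """Current process resident set size, in bytes (best effort)."""
    try:
        import psutil  # optional dependency; used by the CPU fallback
        return psutil.Process().memory_info().rss
    except ImportError:
        import resource  # POSIX-only stdlib fallback (not on Windows)
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in bytes on macOS, kilobytes on Linux
        return peak if sys.platform == "darwin" else peak * 1024
```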
Backend capability metadata
Tracker exports may include backend capability hints under `metadata`:
Metadata fields
| Field | Type | Description |
|---|---|---|
| `backend` | string | Backend type: `"cuda"`, `"rocm"`, `"mps"`, `"cpu"`, `"metal"` |
| `supports_device_total` | boolean | Whether total device memory query is supported |
| `supports_device_free` | boolean | Whether free device memory query is supported |
| `sampling_source` | string | Source of memory samples (e.g., `torch.cuda.memory_stats`) |
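A consumer of an exported tracking file can read these hints defensively, since the fields are optional ("may include"). A hypothetical sketch; the surrounding export layout shown here is illustrative, only the `metadata` fields come from the table above:

```python
import json

def read_backend_capabilities(export_json):
    """Extract backend capability hints from a tracker export's metadata."""
    data = json.loads(export_json)
    meta = data.get("metadata", {})
    return {
        "backend": meta.get("backend", "cpu"),
        "supports_device_total": bool(meta.get("supports_device_total", False)),
        "supports_device_free": bool(meta.get("supports_device_free", False)),
        "sampling_source": meta.get("sampling_source"),
    }

example = (
    '{"metadata": {"backend": "cuda", "supports_device_total": true, '
    '"supports_device_free": true, "sampling_source": "torch.cuda.memory_stats"}}'
)
caps = read_backend_capabilities(example)
```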
Validation notes
CLI examples validation
CLI examples smoke validation is part of CI (`examples.cli.quickstart`) in `.github/workflows/ci.yml`.

Platform-specific considerations
Linux
- Best support for CUDA and ROCm backends
- Full feature availability
- Recommended for production use

macOS
- MPS support on Apple Silicon
- Metal support for TensorFlow

Windows
- CUDA support available
- Full PyTorch and TensorFlow support
Installation examples
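The project's own install command is not reproduced here; as a starting point, the framework floors from the support table above can be installed directly (a sketch using the documented version floors — consult the project README for the package's actual install command):

```shell
# Framework floors from pyproject.toml (newer versions recommended)
pip install "torch>=1.8.0"
pip install "tensorflow>=2.4.0"

# Apple Silicon GPU acceleration for TensorFlow (see Metal backend details)
pip install tensorflow-metal
```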
See also
- Architecture - System architecture and design
- Telemetry Schema - Event format specification
- Troubleshooting - Common issues and solutions