Prerequisites

Before you begin, make sure you have:
  • Python 3.10 or higher
  • pip package manager
  • (Optional) CUDA-enabled GPU for GPU profiling
  • (Optional) PyTorch 1.8+ or TensorFlow 2.4+
GPU Memory Profiler works on CPU-only systems too! It automatically falls back to CPU memory tracking when CUDA isn’t available.
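
A quick way to confirm the interpreter requirement is to ask Python directly. This check is generic and not part of gpu-memory-profiler itself:

```python
import sys

# True when the running interpreter satisfies the Python 3.10+ requirement
meets_requirement = sys.version_info >= (3, 10)
print(f"Python {sys.version.split()[0]} - 3.10+ requirement met: {meets_requirement}")
```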

Installation

Install GPU Memory Profiler with support for your preferred framework (quote the extras so shells like zsh don't expand the brackets):
pip install "gpu-memory-profiler[torch]"
For visualization support, add the viz extra:
pip install "gpu-memory-profiler[torch,viz]"

PyTorch quick start

1. Import the profiler

from gpumemprof import GPUMemoryProfiler
import torch
import torch.nn as nn

# Initialize the profiler
profiler = GPUMemoryProfiler(track_tensors=True)

2. Profile your training step

Create a simple model and profile its training:
# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)
    
    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = SimpleModel().cuda()  # requires CUDA; drop .cuda() on CPU-only systems
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Profile a training step
def train_step(model, data, target):
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    return loss.item()

# Create sample data
data = torch.randn(32, 784).cuda()
target = torch.randint(0, 10, (32,)).cuda()

# Profile the function
profile = profiler.profile_function(train_step, model, data, target)
print(f"Function: {profile.function_name}")

3. Use a context manager for epochs

Profile entire training loops with context managers:
for epoch in range(3):
    with profiler.profile_context(f"epoch_{epoch+1}"):
        loss = train_step(model, data, target)
        print(f"Epoch {epoch+1} loss: {loss:.4f}")

4. View the summary

Get detailed memory statistics:
summary = profiler.get_summary()
print(f"Peak memory: {summary['peak_memory_usage'] / (1024**3):.2f} GB")
print(f"Average memory: {summary['average_memory_usage'] / (1024**3):.2f} GB")
print(f"Total snapshots: {summary['total_snapshots']}")
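
The summary pattern above (snapshots, peak, average) can be sketched in plain Python with the standard library's tracemalloc module. This is a conceptual stand-in only: tracemalloc tracks Python heap allocations rather than GPU memory, and the SnapshotProfiler class below is hypothetical, not part of gpumemprof.

```python
import tracemalloc
from contextlib import contextmanager

class SnapshotProfiler:
    """Hypothetical sketch of a snapshot-based memory profiler (Python heap only)."""
    def __init__(self):
        self.snapshots = []  # (label, current_bytes, peak_bytes)

    @contextmanager
    def profile_context(self, label):
        tracemalloc.start()
        try:
            yield
        finally:
            current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            self.snapshots.append((label, current, peak))

    def get_summary(self):
        peaks = [peak for _, _, peak in self.snapshots]
        return {
            "peak_memory_usage": max(peaks),
            "average_memory_usage": sum(peaks) / len(peaks),
            "total_snapshots": len(self.snapshots),
        }

profiler = SnapshotProfiler()
for epoch in range(3):
    with profiler.profile_context(f"epoch_{epoch + 1}"):
        buf = [0.0] * 100_000  # stand-in for a training step's allocations

summary = profiler.get_summary()
print(f"Total snapshots: {summary['total_snapshots']}")
```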

TensorFlow quick start

1. Import the profiler

import tensorflow as tf
from tfmemprof import TFMemoryProfiler

# Initialize the profiler
profiler = TFMemoryProfiler(enable_tensor_tracking=True)

2. Create and profile a model

Build a simple model and profile training:
# Define a simple model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Create sample data
x_train = tf.random.normal((1000, 784))
y_train = tf.random.uniform((1000,), minval=0, maxval=10, dtype=tf.int32)

3. Profile with a context manager

# Profile the training
with profiler.profile_context("training"):
    model.fit(x_train, y_train, epochs=3, batch_size=32, verbose=0)
    print("Training complete!")

4. View the results

results = profiler.get_results()
print(f"Duration: {results.duration:.3f} seconds")
print(f"Peak memory: {results.peak_memory_mb:.2f} MB")
print(f"Average memory: {results.average_memory_mb:.2f} MB")
print(f"Snapshots captured: {len(results.snapshots)}")

CLI usage

GPU Memory Profiler includes powerful command-line tools for both frameworks.

System information

gpumemprof info

Real-time monitoring

Monitor GPU memory usage in real time:
# Monitor for 30 seconds with 0.5s interval
gpumemprof monitor --duration 30 --interval 0.5

# Monitor and save to file
gpumemprof monitor --duration 30 --output memory_log.csv

Diagnose issues

Run diagnostics to identify memory problems:
gpumemprof diagnose

Export data

# Export to CSV
gpumemprof monitor --duration 10 --format csv --output metrics.csv

# Export to JSON
gpumemprof monitor --duration 10 --format json --output metrics.json
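
The exported CSV can then be post-processed with ordinary tooling. The column name below (memory_used_mb) is an assumption for illustration; check the header of your exported file, as the actual schema may differ.

```python
import csv
import io

# Hypothetical sample export; real column names may differ, so inspect your file's header.
sample = """timestamp,memory_used_mb
0.0,512.0
0.5,1024.0
1.0,768.0
"""

rows = list(csv.DictReader(io.StringIO(sample)))
readings = [float(r["memory_used_mb"]) for r in rows]
print(f"Peak: {max(readings):.1f} MB, average: {sum(readings) / len(readings):.1f} MB")
```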

Interactive terminal UI

Launch the interactive dashboard for real-time monitoring:
# Install TUI dependencies
pip install "gpu-memory-profiler[tui]"

# Launch the dashboard
gpu-profiler
The TUI provides:
  • Live GPU memory monitoring
  • PyTorch and TensorFlow quick actions
  • Visualizations and charts
  • Export functionality
  • CLI command execution
The terminal UI includes tabs for Overview, PyTorch, TensorFlow, Monitoring, Visualizations, and CLI actions.

Next steps

  • Core concepts: learn about profilers, trackers, and context managers
  • API reference: explore the complete Python API documentation
  • Leak detection: detect and prevent memory leaks in your models
  • Visualizations: create timeline plots, heatmaps, and dashboards
