MemoryTracker
Real-time TensorFlow GPU memory tracker with configurable sampling and alerts.
Constructor
MemoryTracker(
    sampling_interval: float = 1.0,
    alert_threshold_mb: Optional[float] = None,
    device: Optional[str] = None,
    enable_logging: bool = True
)
sampling_interval: Time between memory samples, in seconds
alert_threshold_mb: Memory threshold for triggering alerts, in MB
device: TensorFlow device to monitor (e.g. '/GPU:0'). Defaults to '/GPU:0'
enable_logging: Whether to log tracking events
Methods
start_tracking
Start real-time memory tracking.
def start_tracking(self) -> None
stop_tracking
Stop tracking and return results.
def stop_tracking(self) -> TrackingResult
Object containing memory usage history, timestamps, events, and alerts
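The start/stop pair above is typically a background sampling loop. The sketch below is a hypothetical illustration of that pattern, not tfmemprof's actual implementation; `read_memory_mb` stands in for the real device query.

```python
import threading
import time

# Sketch: a daemon thread samples memory every sampling_interval
# seconds until stop_tracking() is called. Hypothetical stand-in.
class SamplerSketch:
    def __init__(self, sampling_interval, read_memory_mb):
        self.sampling_interval = sampling_interval
        self.read_memory_mb = read_memory_mb
        self.samples, self.timestamps = [], []
        self._stop = threading.Event()
        self._thread = None

    def start_tracking(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            self.samples.append(self.read_memory_mb())
            self.timestamps.append(time.monotonic())
            self._stop.wait(self.sampling_interval)  # interruptible sleep

    def stop_tracking(self):
        self._stop.set()
        self._thread.join()
        return self.samples

s = SamplerSketch(0.01, read_memory_mb=lambda: 1234.0)
s.start_tracking()
time.sleep(0.05)
print(len(s.stop_tracking()) > 0)  # True: at least one sample taken
```

Using `Event.wait` instead of `time.sleep` lets `stop_tracking` interrupt the loop immediately rather than waiting out a full interval.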
get_current_memory
Get current memory usage without starting tracking.
def get_current_memory(self) -> float
Current memory usage in MB
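One plausible way to implement such a query is TensorFlow's `tf.config.experimental.get_memory_info`, which returns byte counts as `{'current': ..., 'peak': ...}`. The wrapper below is an assumption for illustration, not tfmemprof's actual code:

```python
# Convert a get_memory_info-style dict (byte counts) to MB.
def info_to_mb(info: dict) -> float:
    return info["current"] / (1024 ** 2)

# Hypothetical sketch of get_current_memory(); requires TensorFlow
# and a visible GPU at runtime, so the import is kept local.
def get_current_memory(device: str = "GPU:0") -> float:
    import tensorflow as tf
    return info_to_mb(tf.config.experimental.get_memory_info(device))

# Pure-Python check with a fake info dict (no GPU needed):
print(info_to_mb({"current": 512 * 1024 ** 2, "peak": 1024 ** 3}))  # 512.0
```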
set_alert_threshold
Update the alert threshold during tracking.
def set_alert_threshold(self, threshold_mb: float) -> None
add_alert_callback
Add callback function for memory alerts.
def add_alert_callback(self, callback: Callable[[Dict[str, Any]], None]) -> None
Function to call when an alert is triggered. Receives an alert dictionary with timestamp, memory_mb, and threshold_mb keys
Example:
def alert_handler(alert):
    print(f"Alert: {alert['message']}")
    print(f"Memory: {alert['memory_mb']:.2f} MB")

tracker.add_alert_callback(alert_handler)
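The callback mechanism described above boils down to a registry of functions invoked with the alert dictionary whenever a sample crosses the threshold. A minimal sketch of that pattern (a hypothetical illustration, not tfmemprof internals):

```python
import time

class AlertDispatcher:
    def __init__(self, threshold_mb: float):
        self.threshold_mb = threshold_mb
        self.callbacks = []

    def add_alert_callback(self, callback):
        self.callbacks.append(callback)

    def check_sample(self, memory_mb: float) -> bool:
        """Fire all callbacks if the sample exceeds the threshold."""
        if memory_mb <= self.threshold_mb:
            return False
        alert = {
            "timestamp": time.time(),
            "memory_mb": memory_mb,
            "threshold_mb": self.threshold_mb,
        }
        for cb in self.callbacks:
            cb(alert)
        return True

dispatcher = AlertDispatcher(threshold_mb=4000)
seen = []
dispatcher.add_alert_callback(lambda alert: seen.append(alert["memory_mb"]))
dispatcher.check_sample(3500)  # below threshold: no alert fires
dispatcher.check_sample(4500)  # exceeds threshold: callback runs
print(seen)  # [4500]
```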
check_alerts
Check if any alerts have been triggered recently.
def check_alerts(self) -> bool
Returns True if any alert was triggered within the last 10 seconds
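A recency check like this can be built from a queue of alert timestamps pruned against a sliding window. The class below is a hypothetical sketch of that logic, with the 10-second window matching the behavior described above:

```python
import time
from collections import deque

class RecentAlerts:
    def __init__(self, window_s=10.0):
        self.window_s = window_s
        self._times = deque()

    def record(self, t=None):
        self._times.append(time.monotonic() if t is None else t)

    def check_alerts(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._times and now - self._times[0] > self.window_s:
            self._times.popleft()
        return bool(self._times)

ra = RecentAlerts()
ra.record(t=100.0)
print(ra.check_alerts(now=105.0))  # True: alert is 5 s old
print(ra.check_alerts(now=120.0))  # False: alert has aged out
```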
get_tracking_results
Get current tracking results without stopping.
def get_tracking_results(self) -> TrackingResult
TrackingResult
Results from real-time memory tracking.
Attributes
List of memory samples in MB
Corresponding timestamps for each sample
Telemetry events captured during tracking
Average memory usage in MB
Properties
duration: Total tracking duration in seconds
memory_growth_rate: Memory growth rate in MB/second
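Both properties can be derived from the samples and timestamps a TrackingResult carries. The sketch below shows one such derivation (net change over elapsed time); the field names `samples` and `timestamps` are illustrative assumptions, not tfmemprof's documented names:

```python
class TrackingResultSketch:
    def __init__(self, samples, timestamps):
        self.samples = samples        # memory samples, in MB
        self.timestamps = timestamps  # seconds, one per sample

    @property
    def duration(self) -> float:
        """Total tracking duration in seconds."""
        return self.timestamps[-1] - self.timestamps[0]

    @property
    def memory_growth_rate(self) -> float:
        """Net memory change divided by elapsed time, in MB/second."""
        if self.duration == 0:
            return 0.0
        return (self.samples[-1] - self.samples[0]) / self.duration

r = TrackingResultSketch(samples=[1000.0, 1100.0, 1300.0],
                         timestamps=[0.0, 1.0, 2.0])
print(r.duration)            # 2.0
print(r.memory_growth_rate)  # 150.0
```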
MemoryWatchdog
Automatic memory management and cleanup for TensorFlow.
Constructor
MemoryWatchdog(
    max_memory_mb: float = 8000,
    cleanup_threshold_mb: float = 6000,
    check_interval: float = 5.0
)
max_memory_mb: Maximum memory before forced cleanup, in MB
cleanup_threshold_mb: Memory threshold that triggers cleanup, in MB
check_interval: Time between memory checks, in seconds
Methods
start
Start memory watchdog monitoring.
def start(self) -> None
stop
Stop memory watchdog.
def stop(self) -> None
force_cleanup
Force immediate memory cleanup.
def force_cleanup(self) -> None
add_cleanup_callback
Add custom cleanup callback function.
def add_cleanup_callback(self, callback: Callable[[], None]) -> None
Function to call during cleanup operations
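A single watchdog check amounts to: read current usage, compare against the thresholds, and if exceeded run the registered callbacks followed by a garbage collection. The sketch below illustrates that decision logic; `read_memory_mb` is an injected stand-in for a real device query, and the class is hypothetical, not tfmemprof's code:

```python
import gc

class WatchdogSketch:
    def __init__(self, max_memory_mb, cleanup_threshold_mb, read_memory_mb):
        self.max_memory_mb = max_memory_mb
        self.cleanup_threshold_mb = cleanup_threshold_mb
        self.read_memory_mb = read_memory_mb
        self.callbacks = []

    def add_cleanup_callback(self, callback):
        self.callbacks.append(callback)

    def force_cleanup(self):
        for cb in self.callbacks:
            cb()
        gc.collect()  # reclaim unreachable Python objects

    def check_once(self) -> bool:
        """One watchdog pass: clean up if usage crosses the threshold."""
        if self.read_memory_mb() >= self.cleanup_threshold_mb:
            self.force_cleanup()
            return True
        return False

ran = []
wd = WatchdogSketch(8000, 6000, read_memory_mb=lambda: 6500)
wd.add_cleanup_callback(lambda: ran.append("cleanup"))
print(wd.check_once())  # True: 6500 MB >= 6000 MB threshold
print(ran)              # ['cleanup']
```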
Example
from tfmemprof.tracker import MemoryTracker, MemoryWatchdog
import tensorflow as tf

# Basic tracking
tracker = MemoryTracker(
    sampling_interval=0.5,
    alert_threshold_mb=4000
)

# Add alert callback
def on_alert(alert):
    print(f"Memory alert: {alert['message']}")

tracker.add_alert_callback(on_alert)

# Start tracking
tracker.start_tracking()

# Run your code (x_train and y_train are placeholders for your data)
model = tf.keras.applications.ResNet50()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x_train, y_train, epochs=10)

# Stop and get results
results = tracker.stop_tracking()
print(f"Peak memory: {results.peak_memory:.2f} MB")
print(f"Average memory: {results.average_memory:.2f} MB")
print(f"Alerts triggered: {len(results.alerts_triggered)}")

# Memory watchdog for automatic cleanup
watchdog = MemoryWatchdog(
    max_memory_mb=8000,
    cleanup_threshold_mb=6000
)

# Add custom cleanup
def custom_cleanup():
    print("Running custom cleanup")
    # Clear caches, etc.

watchdog.add_cleanup_callback(custom_cleanup)
watchdog.start()

# Your training code here; the watchdog cleans up
# automatically when thresholds are exceeded.

watchdog.stop()