Basic tracker setup
Create a tracker with monitoring enabled.
Configure leak thresholds
Set thresholds for leak detection.
Alert callbacks
Register callbacks to handle memory alerts.
Start tracking
Begin monitoring memory usage.
Memory watchdog
Use the watchdog for automatic cleanup:
- Standard cleanup: Calls torch.cuda.empty_cache() at 75% usage
- Aggressive cleanup: Forces garbage collection and cache clearing at 90% usage
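The two cleanup tiers can be illustrated with a minimal sketch. It does not use the watchdog's own API, which is not shown here: the names `pick_cleanup_level` and `run_cleanup` are hypothetical, and treating the thresholds as inclusive (`>=`) is an assumption. Only `gc.collect()`, `torch.cuda.is_available()`, and `torch.cuda.empty_cache()` are real library calls.

```python
import gc

try:
    import torch
except ImportError:  # the sketch still runs without PyTorch installed
    torch = None

# Thresholds taken from the list above, as fractions of total device memory.
STANDARD_THRESHOLD = 0.75
AGGRESSIVE_THRESHOLD = 0.90


def pick_cleanup_level(used_bytes, total_bytes):
    """Return "none", "standard", or "aggressive" for the given usage.

    Hypothetical helper; inclusive thresholds are an assumption.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    usage = used_bytes / total_bytes
    if usage >= AGGRESSIVE_THRESHOLD:
        return "aggressive"
    if usage >= STANDARD_THRESHOLD:
        return "standard"
    return "none"


def run_cleanup(level):
    """Apply the cleanup that matches the level (hypothetical helper)."""
    if level == "aggressive":
        # Collect unreachable Python objects first so that the tensors they
        # hold are freed before the cache is released.
        gc.collect()
    if level in ("standard", "aggressive"):
        if torch is not None and torch.cuda.is_available():
            # Return unused cached blocks to the CUDA driver.
            torch.cuda.empty_cache()
```

On a CUDA machine, the inputs could come from `torch.cuda.mem_get_info()`, which returns free and total device memory in bytes, so that `used_bytes` is `total - free`.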
Simulating leaky workloads
Example of tracking a workload with intentional memory leaks.
Analyze tracking results
Get statistics after tracking.
Memory timeline
Extract memory usage over time.
Export tracking events
Export events for analysis.
OOM flight recorder
Enable automatic OOM dump capture:
- Captures the last N events leading up to the OOM
- Records exception details and stack traces
- Exports a diagnostic bundle for analysis
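The three behaviors listed above can be sketched with the standard library alone. This is an illustration of the idea, not the flight recorder's actual implementation: the `FlightRecorder` class, its method names, and the bundle layout are all hypothetical, and a plain `MemoryError` stands in for a real CUDA out-of-memory error.

```python
import json
import time
import traceback
from collections import deque


class FlightRecorder:
    """Hypothetical recorder that keeps only the most recent events."""

    def __init__(self, max_events=100):
        # A bounded deque silently drops the oldest entry once it is full,
        # which gives the "last N events" behavior.
        self._events = deque(maxlen=max_events)

    def record(self, name, **details):
        self._events.append(
            {"time": time.time(), "name": name, "details": details}
        )

    def dump(self, exc):
        """Build a JSON-serializable diagnostic bundle for an exception."""
        return {
            "exception_type": type(exc).__name__,
            "message": str(exc),
            "stack_trace": traceback.format_exception(
                type(exc), exc, exc.__traceback__
            ),
            "events": list(self._events),
        }


recorder = FlightRecorder(max_events=3)
for step in range(5):
    recorder.record("allocation", step=step)

try:
    raise MemoryError("simulated out-of-memory")
except MemoryError as exc:
    bundle = recorder.dump(exc)
    report = json.dumps(bundle, indent=2)  # ready to write to a file
```

Because the buffer is capped at three events, the bundle holds only the allocations from steps 2, 3, and 4; the earlier two have been discarded.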
Next steps
- Export tracking data with telemetry export
- Debug OOM errors with OOM recording
- Learn about basic profiling for simpler use cases