ComfyUI includes a sophisticated memory management system that automatically handles VRAM allocation, model loading/unloading, and memory optimization across different hardware configurations.
ComfyUI adapts its behavior to the available VRAM through the `VRAMState` enum:
```python
class VRAMState(Enum):
    DISABLED = 0     # No VRAM present: no need to move models to VRAM
    NO_VRAM = 1      # Very low VRAM: enable all options to save VRAM
    LOW_VRAM = 2     # Limited VRAM: selective model loading
    NORMAL_VRAM = 3  # Standard operation
    HIGH_VRAM = 4    # Keep models in VRAM
    SHARED = 5       # Shared CPU/GPU memory (e.g., integrated graphics)
```
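To show how such a state might be derived from detected hardware, here is an illustrative sketch. The `pick_vram_state` helper and its thresholds are invented for this example; ComfyUI's real selection logic also honors CLI overrides such as `--lowvram` and `--highvram`.

```python
from enum import Enum

class VRAMState(Enum):
    DISABLED = 0
    NO_VRAM = 1
    LOW_VRAM = 2
    NORMAL_VRAM = 3
    HIGH_VRAM = 4
    SHARED = 5

def pick_vram_state(total_vram_mb, is_integrated=False):
    """Illustrative mapping only; the thresholds are made up."""
    if is_integrated:
        return VRAMState.SHARED    # CPU and GPU share one memory pool
    if total_vram_mb <= 0:
        return VRAMState.DISABLED  # no GPU: nothing to move to VRAM
    if total_vram_mb < 4096:
        return VRAMState.LOW_VRAM  # partially load models, offload the rest
    return VRAMState.NORMAL_VRAM
```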
```python
from comfy import model_management

device = model_management.get_torch_device()

# Get total memory
total_vram = model_management.get_total_memory()
print(f"Total VRAM: {total_vram / (1024**3):.2f} GB")

# Get free memory
free_vram = model_management.get_free_memory()
print(f"Free VRAM: {free_vram / (1024**3):.2f} GB")

# Get both hardware and PyTorch memory
mem_total, mem_torch = model_management.get_free_memory(
    device, torch_free_too=True
)
```
Memory reporting differs between hardware backends: CUDA uses `torch.cuda.mem_get_info()`, while XPU and NPU query their respective runtime APIs.
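The two-level report from `torch_free_too=True` distinguishes hardware-free memory from memory PyTorch's caching allocator has reserved but is not actively using. A minimal sketch of that arithmetic, where the `stats` dict is a hypothetical stand-in for the backend query (`torch.cuda.mem_get_info()` plus the allocator's counters in the real CUDA path):

```python
def get_free_memory(stats, torch_free_too=False):
    """Sketch of the two-level memory report; all values in bytes.

    `stats` is a hypothetical dict standing in for the backend query.
    """
    # Memory PyTorch reserved from the driver but is not actively using:
    # the allocator can hand it to new tensors without a fresh allocation.
    mem_free_torch = stats["reserved"] - stats["active"]
    # Memory the hardware reports as free, plus the allocator's slack.
    mem_free_total = stats["device_free"] + mem_free_torch
    if torch_free_too:
        return (mem_free_total, mem_free_torch)
    return mem_free_total
```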
The `free_memory()` function unloads models until the requested amount of memory is available:
```python
def free_memory(
    memory_required, device, keep_loaded=[], for_dynamic=False, ram_required=0
):
    # 1. Garbage collect
    cleanup_models_gc()

    # 2. Find candidate models to unload
    can_unload = []
    for shift_model in current_loaded_models:
        if shift_model not in keep_loaded:
            can_unload.append(shift_model)

    # 3. Sort by offloaded memory (unload least active first)
    for model in sorted(can_unload):
        if get_free_memory(device) < memory_required:
            model.model_unload()

    # 4. Clear cache
    soft_empty_cache()
```
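The eviction loop can be modeled in isolation. Below is a toy, self-contained version: `FakeModel` and `free_memory_sketch` are invented names for illustration, and the real code additionally sorts candidates by how much of each model is already offloaded.

```python
class FakeModel:
    """Stand-in for a loaded model; `size` is its VRAM footprint in bytes."""
    def __init__(self, name, size):
        self.name = name
        self.size = size

def free_memory_sketch(memory_required, free_now, loaded, keep_loaded=()):
    """Evict models (skipping keep_loaded) until `memory_required` bytes
    are free; returns the new free total and the surviving models."""
    survivors = []
    for model in loaded:
        if free_now < memory_required and model not in keep_loaded:
            free_now += model.size  # "unload": reclaim its VRAM
        else:
            survivors.append(model)
    return free_now, survivors
```

Note the same early-exit shape as the real loop: once enough memory is free, remaining candidates stay loaded.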
ComfyUI supports asynchronous weight offloading with CUDA/XPU streams:
```python
NUM_STREAMS = 2  # Default for NVIDIA and AMD
if args.async_offload is not None:
    NUM_STREAMS = args.async_offload

def get_offload_stream(device):
    if NUM_STREAMS == 0:
        return None
    if device in STREAMS:
        ss = STREAMS[device]
        stream_counter = stream_counters.get(device, 0)
        # Synchronize the offload stream with the current compute stream
        ss[stream_counter].wait_stream(current_stream(device))
        stream_counter = (stream_counter + 1) % len(ss)
        stream_counters[device] = stream_counter
        return ss[stream_counter]
```
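The round-robin counter can be exercised without a GPU. A pure-Python sketch (`next_offload_stream` is a hypothetical name, stream objects are replaced by strings, and the `wait_stream()` synchronization is elided because it needs a CUDA runtime):

```python
NUM_STREAMS = 2
STREAMS = {}          # device -> list of offload streams
stream_counters = {}  # device -> index of the next stream to hand out

def next_offload_stream(device):
    """Round-robin over a device's streams so consecutive weight
    copies land on different streams and can overlap."""
    if NUM_STREAMS == 0 or device not in STREAMS:
        return None
    ss = STREAMS[device]
    idx = stream_counters.get(device, 0)
    stream_counters[device] = (idx + 1) % len(ss)
    return ss[idx]
```

Alternating streams lets one transfer be enqueued while the previous one is still in flight, which is the point of using more than one stream per device.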
ComfyUI monitors for memory leaks and triggers garbage collection:
```python
def cleanup_models_gc():
    do_gc = False
    reset_cast_buffers()
    for cur in current_loaded_models:
        if cur.is_dead():
            logging.info(
                f"Potential memory leak detected with model {cur.real_model().__class__.__name__}"
            )
            do_gc = True
            break
    if do_gc:
        gc.collect()
        soft_empty_cache()
If you see memory leak warnings, check for circular references in your custom nodes or models.
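Why circular references leak until the collector runs can be demonstrated standalone (this is not ComfyUI code, just a minimal illustration using the standard library):

```python
import gc
import weakref

class Node:
    """Toy object that can participate in a reference cycle."""
    def __init__(self):
        self.ref = None

def make_cycle():
    """Create two nodes that point at each other; return a weak
    reference so the cycle's lifetime can be observed."""
    a, b = Node(), Node()
    a.ref, b.ref = b, a  # cycle: refcounting alone never frees these
    return weakref.ref(a)

gc.disable()  # make the demonstration deterministic
probe = make_cycle()
alive_before = probe() is not None  # the cycle keeps the nodes alive
gc.collect()                        # the cycle collector breaks the loop
alive_after = probe() is not None
gc.enable()
```

This is the same pattern `cleanup_models_gc()` relies on: objects trapped in cycles survive until `gc.collect()` runs, which is why a dead model lingering in VRAM triggers an explicit collection.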