Running ComfyUI with Arguments
Pass arguments when launching ComfyUI:

Server Configuration
--listen: Specify the IP address to listen on. Use 0.0.0.0 to listen on all IPv4 addresses, or provide a comma-separated list of addresses. Using --listen without an argument defaults to 0.0.0.0,:: (all IPv4 and IPv6 addresses).
--port: Set the listen port for the web server.
--tls-keyfile: Path to the TLS (SSL) key file. Enables HTTPS access. Requires --tls-certfile.
--tls-certfile: Path to the TLS (SSL) certificate file. Enables HTTPS access. Requires --tls-keyfile.
--enable-cors-header: Enable CORS (Cross-Origin Resource Sharing) with an optional origin.
--max-upload-size: Set the maximum upload size in MB.
--auto-launch: Automatically launch ComfyUI in the default browser on startup.
--disable-auto-launch: Disable auto-launching the browser (overrides --auto-launch).
--dont-print-server: Suppress server output messages.
--enable-compress-response-body: Enable response body compression to reduce bandwidth usage.
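A typical secured launch combines several of these flags. The snippet below first creates a self-signed certificate for local testing; the filenames, port, and upload cap are illustrative values, and the flag spellings follow ComfyUI's CLI (verify against your version):

```shell
# Create an unencrypted self-signed key/certificate pair for local HTTPS testing.
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout key.pem -out cert.pem -subj "/CN=localhost"

# Example launch: HTTPS on all IPv4 addresses, port 8443, uploads capped at 200 MB
# (commented out here because it starts a long-running server):
# python main.py --listen 0.0.0.0 --port 8443 \
#   --tls-keyfile key.pem --tls-certfile cert.pem --max-upload-size 200
```

With TLS enabled, the UI is reached via https:// rather than plain http://.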
Directory Configuration
--base-directory: Set the ComfyUI base directory for models, custom_nodes, input, output, temp, and user directories.
--output-directory: Set the output directory for generated images. Overrides --base-directory.
--input-directory: Set the input directory for source images. Overrides --base-directory.
--temp-directory: Set the temporary files directory. Overrides --base-directory.
--user-directory: Set the user directory with an absolute path. Overrides --base-directory.
--extra-model-paths-config: Load one or more extra_model_paths.yaml files. Can be specified multiple times.

GPU and Device Configuration
--cuda-device: Set the CUDA device ID to use. Makes all other devices invisible.
--default-device: Set the default device ID while keeping other devices visible.
--directml: Use torch-directml for AMD/Intel GPUs on Windows.
--oneapi-device-selector: Set the oneAPI device selector for Intel GPUs.
--disable-ipex-optimize: Disable ipex.optimize when loading models with Intel Extension for PyTorch (IPEX).
--supports-fp8-compute: Force ComfyUI to act as if the device supports FP8 compute.
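For instance, to pin ComfyUI to one GPU on a multi-GPU machine (device ID 1 is illustrative; the --cuda-device spelling follows ComfyUI's CLI):

```shell
# Use only the second CUDA GPU; all other devices become invisible to ComfyUI.
python main.py --cuda-device 1
```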
VRAM Management
--gpu-only: Store and run everything (text encoders, CLIP, VAE, etc.) on the GPU. Requires high VRAM.
--highvram: Keep models in GPU memory after use instead of unloading them to the CPU. Requires roughly 12 GB of VRAM or more.
--normalvram: Force normal VRAM usage mode. Use if lowvram mode gets enabled automatically.
--lowvram: Split the UNet model to use less VRAM. For GPUs with 4-8 GB of VRAM.
--novram: Extreme memory-saving mode. Use when lowvram isn't enough.
--cpu: Run everything on the CPU. Very slow, but works without a GPU.
--reserve-vram: Reserve an amount of VRAM (in GB) for the OS and other software.
--disable-smart-memory: Force aggressive offloading to RAM instead of keeping models in VRAM when possible.
Disable dynamic VRAM management and use estimate-based model loading.
--async-offload: Use async weight offloading with the specified number of streams. Enabled by default on NVIDIA.
--disable-async-offload: Disable async weight offloading.
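These options are points on a speed-versus-VRAM trade-off, so pick one mode and optionally reserve headroom. An illustrative launch for a card with around 6 GB of VRAM (the values are examples; flag spellings follow ComfyUI's CLI):

```shell
# Split the UNet to fit limited VRAM and keep 1 GB free for the OS/display.
python main.py --lowvram --reserve-vram 1.0
```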
Precision and Data Types
Global Precision
--force-fp32: Force FP32 precision globally. Use if your GPU has issues with FP16.
--force-fp16: Force FP16 precision globally.
UNet/Diffusion Model Precision
--fp32-unet: Run the diffusion model in FP32 precision.
--fp16-unet: Run the diffusion model in FP16 precision.
--bf16-unet: Run the diffusion model in BF16 precision.
--fp64-unet: Run the diffusion model in FP64 precision (very slow, for debugging).
--fp8_e4m3fn-unet: Store UNet weights in FP8 E4M3FN format.
--fp8_e5m2-unet: Store UNet weights in FP8 E5M2 format.
--fp8_e8m0fnu-unet: Store UNet weights in FP8 E8M0FNU format.
VAE Precision
--fp16-vae: Run the VAE in FP16 precision. May cause black images on some GPUs.
--fp32-vae: Run the VAE in full FP32 precision. Use if you get black images.
--bf16-vae: Run the VAE in BF16 precision.
--cpu-vae: Run the VAE on the CPU instead of the GPU. Slower, but uses less VRAM.
Text Encoder Precision
--fp8_e4m3fn-text-enc: Store text encoder weights in FP8 (E4M3FN variant).
--fp8_e5m2-text-enc: Store text encoder weights in FP8 (E5M2 variant).
--fp16-text-enc: Store text encoder weights in FP16.
--fp32-text-enc: Store text encoder weights in FP32.
--bf16-text-enc: Store text encoder weights in BF16.
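UNet, VAE, and text encoder precision flags compose independently. One illustrative memory-saving combination (an example, not a recommendation; flag spellings follow ComfyUI's CLI) stores the UNet and text encoder in FP8 while keeping the VAE in FP32 to avoid black images:

```shell
python main.py --fp8_e4m3fn-unet --fp8_e4m3fn-text-enc --fp32-vae
```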
CUDA Configuration
--cuda-malloc: Enable cudaMallocAsync (enabled by default for PyTorch 2.0 and later).
--disable-cuda-malloc: Disable cudaMallocAsync. Use if you get CUDA errors.
--force-channels-last: Force the channels-last memory format when running model inference.
Force non-blocking operations for all tensors. May improve performance on non-NVIDIA systems.
Disable pinned memory usage.
Attention Mechanisms
--use-split-cross-attention: Use the split cross-attention optimization. Ignored when xformers is used.
--use-quad-cross-attention: Use the sub-quadratic cross-attention optimization. Ignored when xformers is used.
--use-pytorch-cross-attention: Use the native PyTorch 2.0+ cross-attention function.
--use-sage-attention: Use the Sage Attention implementation.
--use-flash-attention: Use FlashAttention for faster attention computation.
--disable-xformers: Disable the xformers optimization.
--force-upcast-attention: Force-enable attention upcasting. Use if you get black images.
--dont-upcast-attention: Disable all attention upcasting. For debugging only.
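Only one attention backend is active at a time. On PyTorch 2.0+, a common choice is the native implementation (flag spelling per ComfyUI's CLI):

```shell
# Use PyTorch's built-in scaled dot-product attention instead of xformers.
python main.py --use-pytorch-cross-attention
```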
Caching Strategies
--cache-classic: Use the old-style aggressive caching (default behavior).
--cache-lru: Use LRU caching with a maximum of N node results cached. May use more RAM/VRAM.
--cache-none: Disable caching to reduce RAM/VRAM usage. Executes every node on each run.
--cache-ram: Use RAM-pressure caching with the specified headroom threshold (in GB). Removes large items when RAM runs low.
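As a sketch, bounding the cache to the 10 most recently used node results (the number is illustrative; flag spelling per ComfyUI's CLI) looks like:

```shell
python main.py --cache-lru 10
```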
Preview Configuration
--preview-method: Default preview method for sampler nodes. Options: none, auto, latent2rgb, taesd.
--preview-size: Set the maximum preview size for sampler nodes.
File Loading Optimizations
--mmap-torch-files: Use memory mapping when loading .ckpt/.pt files.
--disable-mmap: Don't use memory mapping when loading safetensors files.
--default-hashing-function: Hash function for duplicate filename/content comparison. Options: md5, sha1, sha256, sha512.

Performance Optimizations
--fast: Enable experimental optimizations that may reduce quality. Use without arguments to enable all, or specify individual features. Available optimizations:
fp16_accumulation - Use FP16 accumulation
fp8_matrix_mult - Use FP8 matrix multiplication
cublas_ops - Use cuBLAS operations
autotune - Enable autotuning
--deterministic: Make PyTorch use slower deterministic algorithms when possible. This may not make images fully deterministic in all cases.
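For example, to opt into only a subset of the experimental optimizations rather than all of them (the space-separated value syntax is an assumption about the argument format):

```shell
# Enable only FP16 accumulation and cuBLAS operations.
python main.py --fast fp16_accumulation cublas_ops
```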
Custom Nodes
--disable-all-custom-nodes: Disable loading of all custom nodes.
--whitelist-custom-nodes: Specify custom node folders to load even when --disable-all-custom-nodes is enabled.
--disable-api-nodes: Disable loading of all API nodes. Also prevents the frontend from communicating with the internet.
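Together these allow a locked-down instance that loads only vetted node folders and never talks to the internet. The folder name below is illustrative, and the --whitelist-custom-nodes/--disable-api-nodes spellings are assumptions based on ComfyUI's CLI:

```shell
python main.py --disable-all-custom-nodes \
  --whitelist-custom-nodes comfyui-trusted-pack --disable-api-nodes
```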
ComfyUI Manager
--enable-manager: Enable the ComfyUI-Manager feature.
Disable only the ComfyUI-Manager UI and endpoints. Background tasks still operate. Requires --enable-manager.
Enable the legacy UI of ComfyUI-Manager. Requires --enable-manager.

Frontend Configuration
--front-end-version: Specify the frontend version to use. Format: [owner]/[repo]@[version].
--front-end-root: Local filesystem path to a frontend directory. Overrides --front-end-version.

Multi-User and Database
--multi-user: Enable per-user storage.
--database-url: Specify the database URL.
Disable asset scanning on startup for database synchronization.
--comfy-api-base: Set the base URL for the ComfyUI API.
Metadata and Output
--disable-metadata: Disable saving prompt metadata in output files.
Logging
--verbose: Set the logging level. Options: DEBUG, INFO, WARNING, ERROR, CRITICAL.
--log-stdout: Send process output to stdout instead of stderr.
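For example, debug-level logging routed to stdout so it can be redirected to a file (flag spellings per ComfyUI's CLI; the filename is illustrative):

```shell
python main.py --verbose DEBUG --log-stdout > comfyui.log
```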
Development and Testing
--quick-test-for-ci: Quick test mode for continuous integration.
--windows-standalone-build: Enable features for the Windows standalone build (auto-launch, etc.).