ComfyUI can be run as a headless server for production environments where the web interface is not needed. This guide covers the command-line options and configurations for running ComfyUI in headless mode.
Basic Headless Setup
To run ComfyUI as a headless server:
python main.py --listen 0.0.0.0 --port 8188
By default, ComfyUI only listens on 127.0.0.1 (localhost). Use --listen 0.0.0.0 to accept external connections.
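Once the server is up, clients talk to it over plain HTTP; ComfyUI exposes a `/system_stats` endpoint that is handy for readiness checks. The helpers below are a minimal sketch (the function names are our own, not part of ComfyUI) showing how a client might derive a usable URL from the `--listen`/`--port` settings and poll until the server answers:

```python
import json
import time
import urllib.request


def client_base_url(listen: str = "127.0.0.1", port: int = 8188) -> str:
    """Build the URL clients should use. When the server listens on all
    interfaces (0.0.0.0 / ::), connect via a concrete address instead."""
    host = "127.0.0.1" if listen in ("0.0.0.0", "::", "0.0.0.0,::") else listen
    return f"http://{host}:{port}"


def wait_until_ready(base_url: str, timeout: float = 30.0) -> dict:
    """Poll /system_stats until the server responds; raises on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(f"{base_url}/system_stats", timeout=2) as r:
                return json.load(r)  # parsed system/device info
        except OSError:
            if time.monotonic() >= deadline:
                raise
            time.sleep(0.5)
```

A startup script can call `wait_until_ready(client_base_url("0.0.0.0", 8188))` before routing traffic to the server.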
Essential Command-Line Options
Network Configuration
--listen
string, default: "127.0.0.1"
IP address to listen on. Use 0.0.0.0 for all IPv4 interfaces or 0.0.0.0,:: for both IPv4 and IPv6.
# Listen on all interfaces
python main.py --listen 0.0.0.0
# Listen on specific IP
python main.py --listen 192.168.1.100
# Listen on multiple IPs
python main.py --listen 127.0.0.1,192.168.1.100
--port
Set the listen port.
python main.py --port 8080
--dont-print-server
Suppress server output logs.
python main.py --dont-print-server
Security Options
--tls-keyfile
Path to the TLS key file for HTTPS support.
--tls-certfile
Path to the TLS certificate file for HTTPS support.
python main.py --tls-keyfile key.pem --tls-certfile cert.pem
--enable-cors-header
Enable CORS with an optional origin. Use * (or omit the value) to allow all origins.
# Allow all origins
python main.py --enable-cors-header
# Allow specific origin
python main.py --enable-cors-header https://example.com
--max-upload-size
Maximum upload size in MB.
python main.py --max-upload-size 500
Directory Configuration
--base-directory
Set the ComfyUI base directory for all subdirectories (models, custom_nodes, input, output, temp, user).
python main.py --base-directory /data/comfyui
--output-directory
Set the output directory. Overrides --base-directory.
python main.py --output-directory /mnt/outputs
--input-directory
Set the input directory. Overrides --base-directory.
--temp-directory
Set the temp directory. Overrides --base-directory.
--user-directory
Set the user directory. Overrides --base-directory.
--extra-model-paths-config
Load additional model path configuration files.
python main.py --extra-model-paths-config /path/to/config.yaml
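A model-paths config file maps model types to directories outside the ComfyUI tree. The fragment below follows the key names used in the `extra_model_paths.yaml.example` file shipped with ComfyUI; the top-level key and the paths are placeholders for your own setup:

```yaml
comfyui:
    base_path: /data/models/
    checkpoints: checkpoints/
    vae: vae/
    loras: loras/
    embeddings: embeddings/
```

Relative entries are resolved against `base_path`, so shared model storage can be mounted once and reused across installs.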
VRAM Management
GPU Only (--gpu-only): store and run everything, including text encoders and VAE, on the GPU.
High VRAM (--highvram): keep models in GPU memory instead of unloading them after use.
Normal VRAM (--normalvram): force normal VRAM use if low VRAM gets enabled automatically.
Low VRAM (--lowvram): split the UNet into parts to use less VRAM.
No VRAM (--novram): for when --lowvram is not enough.
CPU Only (--cpu): run everything on the CPU (slow).
# Keep everything on GPU
python main.py --gpu-only
--reserve-vram
Reserve VRAM in GB for the OS/other software.
python main.py --reserve-vram 2.0
GPU Selection
--cuda-device
Set the CUDA device ID. Other devices will be hidden.
python main.py --cuda-device 0
--default-device
Set the default device ID while keeping other devices visible.
Precision Options
Force FP32 (--force-fp32): force fp32 precision everywhere.
Force FP16 (--force-fp16): force fp16 precision.
FP16 VAE (--fp16-vae): run the VAE in fp16 (might cause black images).
FP32 VAE (--fp32-vae): run the VAE in full-precision fp32.
BF16 UNet (--bf16-unet): run the UNet in bf16.
python main.py --force-fp32
Attention Mechanisms
PyTorch Attention (--use-pytorch-cross-attention)
Split Attention (--use-split-cross-attention)
Quad Attention (--use-quad-cross-attention)
Sage Attention (--use-sage-attention)
Flash Attention (--use-flash-attention)
python main.py --use-pytorch-cross-attention
Caching Options
--cache-classic
Old style aggressive caching (default behavior).
python main.py --cache-classic
--cache-lru
Use LRU caching with a maximum of N node results cached. May use more RAM/VRAM.
python main.py --cache-lru 10
--cache-ram
Use RAM pressure caching with the given headroom threshold in GB.
python main.py --cache-ram 4.0
--cache-none
Reduced RAM/VRAM usage at the expense of executing every node each run.
python main.py --cache-none
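The LRU option trades memory for speed: it keeps the N most recently used node results and evicts the stalest one when the budget is exceeded. The class below is a minimal sketch of that eviction policy, not ComfyUI's actual cache implementation:

```python
from collections import OrderedDict


class NodeResultCache:
    """Sketch of the eviction behavior behind --cache-lru N: keep at
    most `max_entries` node results, dropping the least recently used."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self._cache: "OrderedDict[str, object]" = OrderedDict()

    def get(self, node_id: str):
        if node_id not in self._cache:
            return None  # cache miss: the node must be re-executed
        self._cache.move_to_end(node_id)  # mark as recently used
        return self._cache[node_id]

    def put(self, node_id: str, result: object) -> None:
        self._cache[node_id] = result
        self._cache.move_to_end(node_id)
        while len(self._cache) > self.max_entries:
            self._cache.popitem(last=False)  # evict least recently used
```

With `--cache-lru 10`, up to ten node results survive between runs, so re-queuing a workflow with one changed parameter only re-executes the nodes downstream of the change.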
Custom Nodes and Extensions
--disable-all-custom-nodes
Disable loading all custom nodes.
python main.py --disable-all-custom-nodes
--whitelist-custom-nodes
Specify custom node folders to load even when --disable-all-custom-nodes is enabled.
python main.py --disable-all-custom-nodes --whitelist-custom-nodes node1 node2
--disable-api-nodes
Disable API nodes that communicate with external services.
python main.py --disable-api-nodes
Logging and Debugging
--verbose
Set the logging level: DEBUG, INFO, WARNING, ERROR, or CRITICAL.
python main.py --verbose DEBUG
--log-stdout
Send process output to stdout instead of stderr.
python main.py --log-stdout
Multi-User Mode
--multi-user
Enable per-user storage for multi-tenant environments.
python main.py --multi-user
Preview Options
--preview-method
Default preview method: none, auto, latent2rgb, or taesd.
python main.py --preview-method taesd
--preview-size
Maximum preview size for sampler nodes.
python main.py --preview-size 1024
Production Example
Here’s a complete example for a production headless server:
#!/bin/bash
python main.py \
--listen 0.0.0.0 \
--port 8188 \
--output-directory /data/outputs \
--temp-directory /data/temp \
--extra-model-paths-config /etc/comfyui/models.yaml \
--highvram \
--preview-method taesd \
--disable-all-custom-nodes \
--whitelist-custom-nodes essential-nodes \
--verbose INFO \
--enable-cors-header \
--cache-lru 5
Always configure proper firewall rules when running with --listen 0.0.0.0 to prevent unauthorized access.
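For unattended servers it is common to run ComfyUI under a process supervisor so it restarts after crashes and starts on boot. The systemd unit below is one possible setup; the install path, virtualenv location, and service user are placeholders for your environment:

```ini
[Unit]
Description=ComfyUI headless server
After=network-online.target
Wants=network-online.target

[Service]
User=comfyui
WorkingDirectory=/opt/ComfyUI
ExecStart=/opt/ComfyUI/venv/bin/python main.py --listen 0.0.0.0 --port 8188
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Save it as /etc/systemd/system/comfyui.service, then enable it with systemctl enable --now comfyui.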
AMD GPUs (ROCm)
# Linux with ROCm
TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 python main.py --use-pytorch-cross-attention
# Enable tunable ops (slow first run)
PYTORCH_TUNABLEOP_ENABLED=1 python main.py
# For unsupported cards
HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py # RDNA2 (6700, 6600)
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py # RDNA3 (7600)
Intel GPUs
python main.py --oneapi-device-selector level_zero:0
DirectML (Windows)
python main.py --directml 0
Memory Optimizations
Dynamic VRAM
# Enabled by default on NVIDIA
python main.py
# Force disable
python main.py --disable-dynamic-vram
Async Offload
# Use async weight offloading (2 streams)
python main.py --async-offload
# Custom stream count
python main.py --async-offload 4
# Disable
python main.py --disable-async-offload
Smart Memory
# Disable smart memory (aggressive offload)
python main.py --disable-smart-memory
Pinned Memory
# Disable pinned memory
python main.py --disable-pinned-memory
Next Steps
Python Integration Learn how to integrate ComfyUI into Python applications
Docker Deployment Deploy ComfyUI using containers