Docker is recommended for full capabilities on Windows, including vision, audio, and image generation. See the Docker install guide.
Prerequisites
Install the following tools before running the h2oGPT installer.

Install Visual Studio 2022
Download Visual Studio 2022 Community and run the installer.
- Click Individual Components.
- Search for and select each of the following:
  - Windows 11 SDK (e.g. 10.0.22000.0)
  - C++ Universal Windows Platform support (for v143 build tools)
  - MSVC VS 2022 C++ x64/x86 build tools (latest)
  - C++ CMake tools for Windows
- Click Install. You do not need to launch Visual Studio when installation completes.
Install MinGW
Download the MinGW installer and run it.
- Select the following packages:
  - mingw32-base
  - mingw32-gcc-g++
- Go to Installation → Apply Changes.
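Once Apply Changes finishes, the toolchain can be sanity-checked from a new shell. This assumes MinGW's bin directory (commonly C:\MinGW\bin, though your install location may differ) has been added to PATH:

```shell
:: Verify the MinGW compilers are reachable from the shell
gcc --version
g++ --version
```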
Install Miniconda
Download and install Miniconda for Windows.
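With Miniconda installed, a dedicated environment keeps h2oGPT's dependencies isolated. A minimal sketch follows; the Python version is an assumption, so check the h2oGPT README for the currently supported one:

```shell
:: Create and activate an isolated environment for h2oGPT
:: (python=3.10 is illustrative, not confirmed by this guide)
conda create -n h2ogpt python=3.10 -y
conda activate h2ogpt
```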
Install h2oGPT
Open the Miniconda shell (not PowerShell) as Administrator.

Set llama_cpp_python build flags (GPU only)
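A sketch of what this step might look like, assuming llama_cpp_python's standard CMAKE_ARGS/FORCE_CMAKE build hooks. The exact CUDA flag name varies by llama_cpp_python version, so verify against its install documentation:

```shell
:: Build llama_cpp_python for all CUDA architectures (cmd.exe syntax).
:: CMAKE_ARGS and FORCE_CMAKE are llama_cpp_python's build-time hooks;
:: the CUDA toggle (-DGGML_CUDA vs. older -DLLAMA_CUBLAS) depends on version.
set CMAKE_ARGS=-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=all
set FORCE_CMAKE=1
pip install llama_cpp_python --no-cache-dir --force-reinstall
```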
Building with all CUDA architectures takes time but is required: llama_cpp_python fails if `all` is omitted from `CMAKE_CUDA_ARCHITECTURES`.

Notes
- Models are cached in `C:\Users\<user>\.cache\` (huggingface, chroma, torch, etc.).
- For Windows paths in h2oGPT arguments, use absolute paths, e.g. `--user_path=C:\Users\YourUsername\h2ogpt`.
- Use `set CUDA_VISIBLE_DEVICES=0` to pin inference to a specific GPU.
- If the model is using the GPU, `python.exe` appears in Compute (C) mode in `nvidia-smi`.
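Putting the GPU-related notes together, a typical launch check might look like the following sketch (the `--user_path` value is illustrative):

```shell
:: Pin h2oGPT to GPU 0, then start it (cmd.exe syntax)
set CUDA_VISIBLE_DEVICES=0
python generate.py --user_path=C:\Users\YourUsername\h2ogpt

:: In a second Miniconda shell: python.exe should appear in Compute (C) mode
nvidia-smi
```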
Troubleshooting
SSL certificate failure when connecting to Hugging Face: your organization may be blocking Hugging Face. Try using a proxy or see the Hugging Face transformers issue tracker.

Import errors at startup: set `PYTHONPATH` in your launch script.
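For the import-error case, a hedged sketch of such a launch script (the checkout path is illustrative, not a path this guide specifies):

```shell
:: Make the h2oGPT source tree importable before launching
set PYTHONPATH=C:\Users\YourUsername\h2ogpt
python generate.py
```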
Environment variables
You can override any `generate.py` CLI argument by setting an environment variable prefixed with `h2ogpt_`:
- `n_jobs`: number of cores for various tasks
- `OMP_NUM_THREADS`: thread count for LLaMa
- `CUDA_VISIBLE_DEVICES`: which GPUs to use (e.g. `set CUDA_VISIBLE_DEVICES=0`)
- `h2ogpt_server_name`: set to your LAN IP address (e.g. `192.168.1.172`) to expose h2oGPT on your local network
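As an example, the server-name override above could be applied before launch like this (the IP address is illustrative; use your machine's LAN address):

```shell
:: Expose h2oGPT on the local network via an environment-variable override
set h2ogpt_server_name=192.168.1.172
python generate.py
```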