These instructions cover a full installation of h2oGPT on Windows 10 or 11, including optional GPU (CUDA) support. A full installation requires approximately 9 GB of disk space if no optional packages are skipped.
Docker is recommended for full capabilities on Windows, including vision, audio, and image generation. See the Docker install guide.

Prerequisites

Install the following tools before running the h2oGPT installer.
1. Install Visual Studio 2022

Download Visual Studio 2022 Community and run the installer.
  1. Click Individual Components.
  2. Search for and select each of the following:
    • Windows 11 SDK (e.g. 10.0.22000.0)
    • C++ Universal Windows Platform support (for v143 build tools)
    • MSVC VS 2022 C++ x64/x86 build tools (latest)
    • C++ CMake tools for Windows
  3. Click Install. You do not need to launch Visual Studio when installation completes.
2. Install MinGW

Download the MinGW installer and run it.
  1. Select the following packages:
    • mingw32-base
    • mingw32-gcc-g++
  2. Go to Installation → Apply Changes.
3. Install Miniconda

Download and install Miniconda for Windows.
4. Install the latest NVIDIA driver (GPU only)

Download and install the latest NVIDIA driver for Windows. Confirm it works:
nvidia-smi

Install h2oGPT

Open the Miniconda shell (not PowerShell) as Administrator.
1. Add MinGW to your path

set path=%path%;c:\MinGW\msys\1.0\bin\
On some systems the correct path is c:\MinGW\bin\ instead.
2. Create the conda environment

conda create -n h2ogpt -y
conda activate h2ogpt
conda install python=3.10 -c conda-forge -y
python --version
python -c "import os, sys ; print('hello world')"
The output should show 3.10.xx and print hello world.
3. Install CUDA via conda (GPU only)

conda install cudatoolkit=11.8 -c conda-forge -y
set CUDA_HOME=%CONDA_PREFIX%
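Note that set only lasts for the current shell session. If you want CUDA_HOME restored automatically whenever the environment is activated, conda can store it with the environment (an optional alternative, not required by the installer):

```shell
REM Store CUDA_HOME inside the h2ogpt environment, then reactivate to apply it
conda env config vars set CUDA_HOME=%CONDA_PREFIX%
conda activate h2ogpt
REM List the stored variables to confirm
conda env config vars list
```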
4. Install Git and clone h2oGPT

conda install -c conda-forge git
git clone https://github.com/h2oai/h2ogpt.git
cd h2ogpt
5. Set the PyTorch index URL

set PIP_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu118 https://huggingface.github.io/autogptq-index/whl/cu118/
6. Set llama_cpp_python build flags (GPU only)

set CMAKE_ARGS=-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=all
set GGML_CUDA=1
set FORCE_CMAKE=1
Building for all CUDA architectures takes time, but it is required: llama_cpp_python fails to build if all is omitted from CMAKE_CUDA_ARCHITECTURES.
7. Run the installer script

docs\windows_install.bat
To include GPL-licensed packages (PyMuPDF, etc.):
set GPLOK=1
docs\windows_install.bat
The script also contains instructions for enabling Microsoft Word/Excel support and Tesseract OCR. Open docs\windows_install.bat to review or customize the installation.
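After the installer completes, one way to smoke-test the environment is to launch the UI with a base model. The model name below is only an illustrative example; substitute any Hugging Face model you have access to:

```shell
REM Launch the h2oGPT UI; the base model shown is illustrative
python generate.py --base_model=h2oai/h2ogpt-4096-llama2-7b-chat
```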

Notes

  • Models are cached in C:\Users\<user>\.cache\ (huggingface, chroma, torch, etc.).
  • For Windows paths in h2oGPT arguments, use absolute paths: --user_path=C:\Users\YourUsername\h2ogpt.
  • Use set CUDA_VISIBLE_DEVICES=0 to pin inference to a specific GPU.
  • If the model is using the GPU, python.exe appears in Compute (C) mode in nvidia-smi.
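Putting the notes above together, a typical GPU launch might look like this (the path is illustrative):

```shell
REM Pin inference to GPU 0 and pass an absolute user_path
set CUDA_VISIBLE_DEVICES=0
python generate.py --user_path=C:\Users\YourUsername\h2ogpt
REM While a model runs, nvidia-smi in a second shell should show python.exe in Compute (C) mode
```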

Troubleshooting

SSL certificate failure when connecting to Hugging Face
Your organization may be blocking Hugging Face. Try using a proxy or see the Hugging Face transformers issue tracker.

Import errors at startup
Set PYTHONPATH in your launch script:
SET PYTHONPATH=.:src:%PYTHONPATH%
python generate.py ...
Using bash on Windows
For easier command-line operations, install Git for Windows, which includes bash and coreutils.

Environment variables

You can override any generate.py CLI argument by setting an environment variable prefixed with h2ogpt_. Useful environment variables:
  • h2ogpt_n_jobs — number of cores used for various tasks
  • OMP_NUM_THREADS — thread count for LLaMa
  • CUDA_VISIBLE_DEVICES — which GPUs to use (e.g. set CUDA_VISIBLE_DEVICES=0)
  • h2ogpt_server_name — set to your LAN IP address (e.g. 192.168.1.172) to expose h2oGPT on your local network
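For example, a small launch script could combine these overrides (the IP address and thread count are illustrative):

```shell
REM Limit LLaMa to 8 threads and expose the UI on the LAN
set OMP_NUM_THREADS=8
set h2ogpt_server_name=192.168.1.172
python generate.py
```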
To shut down h2oGPT, go to the System tab in the UI, click Admin, then click Shutdown h2oGPT.
