These instructions are for Ubuntu x86_64. Substitute apt-get with the appropriate package manager for other distributions.

Quick install (script)

1. Install the CUDA toolkit (GPU only)

Skip this step for CPU-only inference.
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
sudo sh cuda_12.1.1_530.30.02_linux.run
Install only the toolkit; when prompted, do not overwrite the existing driver or the /usr/local/cuda symlink.
2. Run the installation script

curl -fsSL https://h2o-release.s3.amazonaws.com/h2ogpt/linux_install_full.sh | bash
Enter your sudo password when prompted. To avoid repeated password prompts, extend the sudo timeout first:
sudo visudo
# Add after the "Defaults env_reset" line:
# Defaults        timestamp_timeout=60
Then start and exit a root shell once to cache your credentials:
sudo bash
exit
3. Activate the h2oGPT environment

conda activate h2ogpt
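To confirm the environment is actually active, a quick sanity check can be run from Python (a sketch; env_active is a hypothetical helper, not part of h2oGPT):

```python
import sys

def env_active(env_name, prefix=None):
    """Heuristic check: does the interpreter prefix point into the named env?"""
    prefix = sys.prefix if prefix is None else prefix
    return env_name in prefix

# After `conda activate h2ogpt`, sys.prefix should end in .../envs/h2ogpt
print(sys.prefix)
print(env_active("h2ogpt"))
```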

Manual install

1. Set up a Python 3.10 environment with Miniconda

wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh
bash ./Miniconda3-py310_23.1.0-1-Linux-x86_64.sh -b -p $HOME/miniconda3

echo '### Conda init ###' >> $HOME/.bashrc
echo 'source $HOME/miniconda3/etc/profile.d/conda.sh' >> $HOME/.bashrc
echo 'conda activate' >> $HOME/.bashrc
source $HOME/.bashrc

conda update conda -y
conda create -n h2ogpt -y
conda activate h2ogpt
conda install python=3.10 -c conda-forge -y
Verify the Python version:
python --version
python -c "import os, sys; print('hello world')"
The output should show 3.10.xx and print hello world.
2. Clone h2oGPT

git clone https://github.com/h2oai/h2ogpt.git
cd h2ogpt
3. Install the CUDA toolkit and set environment variables (GPU only)

Skip this step for CPU inference.
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
sudo sh cuda_12.1.1_530.30.02_linux.run
Add CUDA to your environment:
echo 'export CUDA_HOME=/usr/local/cuda-12.1' >> $HOME/.bashrc
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64' >> $HOME/.bashrc
echo 'export PATH=$PATH:$CUDA_HOME/bin' >> $HOME/.bashrc
source $HOME/.bashrc
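To verify the exports took effect in a fresh shell, a small check can help (a sketch; cuda_env_report is a hypothetical helper, and the expected values assume the CUDA 12.1 paths above):

```python
import os
import shutil

def cuda_env_report():
    """Summarize the CUDA-related settings the exports above should establish."""
    return {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "ld_library_path_has_cuda": "cuda" in os.environ.get("LD_LIBRARY_PATH", "").lower(),
    }

print(cuda_env_report())
```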
4. Set the PyTorch index URL

export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu121 https://huggingface.github.io/autogptq-index/whl/cu121"
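pip splits a space-separated PIP_EXTRA_INDEX_URL into one --extra-index-url option per URL, which is why both the PyTorch and AutoGPTQ wheel indexes fit in a single variable:

```python
import os

# The same value exported above; pip splits it on whitespace into
# one --extra-index-url option per URL.
os.environ["PIP_EXTRA_INDEX_URL"] = (
    "https://download.pytorch.org/whl/cu121 "
    "https://huggingface.github.io/autogptq-index/whl/cu121"
)
urls = os.environ["PIP_EXTRA_INDEX_URL"].split()
print(urls)
```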
5. Set llama_cpp_python build flags (GPU only)

export GGML_CUDA=1
export CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=all"
export FORCE_CMAKE=1
Building for all CUDA architectures takes time but is required; llama_cpp_python fails if all is omitted from CMAKE_CUDA_ARCHITECTURES.
6. Run the installation script

bash docs/linux_install.sh
To include GPL-licensed packages (PyMuPDF, etc.):
GPLOK=1 bash docs/linux_install.sh
You can comment out optional sections in the script to skip packages you do not need.

Run h2oGPT

Verify that CUDA is visible to PyTorch (GPU only):
python -c "import torch; print(torch.cuda.is_available())"  # should print True
Place documents for Q&A in a user_path directory, then start h2oGPT:
python generate.py \
  --base_model=h2oai/h2ogpt-4096-llama2-13b-chat \
  --load_8bit=True \
  --score_model=None \
  --langchain_mode='UserData' \
  --user_path=user_path
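The launch above can also be driven from a Python wrapper script (a sketch; build_cmd is a hypothetical helper that just assembles the same argument list as the command shown):

```python
import shlex

def build_cmd(base_model, user_path="user_path"):
    """Assemble the generate.py invocation shown above as an argv list."""
    return [
        "python", "generate.py",
        "--base_model=" + base_model,
        "--load_8bit=True",
        "--score_model=None",
        "--langchain_mode=UserData",
        "--user_path=" + user_path,
    ]

cmd = build_cmd("h2oai/h2ogpt-4096-llama2-13b-chat")
print(shlex.join(cmd))  # pass `cmd` to subprocess.run to launch
```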
Models are cached in ~/.cache/ (huggingface, chroma, torch, etc.). Open http://localhost:7860 after the server starts.
Add --share=True to expose a public Gradio URL for remote access.
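Before opening the browser, you can confirm the server is listening (a sketch using only the standard library; server_up is a hypothetical helper and 7860 is the default Gradio port from above):

```python
import urllib.request
import urllib.error

def server_up(url="http://localhost:7860", timeout=2.0):
    """Return True if an HTTP server answers at the URL, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

print(server_up())
```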

Troubleshooting

undefined symbol error with flash_attn
Ensure CUDA_HOME matches the toolkit version used to build h2oGPT, then reinstall:
export CUDA_HOME=/usr/local/cuda-12.1
pip uninstall flash_attn autoawq autoawq-kernels -y
pip install flash_attn autoawq autoawq-kernels --no-cache-dir
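To confirm the reinstall fixed the problem, a quick import probe works (a sketch; the import names flash_attn and awq, the module installed by autoawq, are assumptions about these packages):

```python
import importlib

def import_error(module):
    """Try importing a module; return the error message on failure, else None."""
    try:
        importlib.import_module(module)
        return None
    except Exception as exc:  # undefined-symbol problems surface as ImportError
        return str(exc)

for mod in ("flash_attn", "awq"):
    print(mod, "->", import_error(mod) or "OK")
```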
protobuf import error
pip install protobuf==3.20.0
Ubuntu 18 (very out of date)
Only run the commands below on Ubuntu 18. Do not run them on Ubuntu 20 or 22.
apt-get clean all
apt-get update
apt-get -y full-upgrade
apt-get -y dist-upgrade
apt-get -y autoremove
apt-get clean all
