H2O-3 is an open-source, distributed machine learning platform for training, tuning, and deploying models at scale. It offers Python, R, Java, and Scala interfaces plus the web-based Flow UI, and it runs on Hadoop, Spark, Kubernetes, and standalone clusters.

Quickstart

Get a model trained in under 5 minutes with Python or R

Installation

Install H2O-3 via pip, CRAN, or download the standalone jar

AutoML

Automatically train and rank hundreds of models with a single call

Algorithm Reference

GBM, XGBoost, Random Forest, Deep Learning, GLM, GAM, and more

What you can do with H2O-3

AutoML

Automatically trains and ranks models — no ML expertise required

Distributed Training

Scale across a cluster using in-memory distributed computing

Model Explainability

SHAP values, partial dependence plots, and variable importance, all built in

MOJO Deployment

Export models as MOJO or POJO for fast scoring without a running H2O cluster

Flow Web UI

Interactive notebook-style UI for data exploration and modeling

Multi-language API

Consistent API across Python, R, and REST

Get up and running

1. Install H2O-3

Install the Python package from PyPI or the R package from CRAN.
pip install h2o
2. Start the H2O cluster

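Initialize H2O. This connects to an H2O instance already running locally or, if none is found, starts one as a separate Java process, so a Java runtime must be installed.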
import h2o
h2o.init()
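`h2o.init()` also accepts options for sizing the local cluster. The following is a minimal sketch, assuming a Java runtime is available and that 2 threads and 2 GB of memory are acceptable illustrative values. `cluster_options` and `start_local_cluster` are hypothetical helper names, not part of the H2O API; `nthreads`, `max_mem_size`, and `h2o.cluster()` come from the h2o package.

```python
def cluster_options(threads=2, memory_gb=2):
    """Build keyword arguments for h2o.init() (hypothetical helper).

    nthreads and max_mem_size are h2o.init() parameters; the default
    values here are illustrative assumptions, not recommendations.
    """
    return {"nthreads": threads, "max_mem_size": f"{memory_gb}G"}


def start_local_cluster(**options):
    """Connect to a running H2O instance, or launch a local one if none is found."""
    import h2o  # imported lazily so this module loads without h2o installed

    h2o.init(**options)
    # The returned handle exposes show_status() and shutdown()
    return h2o.cluster()
```

Calling `start_local_cluster(**cluster_options())` returns a cluster handle; calling `shutdown()` on it stops the local instance when you are finished.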
3. Load data and train a model

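Import a dataset and train your first model.
from h2o.estimators import H2OGradientBoostingEstimator

# Load the iris CSV into an H2OFrame held by the cluster
df = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/iris/iris_wheader.csv")
# Train a GBM with default settings; every column except "class" is used as a predictor
model = H2OGradientBoostingEstimator()
model.train(y="class", training_frame=df)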
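The call above trains on the full frame, so it gives no estimate of how the model performs on unseen rows. A minimal sketch of holding out part of the data, assuming `df` is an H2OFrame such as the one loaded above; `predictor_columns` and `train_gbm_with_holdout` are hypothetical helper names, and the 80/20 ratio and seed are illustrative choices.

```python
def predictor_columns(columns, target):
    """Return every column name except the target (hypothetical helper)."""
    return [name for name in columns if name != target]


def train_gbm_with_holdout(df, target="class", train_fraction=0.8, seed=42):
    """Train a GBM on part of an H2OFrame and evaluate it on the remainder."""
    from h2o.estimators import H2OGradientBoostingEstimator  # lazy import

    # split_frame returns one frame per ratio plus a remainder frame;
    # the split is randomized, so the sizes are approximate
    train, valid = df.split_frame(ratios=[train_fraction], seed=seed)

    model = H2OGradientBoostingEstimator(seed=seed)
    model.train(
        x=predictor_columns(df.columns, target),  # explicit predictor list
        y=target,
        training_frame=train,
        validation_frame=valid,
    )
    # Metrics computed on rows the model did not train on
    return model, model.model_performance(valid)
```

Passing `x` explicitly is optional here, since omitting it uses all non-target columns, but it makes the predictor set visible and easy to restrict.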
4. Make predictions

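Score data and export your model for production. This example scores the training frame for brevity; in practice, pass a frame of new rows.
# Returns an H2OFrame with the predicted class and per-class probabilities
predictions = model.predict(df)
# Write the model as a MOJO file into the current directory
model.download_mojo(path="./")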

Supported algorithms

H2O-3 includes production-ready implementations of the most widely used ML algorithms:
| Algorithm | Best for |
| --- | --- |
| GBM / XGBoost | Tabular regression & classification |
| Random Forest | Robust baseline, interpretable |
| Deep Learning | Complex patterns, multi-layer networks |
| GLM / GAM | Interpretable linear & additive models |
| AutoML | Automatic model selection & tuning |
| Stacked Ensembles | Combining multiple models |
| K-Means / PCA | Clustering & dimensionality reduction |
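The AutoML entry in the table can be tried with a few lines. This is a minimal sketch, assuming an H2O cluster is already running and `df` is an H2OFrame with a target column; `automl_limits` and `run_automl` are hypothetical helper names, and the limits of 10 models and 300 seconds are illustrative values only. `H2OAutoML`, `max_models`, `max_runtime_secs`, `leaderboard`, and `leader` come from the h2o package.

```python
def automl_limits(max_models=10, max_runtime_secs=300, seed=1):
    """Build H2OAutoML constructor arguments (hypothetical helper).

    The defaults are illustrative assumptions that keep a first run short.
    """
    if max_models < 1 or max_runtime_secs < 1:
        raise ValueError("limits must be positive")
    return {
        "max_models": max_models,
        "max_runtime_secs": max_runtime_secs,
        "seed": seed,
    }


def run_automl(df, target, **limits):
    """Run AutoML on an H2OFrame and return the leaderboard and best model."""
    from h2o.automl import H2OAutoML  # lazy import

    aml = H2OAutoML(**limits)
    aml.train(y=target, training_frame=df)
    # leaderboard is an H2OFrame ranking the trained models;
    # leader is the top-ranked model
    return aml.leaderboard, aml.leader
```

For the iris frame from the quickstart, `run_automl(df, "class", **automl_limits())` would return the ranked leaderboard along with the leading model, which can then be scored or exported like any other H2O model.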
H2O-3 is Apache 2.0 licensed. Source code, issue tracking, and community discussion are on GitHub.
