Gradient Boosting Machine (GBM)
H2O’s GBM sequentially builds regression trees on all features of the dataset in a fully distributed way — each tree is built in parallel across the cluster. It supports per-row observation weights, offsets, N-fold cross-validation, and a wide range of distribution functions. MOJO Support: GBM fully supports importing and exporting MOJOs for low-latency production scoring.

XGBoost
H2O’s XGBoost implementation is based on the native XGBoost library via JNI. It provides parallel tree boosting (GBDT/GBM) and is often faster than H2O GBM on large datasets, especially with GPU acceleration. It supports multicore via OpenMP and can use GPU backends (backend="gpu").
MOJO Support: XGBoost supports importing and exporting MOJOs.
Key Parameters
Shared Parameters (GBM & XGBoost)
ntrees: Number of trees to build. More trees generally improve accuracy but increase training time and risk of overfitting. Use early stopping (stopping_rounds) to find the optimal value automatically.

max_depth: Maximum depth of each tree. Higher values increase model complexity. For GBM, 5 is a good default; XGBoost often benefits from shallower trees (3–6) when combined with more rounds.

learn_rate: Step size shrinkage applied after each tree. Smaller values (e.g., 0.01–0.05) require more trees but often yield better generalization. Alias: eta in XGBoost.

sample_rate: Row-wise subsampling rate per tree (without replacement). Values in the range 0.5–0.8 add stochasticity and can improve generalization (Friedman 1999, stochastic GBM). Alias: subsample in XGBoost.

col_sample_rate: Column subsampling rate per tree level (without replacement). Alias: colsample_bylevel in XGBoost.

col_sample_rate_per_tree: Column subsampling rate per tree (without replacement). Multiplicative with col_sample_rate. Alias: colsample_bytree in XGBoost.

min_rows: Minimum number of observations required in a leaf node. Increase to prevent overfitting on small datasets. R parameter name: node_size.

distribution: Loss function / response distribution. "auto" infers from the response column type. Options: "gaussian", "bernoulli", "multinomial", "poisson", "gamma", "tweedie", "laplace", "quantile", "huber".

GBM-Specific Parameters
learn_rate_annealing: Reduce learn_rate by this factor after every tree. E.g., learn_rate=0.05 with learn_rate_annealing=0.99 gives a decaying learning rate that converges faster than a fixed small rate.

histogram_type: How to bin continuous features for split-finding. Options: "AUTO", "UniformAdaptive", "UniformRobust", "Random" (XRT-style), "QuantilesGlobal", "RoundRobin".

XGBoost-Specific Parameters
booster: Booster type: "gbtree" (tree-based), "gblinear" (linear), or "dart" (Dropout Additive Regression Trees).

backend: Compute backend: "auto" uses a GPU if available, otherwise CPU. Set "gpu" to force GPU, "cpu" to force CPU.

reg_lambda: L2 regularization on leaf weights. Larger values reduce overfitting.

reg_alpha: L1 regularization on leaf weights. Promotes sparsity.

tree_method: Tree construction algorithm: "auto", "exact" (small/medium data), "approx", or "hist" (fast histogram method, required for GPU).

Code Examples
GBM vs XGBoost: When to Use Which
| Criterion | H2O GBM | XGBoost |
|---|---|---|
| Default learn_rate | 0.1 | 0.3 |
| Default max_depth | 5 | 6 |
| Default sample_rate | 1.0 | 1.0 |
| GPU acceleration | No | Yes (backend="gpu") |
| Distribution functions | More options (Huber, custom) | Most standard options |
| Categorical encoding | Native enum (no one-hot required) | Requires encoding |
| Custom distribution | Yes (custom_distribution_func) | No |
| Typical use case | General tabular; insurance, credit | Large datasets; competition-level accuracy |
GPU Acceleration for XGBoost
XGBoost in H2O-3 supports GPU-accelerated training via the hist tree method:
H2O-3 automatically loads the most capable XGBoost native library available (GPU+OMP > OMP > single-CPU fallback). Set
backend="auto" to let H2O choose. GPU training requires tree_method="hist" or tree_method="auto".