x (predictor column names or indices), y (response column), and training_frame. Unsupervised algorithms omit y.
All estimator functions return an H2O model object. Pass it to h2o.predict() for inference or h2o.performance() for evaluation metrics.
h2o.gbm()

Gradient Boosting Machine — builds an ensemble of shallow decision trees where each tree corrects the residuals of the previous one.

Key parameters
- `x`: Predictor column names or indices. If omitted, all columns except `y` are used.
- `y`: Response column name or index. A numeric response trains regression; a factor response trains classification.
- `training_frame`: Training dataset.
- `ntrees`: Number of trees to build.
- `max_depth`: Maximum tree depth. Use 0 for unlimited.
- `learn_rate`: Learning rate (shrinkage). Range: 0.0 to 1.0. Lower values require more trees but often generalize better.
- `sample_rate`: Row sample rate per tree. Range: 0.0 to 1.0.
- `col_sample_rate`: Column sample rate per split. Range: 0.0 to 1.0.
- `nfolds`: Number of cross-validation folds. 0 disables cross-validation.
- `distribution`: Loss distribution. Options: AUTO, bernoulli, multinomial, gaussian, poisson, gamma, tweedie, laplace, quantile, huber.
- `stopping_rounds`: Stop training if the metric does not improve for this many scoring rounds.
- `min_rows`: Minimum number of observations in a leaf node.
- `seed`: Random seed for reproducibility. -1 uses a time-based seed.
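A minimal end-to-end sketch, assuming a local H2O cluster can be started and using a placeholder file path and response column name ("train.csv" and "label" are illustrative, not from the source):

```r
library(h2o)
h2o.init()

# Placeholder dataset; any H2OFrame with a factor response works.
train <- h2o.importFile("train.csv")
train$label <- as.factor(train$label)

gbm_model <- h2o.gbm(
  x = setdiff(names(train), "label"),  # all columns except the response
  y = "label",
  training_frame = train,
  ntrees = 200,
  max_depth = 5,
  learn_rate = 0.05,
  nfolds = 5,
  stopping_rounds = 3,
  seed = 42
)

# Cross-validated metrics for the fitted model.
h2o.performance(gbm_model, xval = TRUE)
```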
h2o.xgboost()

XGBoost — uses the native XGBoost backend for gradient boosted trees. Generally faster than h2o.gbm() for single-node workloads.

Key parameters
- `ntrees`: Number of trees (also referred to as n_estimators).
- `max_depth`: Maximum tree depth.
- `learn_rate`: Step size shrinkage applied after each boosting step.
- `sample_rate`: Subsample ratio of the training data for each tree.
- `col_sample_rate_per_tree`: Subsample ratio of columns for each tree.
- `min_rows`: Minimum number of observations in a leaf (also referred to as min_child_weight).
- `distribution`: Loss distribution. Options: AUTO, bernoulli, multinomial, gaussian, poisson, gamma, tweedie, laplace, quantile, huber.
- `reg_lambda`: L2 regularization term on leaf weights.
- `reg_alpha`: L1 regularization term on leaf weights.
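As a sketch, assuming a running cluster and an H2OFrame `train` with a factor response column "label" (placeholder names):

```r
# Gradient boosted trees via the native XGBoost backend.
xgb_model <- h2o.xgboost(
  y = "label",            # x defaults to all remaining columns
  training_frame = train,
  ntrees = 300,
  max_depth = 6,
  learn_rate = 0.1,
  sample_rate = 0.8,
  reg_lambda = 1.0,       # L2 penalty on leaf weights
  seed = 42
)
```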
h2o.randomForest()
Distributed Random Forest (DRF) — builds an ensemble of deep, independently trained decision trees with bootstrap sampling.

Key parameters
- `ntrees`: Number of trees.
- `max_depth`: Maximum tree depth. Use 0 for unlimited.
- `mtries`: Number of columns randomly sampled at each split. -1 defaults to sqrt(p) for classification and p/3 for regression, where p is the number of predictors.
- `sample_rate`: Row sample rate per tree. The default 0.632 matches the classic bootstrap fraction.
- `binomial_double_trees`: Build twice as many trees for binary classification (one per class). Can improve accuracy at the cost of training time.
- `min_rows`: Minimum observations in a leaf node.
- `nfolds`: Number of cross-validation folds.
- `seed`: Random seed.
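A sketch under the same assumptions (running cluster, H2OFrame `train` with factor response "label"):

```r
drf_model <- h2o.randomForest(
  y = "label",
  training_frame = train,
  ntrees = 500,
  max_depth = 20,
  mtries = -1,          # -1: sqrt(p) for classification
  nfolds = 5,
  seed = 42
)

# Per-column variable importances from the forest.
h2o.varimp(drf_model)
```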
h2o.deeplearning()
Deep Learning (Neural Network) — feed-forward multilayer neural network with an adaptive learning rate (ADADELTA by default).

Key parameters
- `hidden`: Hidden layer sizes. Each element specifies the number of neurons in that layer. Example: c(128, 64, 32) builds a 3-hidden-layer network.
- `epochs`: Number of passes over the training data. Can be fractional.
- `activation`: Activation function. Options: Tanh, TanhWithDropout, Rectifier, RectifierWithDropout, Maxout, MaxoutWithDropout.
- `hidden_dropout_ratios`: Dropout rates per hidden layer. Must have the same length as `hidden`. Example: c(0.2, 0.2).
- `input_dropout_ratio`: Dropout ratio for the input layer.
- `l1`: L1 regularization. Induces sparsity.
- `l2`: L2 regularization. Reduces weight magnitude.
- `adaptive_rate`: Use the ADADELTA adaptive learning rate. Set to FALSE to use a fixed learning rate.
- `rate`: Learning rate when adaptive_rate = FALSE.
- `standardize`: Standardize numeric inputs to zero mean and unit variance.
- `overwrite_with_best_model`: Replace the final model with the best-scoring checkpoint found during training.
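For example, a dropout-regularized network might be sketched as follows (assuming a running cluster and an H2OFrame `train` with factor response "label"):

```r
dl_model <- h2o.deeplearning(
  y = "label",
  training_frame = train,
  hidden = c(128, 64, 32),                    # three hidden layers
  epochs = 20,
  activation = "RectifierWithDropout",        # dropout needs a *WithDropout activation
  hidden_dropout_ratios = c(0.2, 0.2, 0.2),   # one rate per hidden layer
  l2 = 1e-4,
  seed = 42
)
```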
h2o.glm()
Generalized Linear Model — fits regularized linear models (Lasso, Ridge, Elastic Net) for regression and classification.

Key parameters
- `family`: Response distribution family. Options: AUTO, gaussian, binomial, multinomial, poisson, gamma, tweedie, negativebinomial, ordinal, quasibinomial, fractionalbinomial.
- `alpha`: Elastic net mixing: 0 = Ridge (L2 only), 1 = Lasso (L1 only). Default is 0 for the L-BFGS solver, 0.5 otherwise.
- `lambda`: Regularization strength. Larger values produce more regularization.
- `lambda_search`: Search for the optimal lambda from lambda_max down to lambda. Recommended for finding good regularization.
- `standardize`: Standardize numeric predictors to zero mean and unit variance before fitting.
- `solver`: Optimization algorithm. Options: AUTO, IRLSM, L_BFGS, COORDINATE_DESCENT, COORDINATE_DESCENT_NAIVE.
- `compute_p_values`: Compute p-values for coefficients. Only works with the IRLSM solver.
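A sketch of a regularized logistic regression (assuming a running cluster and an H2OFrame `train` with a binary factor response "label"):

```r
glm_model <- h2o.glm(
  y = "label",
  training_frame = train,
  family = "binomial",
  alpha = 0.5,           # elastic net: mix of L1 and L2
  lambda_search = TRUE,  # scan the regularization path for the best lambda
  nfolds = 5,
  seed = 42
)

# Fitted coefficients of the final model.
h2o.coef(glm_model)
```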
h2o.gam()
Generalized Additive Model — extends GLM with smooth spline terms for non-linear effects.

Key parameters
- `gam_columns`: A list of column name vectors specifying which columns to apply GAM smoothers to. Each element can be a single column c("col1") or multiple columns for interaction splines c("col1", "col2").
- `family`: Response distribution family. Same options as h2o.glm().
- `bs`: Spline basis type for each GAM column. 0 = cubic regression spline, 1 = cyclic cubic regression spline.
- `num_knots`: Number of knots for each GAM column smoother.
- `alpha`: Elastic net mixing parameter (same as GLM).
- `lambda_search`: Perform a lambda search (same as GLM).
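A sketch, assuming a running cluster and an H2OFrame `train` with factor response "label"; the smoother columns "age" and "income" are placeholder names:

```r
gam_model <- h2o.gam(
  y = "label",
  training_frame = train,
  gam_columns = list(c("age"), c("income")),  # one smoother per listed column
  family = "binomial",
  num_knots = c(5, 5)                         # knots per smoother, same order
)
```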
h2o.automl()
AutoML — automatically trains and tunes multiple models, then ranks them on a leaderboard.

Key parameters
- `max_models`: Maximum number of individual models to train (excluding Stacked Ensembles). Setting this guarantees reproducibility.
- `max_runtime_secs`: Maximum wall-clock time for the entire AutoML run, in seconds.
- `max_runtime_secs_per_model`: Maximum time per individual model. 0 disables the per-model limit.
- `leaderboard_frame`: Separate holdout frame for leaderboard scoring. If not provided, cross-validation metrics are used.
- `nfolds`: Cross-validation folds for individual models. Set to 0 to disable (and use a validation_frame instead).
- `exclude_algos`: Algorithms to skip. Options: "DRF", "GLM", "XGBoost", "GBM", "DeepLearning", "StackedEnsemble".
- `include_algos`: Restrict to only these algorithms. Cannot be used with exclude_algos.
- `sort_metric`: Metric used to rank the leaderboard. Defaults to AUC for binary classification, mean_per_class_error for multinomial, and mean_residual_deviance for regression.
- `project_name`: Name for this AutoML run. Models from multiple runs with the same project name are combined into one leaderboard.
- `seed`: Random seed. Set max_models (not max_runtime_secs) for fully reproducible runs.
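A sketch of a reproducible run (assuming a running cluster and an H2OFrame `train` with factor response "label"):

```r
aml <- h2o.automl(
  y = "label",
  training_frame = train,
  max_models = 20,   # fixed model count + seed => reproducible
  seed = 42
)

# Ranked results and the top model.
print(aml@leaderboard)
best <- aml@leader
```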
h2o.kmeans()

K-Means clustering — partitions data into k clusters by minimizing within-cluster sum of squares.

Key parameters
- `k`: Number of clusters. When estimate_k = TRUE, this is treated as the maximum.
- `max_iterations`: Maximum number of Lloyd's iterations.
- `standardize`: Standardize columns before computing distances.
- `init`: Initialization strategy. Options: Random, PlusPlus, Furthest, User.
- `estimate_k`: Automatically estimate the number of clusters, up to k.
- `user_points`: A frame with one row per cluster specifying initial centroid positions. Requires init = "User".
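A sketch, assuming a running cluster and an H2OFrame `train`; the feature column names are placeholders:

```r
km <- h2o.kmeans(
  training_frame = train,
  x = c("x1", "x2", "x3"),  # placeholder feature columns
  k = 10,
  estimate_k = TRUE,        # k becomes an upper bound, not a fixed count
  standardize = TRUE,       # recommended when features have mixed scales
  seed = 42
)

# Coordinates of the fitted centroids.
h2o.centers(km)
```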
h2o.prcomp()

Principal Component Analysis — reduces dimensionality by projecting data onto principal components.

Key parameters
- `k`: Number of principal components to compute.
- `transform`: Pre-processing transformation. Options: NONE, STANDARDIZE, NORMALIZE, DEMEAN, DESCALE.
- `pca_method`: Algorithm for PCA computation. Options: GramSVD, Power, Randomized, GLRM.
- `use_all_factor_levels`: Include all levels of categorical columns (no reference level dropped).
- `impute_missing`: Impute missing values with the column mean before PCA.
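A sketch, assuming a running cluster and an H2OFrame `train`:

```r
pca <- h2o.prcomp(
  training_frame = train,
  k = 5,                     # keep the first five components
  transform = "STANDARDIZE", # put columns on a common scale first
  impute_missing = TRUE
)

# Project rows onto the principal components.
scores <- h2o.predict(pca, train)
```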
h2o.stackedEnsemble()
Stacked Ensemble (Super Learner) — combines predictions from multiple base models using a metalearner.

Key parameters
- `base_models`: List of trained H2O model objects or model IDs. Each base model must have been trained with nfolds >= 2 and keep_cross_validation_predictions = TRUE.
- `metalearner_algorithm`: Algorithm for the metalearner. Options: AUTO, glm, gbm, drf, deeplearning, naivebayes, xgboost.
- `metalearner_nfolds`: Cross-validation folds for the metalearner.
- `blending_frame`: Optional holdout frame used to train the metalearner instead of cross-validated predictions.
- `keep_levelone_frame`: Retain the level-one frame (metalearner training data) in the cluster.
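A sketch of the full recipe (assuming a running cluster and an H2OFrame `train` with factor response "label"). The base models must share the same folds and keep their cross-validation predictions:

```r
# Base learners: identical nfolds and seed so fold assignments match.
gbm <- h2o.gbm(y = "label", training_frame = train, nfolds = 5,
               keep_cross_validation_predictions = TRUE, seed = 1)
drf <- h2o.randomForest(y = "label", training_frame = train, nfolds = 5,
                        keep_cross_validation_predictions = TRUE, seed = 1)

# Metalearner trained on the base models' out-of-fold predictions.
ens <- h2o.stackedEnsemble(
  y = "label",
  training_frame = train,
  base_models = list(gbm, drf),
  metalearner_algorithm = "glm"
)
```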
Common parameters across estimators
These parameters are available on most supervised estimators.

| Parameter | Type | Default | Description |
|---|---|---|---|
| validation_frame | H2OFrame | — | Frame for computing validation metrics during training |
| nfolds | integer | 0 | K-fold cross-validation folds (0 disables) |
| fold_assignment | string | AUTO | Fold assignment: AUTO, Random, Modulo, Stratified |
| weights_column | string | — | Per-row observation weights |
| offset_column | string | — | Offset added to predictions before the link function |
| balance_classes | logical | FALSE | Over/under-sample to balance class distribution |
| stopping_rounds | integer | 0 | Early stopping patience; 0 disables |
| stopping_metric | string | AUTO | Metric for early stopping: AUC, logloss, RMSE, MSE, etc. |
| stopping_tolerance | number | 0.001 | Minimum relative improvement to continue training |
| max_runtime_secs | number | 0 | Hard time limit for training; 0 disables |
| seed | integer | -1 | Random seed; -1 uses a time-based seed |
| model_id | string | — | Custom key to assign this model in the DKV |
| export_checkpoints_dir | string | — | Directory to save model checkpoints during training |