AutoML is available in Python (H2OAutoML), R (h2o.automl()), and the H2O Flow web UI.
How AutoML Works
Data ingestion
Point AutoML at a training frame and identify the response column. Everything else is optional.
Model training
AutoML trains individual models in sequence (GLM, GBM, XGBoost, DRF, XRT, Deep Learning) and runs random grid searches over hyperparameters for the strongest algorithms.
Stacked Ensembles
After base model training, AutoML builds two Stacked Ensemble models: one using all base models and one using only the best model from each algorithm family.
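The three stages above can be sketched in Python roughly as follows. This is a minimal sketch, assuming a binary classification problem and a working Java runtime for H2O; the file path and response column are supplied by the caller, and the helper names train_automl and stacked_ensemble_ids are illustrative, not part of the H2O API.

```python
def train_automl(train_path, response, max_models=10, seed=1):
    """Train AutoML on a CSV file and return the H2OAutoML object.

    `train_path` and `response` are placeholders for your own data.
    Assumes a classification task, so the response is converted to a factor.
    """
    # Imported lazily so the helper below works without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()  # start or connect to a local H2O cluster

    # Data ingestion: load the training frame and mark the response as categorical.
    train = h2o.import_file(train_path)
    train[response] = train[response].asfactor()

    # Model training and Stacked Ensembles both happen inside train().
    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(y=response, training_frame=train)
    return aml


def stacked_ensemble_ids(model_ids):
    """Pick the Stacked Ensemble entries out of a list of leaderboard model IDs."""
    return [m for m in model_ids if m.startswith("StackedEnsemble")]


# Example usage ("train.csv" and "label" are hypothetical):
# aml = train_automl("train.csv", "label")
# ids = aml.leaderboard["model_id"].as_data_frame()["model_id"].tolist()
# print(stacked_ensemble_ids(ids))
```

After a successful run with cross-validation enabled, the filtered list would normally contain one "AllModels" and one "BestOfFamily" ensemble.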
Parameters
Required Parameters
y
The name (or index) of the response column. For binary classification, this column must be a factor.
training_frame
The dataset used to build the models. At least one of max_runtime_secs or max_models should also be set; otherwise AutoML defaults to a 1-hour run.
max_runtime_secs
Maximum wall-clock time (seconds) for the entire AutoML run. Dynamically defaults to 3600 if neither max_runtime_secs nor max_models is specified.
max_models
Maximum number of models to train (excluding Stacked Ensembles). Set this for reproducible runs: all models are then trained to convergence rather than being cut off by a time budget.
Key Optional Parameters
nfolds
Number of cross-validation folds. -1 lets AutoML decide (typically 5-fold CV or blending mode depending on data size). Set to 0 to disable CV (also disables Stacked Ensembles).
seed
Random seed for reproducibility. Reproducibility requires using max_models (not max_runtime_secs) and excluding DeepLearning (which is non-deterministic by default).
exclude_algos
List of algorithms to skip. Mutually exclusive with include_algos. Example: ["GLM", "DeepLearning"]. Available values: "DRF", "GLM", "XGBoost", "GBM", "DeepLearning", "StackedEnsemble".
include_algos
Allowlist of algorithms. Mutually exclusive with exclude_algos. Same values as above.
sort_metric
Metric used to rank the leaderboard. AUTO defaults to AUC for binary classification, mean_per_class_error for multinomial classification, and deviance for regression. Other options: "logloss", "MSE", "RMSE", "MAE".
verbosity
Backend logging verbosity. One of "debug", "info", "warn". Set to "info" to see per-model progress during training.
Code Examples
Accessing and Interpreting the Leaderboard
The leaderboard is an H2OFrame sorted by the sort_metric. Each row represents one trained model.
Python
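A minimal sketch of reading the leaderboard, assuming aml is an H2OAutoML object that has already been trained. The function names here are illustrative; the H2O calls used are the leaderboard and leader attributes and h2o.automl.get_leaderboard.

```python
def get_leaderboard_frame(aml, extended=False):
    """Return the AutoML leaderboard as an H2OFrame.

    With extended=True, extra columns (such as per-model training time)
    are requested as well.
    """
    if extended:
        # Lazy import so this module loads even where h2o is not installed.
        from h2o.automl import get_leaderboard
        return get_leaderboard(aml, extra_columns="ALL")
    return aml.leaderboard


def leader_id(aml):
    """Return the model ID of the top-ranked model."""
    return aml.leader.model_id


def print_all_rows(lb):
    """Print every leaderboard row; head() shows only the first 10 by default."""
    print(lb.head(rows=lb.nrows))


# Example usage with a trained `aml`:
# lb = get_leaderboard_frame(aml)
# print_all_rows(lb)
# print(leader_id(aml))
```

Lower rows are worse on the sort metric, so the first row corresponds to aml.leader.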
Getting the Best Model by Algorithm
Python
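A minimal sketch, assuming aml is a trained H2OAutoML object from a reasonably recent H2O release that provides get_best_model(). The helper best_models_by_algo is illustrative, and the default algorithm list is only an example.

```python
def best_models_by_algo(aml, algos=("GBM", "XGBoost", "GLM"), criterion=None):
    """Map each algorithm name to its best model on the leaderboard.

    Algorithms for which no model was trained are left out of the result.
    `criterion` is passed through unchanged; None uses the default ranking metric.
    """
    best = {}
    for algo in algos:
        # get_best_model() is expected to return None when no model
        # of the requested type exists on the leaderboard.
        model = aml.get_best_model(algorithm=algo, criterion=criterion)
        if model is not None:
            best[algo] = model
    return best


# Example usage with a trained `aml`:
# best = best_models_by_algo(aml, criterion="logloss")
# for algo, model in best.items():
#     print(algo, model.model_id)
```

Calling aml.get_best_model() with no arguments returns the overall leader, which is useful when a Stacked Ensemble tops the leaderboard but a single base model is preferred for deployment.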
Explainability
H2O AutoML integrates with H2O’s explainability framework. Call explain() on the AutoML object to generate an automated report covering variable importance, SHAP values, partial dependence plots, and model correlation.
Python
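A minimal sketch, assuming aml is a trained H2OAutoML object and frame is an H2OFrame (ideally held-out data) with the same columns as the training frame. The wrapper explain_automl is illustrative; plotting typically relies on matplotlib being installed.

```python
def explain_automl(aml, frame, leader_only=False):
    """Generate explanations for an AutoML run on the given frame.

    With leader_only=True, only the top-ranked model is explained,
    which is usually faster than explaining the whole run.
    """
    if leader_only:
        return aml.leader.explain(frame)
    return aml.explain(frame)


# Example usage with a trained `aml` and a held-out H2OFrame `test`:
# report = explain_automl(aml, test)
# leader_report = explain_automl(aml, test, leader_only=True)
```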
explain() returns an object with individual plots (variable importance heatmap, model correlation heatmap, SHAP summary, PDP/ICE plots). In a Jupyter notebook, plots render inline automatically.