Saving & Loading Models

H2O-3 provides several functions for persisting trained models. The right approach depends on whether you need the model for continued H2O use (binary model) or for production scoring without an H2O cluster (MOJO).

Binary models

A binary model stores the full H2O model object, including all internal state. Binary models are version-specific: a model saved with H2O version X can only be loaded with the same version X.

If you upgrade H2O, you must retrain and re-save your binary models. For production deployment, use the MOJO or POJO format instead.
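
Because a binary model only loads into the H2O version that wrote it, one way to catch a mismatch early is to record the version next to the saved file and compare it before loading. The sketch below is a minimal illustration and not part of the H2O API: the sidecar file name and both helper functions are hypothetical, it assumes a local file path, and the version strings are passed in explicitly (in a real session you could pass `h2o.__version__`).

```python
import json
from pathlib import Path


def write_version_sidecar(model_path: str, h2o_version: str) -> Path:
    """Record the H2O version used to save a binary model (hypothetical helper)."""
    sidecar = Path(str(model_path) + ".version.json")
    sidecar.write_text(json.dumps({"h2o_version": h2o_version}))
    return sidecar


def is_loadable(model_path: str, running_version: str) -> bool:
    """Return True only if the recorded version equals the running version."""
    sidecar = Path(str(model_path) + ".version.json")
    if not sidecar.exists():
        return False  # no recorded version: treat the model as unsafe to load
    recorded = json.loads(sidecar.read_text())["h2o_version"]
    return recorded == running_version
```

Calling `write_version_sidecar()` right after `h2o.save_model()` and `is_loadable()` right before `h2o.load_model()` turns a version mismatch into an explicit check instead of a failed load.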

Saving and loading locally

import h2o
from h2o.estimators import H2ODeepLearningEstimator
h2o.init()

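# Train a model (predictors, response, and train are placeholders
# for your own column names and H2OFrame)
model = H2ODeepLearningEstimator()
model.train(x=predictors, y=response, training_frame=train)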

# Save to /tmp/mymodel/
model_path = h2o.save_model(model=model, path="/tmp/mymodel", force=True)
print(model_path)
# /tmp/mymodel/DeepLearning_model_python_1441838096933

# Load the saved model
saved_model = h2o.load_model(model_path)

h2o.save_model() parameters

Parameter	Description
`model`	The H2O model object to save.
`path`	Directory path to save to. Supports local paths, `hdfs://`, `s3://`, and `gs://`. Defaults to the current working directory.
`force`	If `True`, overwrite existing files at the destination.
`export_cross_validation_predictions`	If `True`, include CV holdout predictions in the saved artifact.
`filename`	Custom filename for the saved model. Defaults to `model.model_id`.

Downloading and uploading models

Use `download_model()` and `upload_model()` when the H2O cluster is remote and you need to transfer model files between the cluster and the local machine running your Python or R session.

# Download model from the H2O cluster to your local machine
my_local_model = h2o.download_model(model, path="/Users/UserName/Desktop")

# Upload a previously downloaded model back to the H2O cluster
uploaded_model = h2o.upload_model(my_local_model)

A file written by `save_model()` is owned by the user running the H2O cluster process. A file written by `download_model()` is owned by the user running the Python or R session.
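
On Unix-like systems, the ownership rule above can be checked from the Python session with the standard library. This is a small sketch under that assumption; `owned_by_current_user` is a hypothetical helper, not an H2O function.

```python
import os


def owned_by_current_user(path: str) -> bool:
    """Return True if `path` is owned by the user running this Python process.

    Relies on os.getuid(), which is available on Unix-like systems only.
    """
    return os.stat(path).st_uid == os.getuid()
```

A file returned by `h2o.download_model()` would be expected to pass this check. A file written by `h2o.save_model()` may not, since it is created by the cluster process and may not be on the local machine at all.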

Saving to cloud storage

Prefix the path with the appropriate URI scheme to save directly to distributed storage.

# HDFS
hdfs_name_node = "node-1"
hdfs_model_path = "hdfs://" + hdfs_name_node + "/tmp/models"
new_model_path = h2o.save_model(h2o_glm, hdfs_model_path)

# S3
s3_path = "s3://my-bucket/models/"
model_path = h2o.save_model(model=model, path=s3_path, force=True)
saved_model = h2o.load_model(model_path)

# GCS
gcs_path = "gs://my-bucket/models/"
model_path = h2o.save_model(model=model, path=gcs_path, force=True)
saved_model = h2o.load_model(model_path)
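
The destination is an ordinary string, so building it can be isolated in a small helper that rejects unexpected schemes. The sketch below is illustrative only: `model_uri` is a hypothetical function, not part of H2O, and it accepts only the three schemes named in this section.

```python
SUPPORTED_SCHEMES = ("hdfs", "s3", "gs")  # schemes listed in this section


def model_uri(scheme: str, location: str, *parts: str) -> str:
    """Build a storage URI such as s3://my-bucket/models (hypothetical helper).

    `location` is the HDFS name node or the bucket name; `parts` are path
    segments appended after it.
    """
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    segments = [location.strip("/")]
    segments += [p.strip("/") for p in parts if p.strip("/")]
    return f"{scheme}://" + "/".join(segments)
```

The returned string can be passed as the `path` argument of `h2o.save_model()`.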

MOJO models

A MOJO (Model Object, Optimized) is a portable, self-contained model archive. Unlike binary models, MOJOs:

Do not require an H2O cluster to score
Are not tied to a specific H2O version
Can be deployed in Java environments via the h2o-genmodel library

Use MOJOs for production deployment. They are more compact and faster than POJOs, and support the widest range of algorithms.
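
A MOJO is written as a zip file, so its contents can be listed with the Python standard library without starting H2O. The following is a rough sketch; `list_mojo_entries` is a hypothetical helper, and the entry names mentioned in the note after the code are assumptions about the archive layout rather than something this page documents.

```python
import zipfile
from typing import List


def list_mojo_entries(mojo_path: str) -> List[str]:
    """Return the sorted names of all files stored in a MOJO zip archive."""
    with zipfile.ZipFile(mojo_path) as archive:
        return sorted(archive.namelist())
```

Listing the entries is a quick sanity check that a downloaded file is a complete archive (for example, that it contains a metadata entry such as `model.ini`, assuming that layout) before shipping it to a scoring service.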

Supported algorithms

The following algorithms support MOJO export and/or import:

Algorithm	Exportable	Importable
GBM	Yes	Yes
DRF	Yes	Yes
GLM	Yes	Yes
XGBoost	Yes	Yes
Deep Learning	Yes	Yes
Stacked Ensemble	Yes	Yes
AutoML	Yes	Yes
GAM	Yes	Yes
CoxPH	Yes	Yes
RuleFit	Yes	Yes
Uplift DRF	Yes	Yes
Isolation Forest	Yes	Yes
Extended Isolation Forest	Yes	Yes
GLRM	Yes	No
PCA	Yes	No
K-Means	Yes	No
Naïve Bayes	No	No
SVM	No	No

AutoML always produces a model that has a MOJO, although the model type depends on the run; in most cases it is a Stacked Ensemble. All individual models within an AutoML run are importable, but only individual (non-AutoML) models are exportable as MOJOs.
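
For scripts that decide at runtime whether to export or re-import a MOJO, the table above can be encoded as a lookup. This is only a sketch: the flags are copied from the table on this page, and the names `MOJO_SUPPORT`, `can_export_mojo`, and `can_import_mojo` are hypothetical, not H2O APIs.

```python
# (exportable, importable) flags copied from the table above
MOJO_SUPPORT = {
    "GBM": (True, True),
    "DRF": (True, True),
    "GLM": (True, True),
    "XGBoost": (True, True),
    "Deep Learning": (True, True),
    "Stacked Ensemble": (True, True),
    "AutoML": (True, True),
    "GAM": (True, True),
    "CoxPH": (True, True),
    "RuleFit": (True, True),
    "Uplift DRF": (True, True),
    "Isolation Forest": (True, True),
    "Extended Isolation Forest": (True, True),
    "GLRM": (True, False),
    "PCA": (True, False),
    "K-Means": (True, False),
    "Naïve Bayes": (False, False),
    "SVM": (False, False),
}


def can_export_mojo(algorithm: str) -> bool:
    """True if the table lists the algorithm as exportable (KeyError if unlisted)."""
    return MOJO_SUPPORT[algorithm][0]


def can_import_mojo(algorithm: str) -> bool:
    """True if the table lists the algorithm as importable (KeyError if unlisted)."""
    return MOJO_SUPPORT[algorithm][1]
```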

Saving and importing MOJOs

import h2o
from h2o.estimators.glm import H2OGeneralizedLinearEstimator
h2o.init()

data = h2o.import_file(path='training_dataset.csv')
original_model = H2OGeneralizedLinearEstimator()
original_model.train(
    x=["Some column", "Another column"],
    y="response",
    training_frame=data
)

# Save as MOJO. save_mojo() takes a directory and returns the path
# of the .zip file it writes.
mojo_path = original_model.save_mojo(path='/path/to/model/directory')

# Import the MOJO back into H2O for scoring
imported_model = h2o.import_mojo(mojo_path)
new_observations = h2o.import_file(path='new_observations.csv')
predictions = imported_model.predict(new_observations)

Downloading and uploading MOJOs

Use `download_mojo()` and `upload_mojo()` when the H2O cluster is remote and the MOJO file must be transferred between the cluster and your local machine.

import h2o
from h2o.estimators import H2OGradientBoostingEstimator
h2o.init()

df = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")
model = H2OGradientBoostingEstimator()
model.train(x=list(range(4)), y="class", training_frame=df)

# Download MOJO to local machine
my_mojo = model.download_mojo(path="/Users/UserName/Desktop")

# Upload the MOJO to the H2O cluster
mojo_model = h2o.upload_mojo(my_mojo)

Model checkpointing

Several H2O-3 algorithms (GBM, DRF, Deep Learning, XGBoost) support checkpointing, which resumes training from a previously trained model. Set the `checkpoint` parameter to the model ID of an existing model.

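import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator

# predictors, response, and train are placeholders for your own
# column names and H2OFrame

# Initial training run (50 trees)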
gbm_v1 = H2OGradientBoostingEstimator(ntrees=50, seed=42)
gbm_v1.train(x=predictors, y=response, training_frame=train)

# Save the model
model_path = h2o.save_model(gbm_v1, path="/tmp/checkpoints", force=True)

# Resume training from the checkpoint (add 50 more trees)
gbm_v2 = H2OGradientBoostingEstimator(
    ntrees=100,           # total trees including the checkpoint
    checkpoint=gbm_v1.model_id,
    seed=42
)
gbm_v2.train(x=predictors, y=response, training_frame=train)

Checkpointing is useful for incrementally training large models without restarting from scratch, or for fine-tuning a base model on new data.
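
As the comment in the example notes, `ntrees` on the resumed model is the total tree count, not the number of trees to add. A small helper can make that arithmetic explicit. This is a sketch only; `resumed_ntrees` is a hypothetical function, not an H2O API.

```python
def resumed_ntrees(checkpoint_ntrees: int, additional_trees: int) -> int:
    """Return the total `ntrees` to request when resuming from a checkpoint."""
    if checkpoint_ntrees < 0:
        raise ValueError("checkpoint_ntrees must not be negative")
    if additional_trees <= 0:
        raise ValueError("a resumed run must add at least one tree")
    return checkpoint_ntrees + additional_trees
```

For the example above, `resumed_ntrees(50, 50)` gives the `ntrees=100` passed to `gbm_v2`.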

Advanced: lazy MOJO import with Generic model

The Generic model provides fine-grained control over MOJO loading. Import the MOJO bytes once, then build one or more Generic models from them without importing the file again.

import h2o
h2o.init()

# Download the MOJO from a trained model
# (original_model is the model trained in the earlier MOJO example)
path = '/path/to/model/directory/model.zip'
original_model.download_mojo(path)

# Upload MOJO bytes once (lazy import)
imported_mojo_key = h2o.lazy_import(path)

# Build the generic model from already-uploaded bytes
from h2o.estimators import H2OGenericEstimator
generic_model = H2OGenericEstimator(
    model_key=h2o.get_frame(imported_mojo_key[0])
)
new_observations = h2o.import_file(path='new_observations.csv')
predictions = generic_model.predict(new_observations)
