Scoring approaches
MOJO / POJO (embedded scoring)
Export the model as a MOJO or POJO and embed it directly inside your application JVM. The only runtime dependency is h2o-genmodel.jar; no H2O cluster is needed at scoring time.
- Best for: low-latency real-time scoring, and environments where running a separate server is impractical.
- Latency: sub-millisecond per row for tree models.
- See MOJO & POJO Export for download and compilation instructions.
H2O cluster REST endpoint
Keep H2O running and call the /3/Predictions REST endpoint with new data rows. The cluster handles scoring in-process.
- Best for: interactive or exploratory scoring, and batch jobs where the data already lives in H2O frames.
- Latency: network round-trip + H2O overhead (~10–100 ms per request).
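As a sketch of the REST route, the request below POSTs to a local cluster's Predictions endpoint. The host/port, model id (`gbm_model`), and frame name (`new_data`) are illustrative; the data must already have been imported and parsed into an H2O frame under that name.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RestScore {
    public static void main(String[] args) throws Exception {
        // Predictions endpoint: model id and frame name go in the path.
        URL url = new URL(
            "http://localhost:54321/3/Predictions/models/gbm_model/frames/new_data");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.getOutputStream().close(); // empty body; parameters are in the path

        // The response JSON names the destination frame holding the predictions.
        try (InputStream in = conn.getInputStream()) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

The predictions land in a new H2O frame referenced in the response JSON, which you then download or query separately.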
AWS scoring server
Deploy H2O’s inference server AMI from the AWS Marketplace. Transfer the MOJO from S3 to /tmp/pipeline.mojo at launch using a cloud-boothook user-data script, then query the instance’s REST endpoint.
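The original snippet for the cloud-boothook user data appears to have been lost; a minimal reconstruction might look like the following, where the bucket and key names are placeholders:

```shell
#cloud-boothook
#!/bin/bash
# Illustrative: copy the MOJO from S3 to the path the scoring server expects.
# Assumes the instance role grants s3:GetObject on the bucket.
aws s3 cp s3://my-bucket/pipeline.mojo /tmp/pipeline.mojo
```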
Real-time vs batch scoring
| | Real-time | Batch |
|---|---|---|
| Scoring trigger | Single row / API request | Dataset / scheduled job |
| Latency SLA | Milliseconds | Minutes to hours |
| Recommended engine | MOJO embedded in servlet or Lambda | POJO or MOJO via Hive UDF, Spark map |
| Infrastructure | Jetty servlet, Spring Boot, AWS Lambda | Hadoop MapReduce, Spark, Hive |
Design patterns
Jetty servlet
Embed a POJO or MOJO inside a Jetty web application to expose a REST scoring endpoint. Client-side JavaScript sends user input; the servlet scores it with the embedded model and returns a prediction. Example: Consumer loan application — github.com/h2oai/app-consumer-loan

| Characteristic | Value |
|---|---|
| Training language | R |
| Scoring data source | User input from browser |
| Scoring engine | H2O POJO |
| Latency SLA | Real-time |
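A minimal sketch of the servlet pattern, here loading a MOJO from disk (the linked example compiles a POJO instead); the class name and model path are illustrative:

```java
import hex.genmodel.MojoModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import java.io.IOException;

public class PredictServlet extends HttpServlet {
    // Thread safe: one wrapper instance is shared across all requests.
    private EasyPredictModelWrapper model;

    @Override
    public void init() throws ServletException {
        try {
            model = new EasyPredictModelWrapper(MojoModel.load("/opt/models/loan.zip"));
        } catch (IOException e) {
            throw new ServletException(e);
        }
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Map request parameters onto the model's training column names.
        RowData row = new RowData();
        for (String name : java.util.Collections.list(req.getParameterNames())) {
            row.put(name, req.getParameter(name));
        }
        try {
            BinomialModelPrediction p = model.predictBinomial(row);
            resp.getWriter().println(p.label);
        } catch (Exception e) {
            throw new IOException(e);
        }
    }
}
```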
Sparkling Water streaming
Use H2O inside Apache Spark (Sparkling Water) for streaming scoring scenarios. The H2O cluster provides model training while Spark handles the data pipelines. Example: Craigslist application — github.com/h2oai/app-ask-craig

| Characteristic | Value |
|---|---|
| Training language | Scala |
| Scoring engine | H2O cluster |
| Latency SLA | Real-time |
AWS Lambda
Package a POJO and h2o-genmodel.jar into an AWS Lambda function for serverless real-time scoring with automatic scaling.
Example: Malicious domain detection — github.com/h2oai/app-malicious-domains
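A sketch of the Lambda pattern is below. `DomainModel` stands in for your exported POJO class, and the handler shape assumes the standard aws-lambda-java-core interface; field names in the input map must match the model's training columns.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;
import java.util.Map;

public class ScoringHandler implements RequestHandler<Map<String, Object>, String> {
    // Static init runs once per container, so warm invocations skip model setup.
    private static final EasyPredictModelWrapper MODEL =
        new EasyPredictModelWrapper(new DomainModel()); // DomainModel = your POJO

    @Override
    public String handleRequest(Map<String, Object> input, Context context) {
        RowData row = new RowData();
        row.putAll(input); // keys must match the model's training column names
        try {
            BinomialModelPrediction p = MODEL.predictBinomial(row);
            return p.label;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```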
Hive UDF
Embed a POJO or MOJO in a Hive User Defined Function. The UDF runs in parallel across MapReduce tasks, enabling batch scoring directly inside Hive SELECT queries. Example: h2o-tutorials/hive_udf_template
Apache Storm bolt
Embed a POJO inside a Storm bolt for streaming event scoring. The POJO is invoked on each tuple as it passes through the topology. Example: h2o-tutorials/streaming/storm
MOJO as a JAR resource
Bundle the MOJO zip inside your application JAR as a classpath resource. At runtime, load it with MojoModel.load() pointing to the resource path. This makes the model portable: it travels with the application artifact.
Example: h2o-tutorials/mojo-resource
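A sketch of loading from the classpath, assuming the MOJO is packaged as `/model.zip` inside the JAR and using the reader-backend API from recent h2o-genmodel versions:

```java
import hex.genmodel.ModelMojoReader;
import hex.genmodel.MojoModel;
import hex.genmodel.MojoReaderBackendFactory;
import hex.genmodel.easy.EasyPredictModelWrapper;
import java.net.URL;

public class ResourceLoad {
    public static EasyPredictModelWrapper load() throws Exception {
        // Resource path "/model.zip" is illustrative; it resolves inside the JAR.
        URL mojo = ResourceLoad.class.getResource("/model.zip");
        MojoModel raw = ModelMojoReader.readFrom(
            MojoReaderBackendFactory.createReaderBackend(
                mojo, MojoReaderBackendFactory.CachingStrategy.MEMORY));
        return new EasyPredictModelWrapper(raw);
    }
}
```

The MEMORY caching strategy reads the whole zip once, which suits environments with no writable disk.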
Downloading and scoring: end-to-end example
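The body of this example appears to be missing; the sketch below shows the shape it would take after downloading a MOJO from H2O (e.g. via the MOJO export facilities referenced above). The file name, column names, and values are illustrative, and it assumes a binomial model; compile and run with h2o-genmodel.jar on the classpath.

```java
import hex.genmodel.MojoModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

public class Main {
    public static void main(String[] args) throws Exception {
        // Load the downloaded MOJO once; the wrapper is reusable and thread safe.
        EasyPredictModelWrapper model =
            new EasyPredictModelWrapper(MojoModel.load("gbm_model.zip"));

        // One input row; keys must match the model's training column names.
        RowData row = new RowData();
        row.put("loan_amnt", "15000");
        row.put("term", "36 months");
        row.put("emp_length", "5");

        BinomialModelPrediction p = model.predictBinomial(row);
        System.out.println("label = " + p.label);
        System.out.println("class probabilities = "
            + java.util.Arrays.toString(p.classProbabilities));
    }
}
```

Compile with `javac -cp h2o-genmodel.jar Main.java` and run with the same classpath plus the MOJO file in the working directory.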
Performance considerations
These guidelines apply to MOJO/POJO scoring. For H2O cluster REST scoring, latency is dominated by network overhead and frame parsing.
- Use MOJOs over POJOs for large models. At 5000 trees / depth 25, MOJOs are 20–25× smaller on disk and 2–3× faster during hot scoring.
- Reuse EasyPredictModelWrapper: loading a model from disk is expensive. Create one instance per model and share it across threads (both MOJOs and POJOs are thread safe).
- Pre-allocate RowData objects and reuse them when possible to reduce GC pressure in high-throughput paths.
- Disable optional features (leaf node assignments, Shapley contributions) unless you need them; each adds per-prediction computation.
- JVM warm-up: cold scoring (first predictions after JVM start) is significantly slower than hot scoring. For latency-sensitive services, pre-warm the JVM by scoring a few dummy rows at startup.
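The last three points can be sketched together. The configuration flags below match recent h2o-genmodel versions but should be checked against your release; the model file and column are illustrative.

```java
import hex.genmodel.MojoModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;

public class Warmup {
    public static void main(String[] args) throws Exception {
        // Explicitly leave optional per-prediction features off.
        EasyPredictModelWrapper.Config cfg = new EasyPredictModelWrapper.Config()
            .setModel(MojoModel.load("gbm_model.zip"))
            .setEnableLeafAssignment(false)  // no per-tree leaf paths
            .setEnableContributions(false);  // no Shapley contributions
        EasyPredictModelWrapper model = new EasyPredictModelWrapper(cfg);

        // Pre-allocate one RowData and reuse it; pre-warm by scoring dummy rows
        // so the JIT compiles the hot scoring paths before real traffic arrives.
        RowData row = new RowData();
        row.put("loan_amnt", "15000");
        for (int i = 0; i < 1000; i++) {
            model.predict(row);
        }
    }
}
```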