Model validation is essential for understanding how well your machine learning models perform. MLPP provides a complete suite of tools for evaluating classification models, from basic confusion matrices to advanced cross-validation techniques.

Why validation matters

Validation helps you:
  • Measure performance: Quantify how well your model generalizes to unseen data
  • Compare models: Make objective comparisons between different algorithms or hyperparameters
  • Detect overfitting: Identify when your model memorizes training data instead of learning patterns
  • Handle class imbalance: Use stratified splitting to ensure fair evaluation across all classes

Core components

Confusion matrix

The ConfusionMatrix class tracks prediction counts for multi-class classification problems. Rows represent true labels, columns represent predicted labels.
#include "Model Validation/confusion_matrix.hpp"

using namespace mlpp::model_validation;

// Create a confusion matrix for 3 classes
ConfusionMatrix<std::size_t> cm(3);

// Update with predictions
cm.update(0, 0);  // Correct prediction
cm.update(0, 1);  // Predicted class 1, actually class 0
cm.update(1, 1);  // Correct prediction

// Access statistics
auto accuracy = static_cast<double>(cm.trace()) / cm.total();
cm.print();
The confusion matrix stores counts internally and provides efficient access via operator[] for computing derived metrics.

Metrics

The Metrics class computes precision, recall, F1 score, IoU, and macro/micro averages from any confusion matrix. See the metrics page for details.

Cross-validation

Stratified k-fold cross-validation and ROC analysis help you estimate model performance on held-out data. See the cross-validation page for implementation details.

Typical workflow

  1. Build a confusion matrix: Track predictions during evaluation
  2. Compute metrics: Calculate precision, recall, F1, and other performance measures
  3. Use cross-validation: Get robust estimates with stratified k-fold splitting
  4. Analyze ROC curves: Evaluate threshold-agnostic performance for binary tasks
For imbalanced datasets, prefer metrics like F1 score or IoU over raw accuracy, and always use stratified splitting in cross-validation.

Template parameters

All validation classes use C++ templates for flexibility:
  • T: Arithmetic type for counts (std::size_t, int, double)
  • Label: Integer-like label type (int, std::size_t, enum)
  • Score: Floating-point type for classifier scores (float, double)
This design supports both integer counts and weighted/probabilistic evaluation without runtime overhead.
