The Metrics class provides a comprehensive set of classification metrics derived from a confusion matrix. It supports both per-class metrics and aggregated macro/micro averages.
Basic usage
#include "Model Validation/confusion_matrix.hpp"
#include "Model Validation/metrics.h"
using namespace mlpp::model_validation;
// Build confusion matrix: update(actual, predicted)
ConfusionMatrix<std::size_t> cm(3);
cm.update(0, 0);  // class 0 correctly predicted
cm.update(0, 1);  // class 0 misclassified as class 1
cm.update(1, 1);
cm.update(2, 2);
// Compute metrics
Metrics metrics(cm);
// Per-class metrics
double p0 = metrics.precision(0);
double r0 = metrics.recall(0);
double f1_0 = metrics.f1(0);
// Aggregated metrics
double macro_f1 = metrics.macro_f1();
double micro_f1 = metrics.micro_f1();
Per-class metrics
All per-class metrics accept a class index k (zero-based).
Precision
double precision(std::size_t k) const noexcept;
Precision measures what fraction of predicted positives are actually positive:
Precision = TP / (TP + FP)
High precision means few false positives.
Recall
double recall(std::size_t k) const noexcept;
Recall (also called sensitivity or true positive rate) measures what fraction of actual positives are correctly identified:
Recall = TP / (TP + FN)
High recall means few false negatives.
F1 score
double f1(std::size_t k) const noexcept;
F1 score is the harmonic mean of precision and recall:
F1 = 2 * (Precision * Recall) / (Precision + Recall)
F1 balances both precision and recall, reaching its best value at 1.0 and worst at 0.0.
F1 score is more informative than accuracy for imbalanced datasets because it accounts for both false positives and false negatives.
IoU (Intersection over Union)
double iou(std::size_t k) const noexcept;
IoU measures overlap between predicted and actual positives:
IoU = TP / (TP + FP + FN)
Commonly used in segmentation tasks and object detection.
Macro averages
Macro averages compute the metric for each class independently, then take the unweighted mean. This treats all classes equally regardless of support.
Macro precision
double macro_precision() const noexcept;
Unweighted mean of per-class precision values.
Macro recall
double macro_recall() const noexcept;
Unweighted mean of per-class recall values.
Macro F1
double macro_f1() const noexcept;
Unweighted mean of per-class F1 scores.
Mean IoU
double mean_iou() const noexcept;
Unweighted mean of per-class IoU values.
Macro averaging gives equal weight to all classes, making it sensitive to performance on rare classes. Use this when all classes are equally important.
Micro averages
Micro averages aggregate counts across all classes first, then compute the metric. This weights classes by their support (number of samples).
Micro precision
double micro_precision() const noexcept;
Computes precision from global TP and FP counts:
Micro Precision = Σ(TP_k) / (Σ(TP_k) + Σ(FP_k))
Micro recall
double micro_recall() const noexcept;
Computes recall from global TP and FN counts:
Micro Recall = Σ(TP_k) / (Σ(TP_k) + Σ(FN_k))
Micro F1
double micro_f1() const noexcept;
Harmonic mean of micro precision and micro recall.
For multi-class problems, micro precision equals micro recall (both equal accuracy). Micro averaging weights classes by support, so it emphasizes performance on frequent classes.
Basic counts
The Metrics class also exposes raw count accessors:
T tp(std::size_t k) const noexcept; // True positives for class k
T fp(std::size_t k) const noexcept; // False positives for class k
T fn(std::size_t k) const noexcept; // False negatives for class k
These are computed from the confusion matrix:
tp(k): Diagonal element cm[k][k]
fp(k): Sum of column k excluding diagonal
fn(k): Sum of row k excluding diagonal
Template requirements
The Metrics class is templated on the confusion matrix type:
template<typename CM>
class Metrics {
using T = typename std::remove_cvref_t<CM>::value_type;
// ...
};
This works with any ConfusionMatrix<T, Label> instantiation, automatically deducing the count type T.
Example: Multi-class evaluation
#include "Model Validation/confusion_matrix.hpp"
#include "Model Validation/metrics.h"
#include <iostream>
using namespace mlpp::model_validation;
int main() {
// Simulate predictions for 3-class problem
std::vector<int> y_true = {0, 0, 1, 1, 2, 2};
std::vector<int> y_pred = {0, 1, 1, 1, 2, 0};
ConfusionMatrix<std::size_t, int> cm(3);
for (std::size_t i = 0; i < y_true.size(); ++i) {
cm.update(y_true[i], y_pred[i]);
}
Metrics metrics(cm);
std::cout << "Per-class metrics:\n";
for (std::size_t k = 0; k < 3; ++k) {
std::cout << "Class " << k << ": "
<< "P=" << metrics.precision(k) << " "
<< "R=" << metrics.recall(k) << " "
<< "F1=" << metrics.f1(k) << "\n";
}
std::cout << "\nAggregated metrics:\n";
std::cout << "Macro F1: " << metrics.macro_f1() << "\n";
std::cout << "Micro F1: " << metrics.micro_f1() << "\n";
return 0;
}