Function Signature

def main() -> None

Description

The main() function in compare_models.py loads all saved model metrics from the results/ directory, compares them side-by-side in a formatted table, and identifies the best-performing model based on Test R² score.

How It Works

Input Requirements

results/ (directory, required)
A directory containing JSON files with model metrics. Each file must:
  • Follow naming convention: metrics_*.json (e.g., metrics_linear_regression.json)
  • Contain metrics generated by evaluate_model() function
  • Include at minimum: model_name, train_r2, test_r2, train_rmse, test_rmse
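Internally, the script must discover and parse these files before comparing them. A minimal sketch of that step, assuming only the naming convention above (the helper name load_all_metrics is illustrative, not the script's actual internals):

```python
import json
from pathlib import Path

def load_all_metrics(results_dir="results"):
    """Load every metrics_*.json file from the results directory."""
    records = []
    # Only files matching the metrics_*.json naming convention are picked up
    for path in sorted(Path(results_dir).glob("metrics_*.json")):
        with open(path) as f:
            records.append(json.load(f))
    return records
```

Files that do not match the metrics_*.json pattern are silently ignored, which is why the naming convention is a hard requirement.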

Example Metrics File Structure

{
  "model_name": "Linear Regression (Multivariate)",
  "train_mse": 22.5631,
  "test_mse": 21.6191,
  "train_rmse": 4.7508,
  "test_rmse": 4.6496,
  "train_mae": 3.4521,
  "test_mae": 3.3874,
  "train_r2": 0.7432,
  "test_r2": 0.7099,
  "cv_r2_mean": 0.6880,
  "cv_r2_std": 0.0521
}
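Because the comparison requires the minimum keys listed above, it can be useful to check each file before running the script. A small validation helper, written here as an illustrative sketch (validate_metrics is not part of compare_models.py):

```python
# Minimum keys the comparison requires in every metrics file
REQUIRED_KEYS = {"model_name", "train_r2", "test_r2", "train_rmse", "test_rmse"}

def validate_metrics(metrics: dict) -> list:
    """Return a sorted list of required keys missing from a metrics dict."""
    return sorted(REQUIRED_KEYS - metrics.keys())
```

An empty return value means the file satisfies the minimum schema; any names returned identify the missing fields.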

Output Format

The function displays a comprehensive comparison table:
===============================================================================================
                                   MODEL COMPARISON OVERVIEW
===============================================================================================
   Model Name                              Train R²  Test R²  CV R² Mean  Train RMSE  Test RMSE
0  Decision Tree Regression                 0.9277   0.8495      0.7239      2.5206     3.3487
1  Neural Network Regression                0.8456   0.8058      0.7853      3.6842     3.8044
2  SGD (adaptive)                           0.7421   0.7102      0.6901      4.7609     4.6466
3  Linear Regression (Multivariate)         0.7432   0.7099      0.6880      4.7508     4.6496
4  SGD (constant)                           0.7347   0.6940      0.6657      4.8291     4.7747
5  Linear Regression (Feature Selection)    0.6873   0.6511      0.6512      5.2428     5.0990
6  Polynomial Regression (degree=3)         0.5491   0.5825      0.4908      6.2955     5.5773
7  Polynomial Regression (degree=2)         0.5362   0.5672      0.4829      6.3851     5.6791
8  Linear Regression (Univariate)           0.4887   0.4580      0.4524      6.7039     6.3550
-----------------------------------------------------------------------------------------------

🏆 THE BEST MODEL IS:
👉 Name:         Decision Tree Regression
👉 Test R²:      0.8495
👉 CV R² Mean:   0.7239
👉 Test RMSE:    3.3487
===============================================================================================

Ranking Criteria

Primary Metric: Test R² (descending). Models are ranked by their Test R² score because:
  • Test R² measures performance on unseen data, indicating real-world generalization
  • Higher values (closer to 1.0) indicate better predictive accuracy
  • More reliable than Train R² for assessing true model quality
  • Prevents selection of overfitted models that only perform well on training data
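The ranking step amounts to sorting the loaded metric records by test_r2 in descending order. A sketch using pandas, with illustrative records mirroring two rows of the table above (rank_models is an assumed name, not the script's internals):

```python
import pandas as pd

def rank_models(records):
    """Sort model metric records by Test R² score, highest first."""
    df = pd.DataFrame(records)
    return df.sort_values("test_r2", ascending=False).reset_index(drop=True)

# Two illustrative records taken from the comparison table
records = [
    {"model_name": "Linear Regression (Multivariate)", "test_r2": 0.7099},
    {"model_name": "Decision Tree Regression", "test_r2": 0.8495},
]
ranked = rank_models(records)
best = ranked.iloc[0]  # the top-ranked (best) model
```

With these records, the Decision Tree ends up in row 0, matching its position in the example output.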

Running the Script

Command Line

python compare_models.py

From Python Code

import sys
sys.path.append('path/to/script')
from compare_models import main

main()

Complete Workflow Example

import json
from pathlib import Path
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
import pandas as pd

from evaluate import evaluate_model  # adjust to wherever evaluate_model() lives in your project

# 1. Prepare data
df = pd.read_csv('house_prices.csv')
X = df[['area', 'bedrooms', 'bathrooms', 'age']]
y = df['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. Create results directory
Path('results').mkdir(exist_ok=True)

# 3. Train and save metrics for multiple models
models = [
    ('Linear Regression', LinearRegression()),
    ('Decision Tree', DecisionTreeRegressor(max_depth=10)),
]

for name, model in models:
    # Train model
    model.fit(X_train, y_train)
    
    # Evaluate and get metrics
    metrics, _ = evaluate_model(
        model, X_train, X_test, y_train, y_test, name
    )
    
    # Save metrics to JSON
    filename = f"results/metrics_{name.lower().replace(' ', '_')}.json"
    with open(filename, 'w') as f:
        json.dump(metrics, f, indent=2)
    
    print(f"Saved metrics for {name}")

# 4. Compare all models
print("\n" + "="*80)
print("COMPARING ALL MODELS")
print("="*80 + "\n")

from compare_models import main
main()

Use Cases

  1. Model Selection - Quickly identify which model performs best on your dataset
  2. Experiment Tracking - Compare results across different hyperparameter configurations
  3. Documentation - Generate comparison tables for reports and presentations
  4. MLOps - Automate model selection in training pipelines
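For the MLOps case, a training pipeline can apply the same Test R² criterion programmatically instead of parsing the printed table. A sketch under stated assumptions (best_model and the quality threshold are illustrative, not part of compare_models.py):

```python
import json
from pathlib import Path

def best_model(results_dir="results", min_test_r2=0.7):
    """Pick the best model by Test R² and enforce a quality gate."""
    records = [json.loads(p.read_text())
               for p in Path(results_dir).glob("metrics_*.json")]
    if not records:
        raise FileNotFoundError(f"no metrics_*.json files in {results_dir}")
    # Same primary criterion as the script: highest Test R² wins
    best = max(records, key=lambda r: r["test_r2"])
    if best["test_r2"] < min_test_r2:
        raise ValueError(f"best model {best['model_name']} is below the quality threshold")
    return best
```

Raising on an empty directory or a below-threshold winner lets a CI job fail fast rather than silently deploying a weak model.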
