Model Comparison Workflow
After training multiple models, use this workflow to systematically compare their performance and select the best model for deployment.

Prerequisites

You must complete the Model Training Workflow before running model comparison. This creates the required metrics files in the `results/` directory.

Running the Comparison Script
Ensure models are trained

Verify that the `results/` directory exists and contains metrics files. You should see at least 9 metrics files from the training workflow.

Run the comparison script

The script:

- Loads all metrics JSON files from `results/`
- Compares models by Test R² score
- Ranks models from best to worst
- Identifies the champion model
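The core of those steps can be sketched in a few lines. This is a minimal illustration, assuming each metrics file is JSON with `model` and `test_r2` fields; the real script's schema may differ:

```python
import json
from pathlib import Path

def compare_models(results_dir="results"):
    """Rank trained models by Test R^2, best first."""
    entries = []
    for path in sorted(Path(results_dir).glob("metrics_*.json")):
        with open(path, encoding="utf-8") as f:
            metrics = json.load(f)
        # Assumed schema: {"model": <name>, "test_r2": <float>, ...}
        entries.append((metrics["model"], metrics["test_r2"]))
    # Sort descending by Test R^2 so the champion comes first
    entries.sort(key=lambda e: e[1], reverse=True)
    return entries

# ranking = compare_models()
# champion_name, champion_r2 = ranking[0]
```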
Understanding the Output
Comparison Table
Here’s the actual output from running the comparison on trained models:

How Models Are Ranked
Models are sorted by Test R² (descending), because:

- Test R² shows generalization to unseen data (most important)
- Train R² alone can be misleading due to overfitting
- CV R² Mean provides additional confidence in model stability
A good model should have:
- High Test R² (> 0.7 is excellent for this dataset)
- Small gap between Train R² and Test R² (< 0.1 difference)
- Consistent CV R² with low standard deviation
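These rules of thumb can be turned into a quick programmatic check. `is_good_model` is an illustrative helper, not part of the project, and the CV-stability threshold of 0.15 is an assumption (the guide only says "low standard deviation"):

```python
def is_good_model(train_r2, test_r2, cv_std):
    """Apply the rule-of-thumb criteria above to one model's metrics."""
    checks = {
        "high_test_r2": test_r2 > 0.7,            # strong generalization
        "small_gap": (train_r2 - test_r2) < 0.1,  # limited overfitting
        "stable_cv": cv_std < 0.15,               # consistent folds (assumed cutoff)
    }
    return all(checks.values()), checks
```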
Interpreting the Results
Top 3 Models Analysis
🥇 #1: Decision Tree Regression
Test R²: 0.8495 (Best performer)

Strengths:
- Highest test set accuracy (84.95% variance explained)
- Can capture non-linear relationships automatically
- No feature scaling required
- Interpretable feature importance
Weaknesses:

- Train R² of 0.9277 vs Test R² of 0.8495 shows slight overfitting
- CV R² of 0.7239 is lower than test R², suggesting some variability
- May not extrapolate well beyond training data range
🥈 #2: Neural Network Regression
Test R²: 0.8058 (Strong second place)

Strengths:
- Excellent CV R² of 0.7853 (highest stability)
- Smallest standard deviation in cross-validation (0.1093)
- Can learn complex feature interactions
- Better generalization than Decision Tree (lower train/test gap)
Weaknesses:

- Requires feature scaling (use saved `scaler.joblib`)
- Less interpretable than linear models
- Longer training time
🥉 #3: SGD (adaptive) & Linear Regression (Multivariate)
Test R²: ~0.71 (Tied performance)

Strengths:
- Fast training and prediction
- Highly interpretable coefficients
- Memory efficient
- Good baseline performance
Weaknesses:

- Cannot capture non-linear relationships
- Lower accuracy than tree-based and neural models
Bottom Performers
Why did some models underperform?
Univariate Linear Regression (Test R²: 0.458)

- Uses only ONE feature (`rm` - rooms)
- Missing critical predictors like `lstat`, `ptratio`
- Demonstrates the value of multivariate modeling

Polynomial Regression (Univariate)

- Polynomial features of a single variable (`rm`) aren’t enough
- Would perform better with multivariate polynomial features
- Shows that feature engineering alone doesn’t replace good feature selection
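A quick synthetic illustration of why a single feature falls short. This uses generated data, not the project's dataset; the target depends on three features, so a one-feature model necessarily leaves variance unexplained:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y depends on all three columns of X
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=500)

uni = LinearRegression().fit(X[:, :1], y)   # one feature only
multi = LinearRegression().fit(X, y)        # all three features

print(f"Univariate R^2:   {uni.score(X[:, :1], y):.3f}")
print(f"Multivariate R^2: {multi.score(X, y):.3f}")
```

The multivariate fit explains far more variance, mirroring the gap between the univariate and multivariate models in the table.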
How to Use the Comparison Results

1. Select Your Model Based on Use Case

Need Best Accuracy?
Choose: Decision Tree Regression
File: `results/decision_tree.joblib`

Need Stability?
Choose: Neural Network Regression
File: `results/neural_network.joblib`

Need Interpretability?
Choose: Linear Regression (Multivariate)
File: `results/linear_multivariate.joblib`

2. Load the Best Model
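A minimal loading sketch with joblib. The path is the champion file from the selection above; `X_test` and `y_test` in the commented usage are placeholders for your own held-out data:

```python
from pathlib import Path
import joblib

def load_model(path="results/decision_tree.joblib"):
    """Load a model saved by the training workflow."""
    if not Path(path).exists():
        raise FileNotFoundError(
            f"{path} not found: run the Model Training Workflow first"
        )
    return joblib.load(path)

# Usage (variables come from your own pipeline):
# model = load_model()
# from sklearn.metrics import r2_score
# print("Test R^2:", r2_score(y_test, model.predict(X_test)))
```

Swap the path for `results/neural_network.joblib` or `results/linear_multivariate.joblib` as needed; remember that the neural network expects inputs transformed with the saved `scaler.joblib`.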
3. Validate Model Performance

Verify the model performs as expected on your own test set.

Advanced Comparison Techniques
Custom Ranking by Multiple Criteria

If you want to balance multiple factors, rank models by a weighted combination of the metrics (for example, Test R² and CV stability) instead of Test R² alone.

Visualize Model Performance
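One way to visualize the ranking is a horizontal bar chart of Test R². This matplotlib sketch uses the scores quoted in this guide; the `Agg` backend is set only so it runs headless:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Test R^2 scores quoted in this guide
scores = {
    "Decision Tree": 0.8495,
    "Neural Network": 0.8058,
    "SGD (adaptive)": 0.71,
    "Linear (Multivariate)": 0.71,
    "Univariate Linear": 0.458,
}

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(list(scores), list(scores.values()))
ax.set_xlabel("Test R²")
ax.set_title("Model comparison by Test R²")
ax.invert_yaxis()  # best model on top
fig.tight_layout()
fig.savefig("model_comparison.png")
```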
Key Insights from Results
Winner: Decision Tree Regression explains 84.95% of test-set variance, outperforming all other models by a significant margin.

Runner-up: Neural Network shows the most consistent performance across cross-validation folds (CV R² = 0.7853 ± 0.1093).

Baseline: Even simple Linear Regression reaches a Test R² of about 0.71, showing the dataset has strong linear patterns.
Next Steps
Deploy the Model
Learn how to serve predictions in production
API Reference
Integrate predictions into your application
Model Registry
Version and track your models
Monitor Performance
Set up performance tracking
Troubleshooting
No metrics files found
Error: No metrics JSON files found in the results directory.

Solution:

- Run the training notebook first: `jupyter notebook train.ipynb`
- Execute all cells to completion
- Verify files exist: `ls results/metrics_*.json`
Results directory not found
Error: Results directory not found. Have you trained the models yet?

Solution: Run the Model Training Workflow first to create the `results/` directory.

UnicodeDecodeError when displaying output
If you see emoji rendering issues:

Solution: No action is needed; UTF-8 handling is already included in the script.
Different results than shown in this guide
Model performance can vary slightly due to:
- Different random seeds
- Software version differences
- Data preprocessing variations
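To reduce seed-related variation, pin the random seeds before training. An illustrative sketch; the seed value is arbitrary:

```python
import random
import numpy as np

SEED = 42  # arbitrary fixed seed

random.seed(SEED)     # Python's built-in RNG
np.random.seed(SEED)  # NumPy's global RNG

# Also pass random_state=SEED to scikit-learn estimators and
# train_test_split so model training and data splits are repeatable.
```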