Overview
Polynomial regression extends linear regression by creating polynomial features, allowing the model to capture non-linear relationships. This project implements degree 2 and degree 3 polynomial transformations on the univariate rm (rooms) feature.
Performance: Polynomial degree 3 achieves Test R² of 0.583, outperforming simple univariate linear regression (0.458) but still below multivariate linear regression (0.710).
How Polynomial Features Work
Starting with a single feature rm, polynomial transformation creates additional features:
- Degree 2: Creates rm²
- Degree 3: Creates rm² and rm³
This allows the model to fit curves instead of straight lines.
Mathematical Formula
- Degree 2: ŷ = β₀ + β₁·rm + β₂·rm²
- Degree 3: ŷ = β₀ + β₁·rm + β₂·rm² + β₃·rm³
Implementation
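The project's own code is not reproduced in this section. The following is a minimal sketch of one way such a model could be built, assuming a scikit-learn pipeline of `PolynomialFeatures` and `LinearRegression`. The data is synthetic, and the helper `fit_polynomial`, the split ratio and the random seeds are hypothetical, so the printed scores will not match the results reported below.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the real data: a curved relationship plus noise.
rng = np.random.default_rng(0)
rm = rng.uniform(3.5, 9.0, size=(300, 1))
price = 5.0 + 0.9 * (rm[:, 0] - 4.0) ** 2 + rng.normal(0.0, 3.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    rm, price, test_size=0.2, random_state=42
)


def fit_polynomial(degree):
    """Fit a polynomial regression (hypothetical helper) and score it."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    scores = {}
    for name, X, y in [("train", X_train, y_train), ("test", X_test, y_test)]:
        pred = model.predict(X)
        scores[f"{name}_r2"] = r2_score(y, pred)
        scores[f"{name}_rmse"] = float(np.sqrt(mean_squared_error(y, pred)))
    return model, scores


for degree in (2, 3):
    _, scores = fit_polynomial(degree)
    print(degree, {key: round(value, 3) for key, value in scores.items()})
```

Wrapping both steps in a pipeline means the feature expansion is applied identically during fitting and prediction, so the test data never needs to be transformed by hand.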
Performance Results
Degree 2 Performance
- Train R²: 0.536
- Test R²: 0.567
- Train RMSE: 6.385
- Test RMSE: 5.679
- CV R² (mean±std): 0.483 ± 0.224
Degree 2 shows a clear improvement over univariate linear regression (Test R² 0.567 vs 0.458).
Degree 3 Performance
- Train R²: 0.549
- Test R²: 0.583 ⭐
- Train RMSE: 6.296
- Test RMSE: 5.577
- CV R² (mean±std): 0.491 ± 0.205
Degree 3 performs slightly better than degree 2, with Test R² of 0.583.
Comparison: Polynomial vs Linear
| Model | Test R² | Test RMSE | CV R² |
|---|---|---|---|
| Linear (Univariate) | 0.458 | 6.355 | 0.452 ± 0.177 |
| Polynomial (degree=2) | 0.567 | 5.679 | 0.483 ± 0.224 |
| Polynomial (degree=3) | 0.583 | 5.577 | 0.491 ± 0.205 |
| Linear (Multivariate) | 0.710 🏆 | 4.650 🏆 | 0.688 ± 0.092 🏆 |
Overfitting Risk
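Polynomial regression is prone to overfitting, especially at higher degrees.
Degree Comparison
Both polynomial models score higher on the test set than on the training set (degree 2: 0.567 vs 0.536; degree 3: 0.583 vs 0.549), so there is no sign of overfitting on this split. The cross-validation scores are lower and vary widely across folds (0.483 ± 0.224 and 0.491 ± 0.205), so part of the strong test result may reflect this particular split.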
Why Not Higher Degrees?
Higher polynomial degrees (4, 5, 6+) were not tested because:
- Overfitting risk: Higher degrees fit training data too closely
- Diminishing returns: Degree 3 already improves only slightly on degree 2 (Test R² 0.583 vs 0.567)
- Better alternatives: Multivariate linear regression already outperforms polynomial approaches
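If higher degrees were to be checked after all, a cross-validated sweep is one way to see where extra degrees stop helping. The sketch below assumes scikit-learn and uses synthetic data with a quadratic signal. The fold count, seeds and the `cv_results` dictionary are hypothetical choices, and the printed numbers say nothing about the project's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: quadratic signal plus noise (not the project's data).
rng = np.random.default_rng(1)
rm = rng.uniform(3.5, 9.0, size=(200, 1))
price = 5.0 + 0.9 * (rm[:, 0] - 4.0) ** 2 + rng.normal(0.0, 3.0, size=200)

# Same shuffled folds for every degree, so the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_results = {}
for degree in range(1, 7):
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    r2 = cross_val_score(model, rm, price, cv=cv, scoring="r2")
    cv_results[degree] = (r2.mean(), r2.std())
    print(f"degree={degree}: CV R² = {r2.mean():.3f} ± {r2.std():.3f}")
```

A mean CV R² that stops rising, or a standard deviation that grows as the degree increases, would be the signal that additional degrees are fitting noise rather than structure.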
Feature Transformation Visualization
When to Use Polynomial Regression
Good Use Cases
- Single feature with clear non-linear pattern
- Visualizing curved relationships
- Educational purposes
Not Recommended
- When multiple features are available (use multivariate instead)
- High-dimensional data (risk of overfitting)
- Need for interpretability
Key Takeaways
- Polynomial features improve univariate models: Degree 3 achieves Test R² of 0.583 vs 0.458 for univariate linear regression
- Multivariate linear still wins: Multivariate linear regression (Test R² 0.710) outperforms both polynomial models
- No overfitting observed: Both degrees generalize well to test data
- Diminishing returns: Degree 3 only slightly better than degree 2
- Use more features: Adding features is more effective than polynomial transformation
Next Steps
- Linear Regression: Compare with the best-performing multivariate linear model
- Gradient Descent: Explore optimization techniques with SGDRegressor