
House Price Prediction

Master machine learning regression techniques with this comprehensive project. Train, evaluate, and compare multiple algorithms to predict house prices with real-world data.

Test R² Score: 0.850
Test RMSE: 3.349

Best Model: Decision Tree Regression

Quick Start

Get up and running with house price prediction in minutes

1. Install dependencies

Install the required Python packages using UV or pip.
uv sync
This project requires Python 3.12 or higher.
2. Run data analysis

Explore the Boston Housing dataset and understand feature correlations.
jupyter notebook notebooks/analyze.ipynb
The analysis notebook will show you:
  • Strong positive correlations (rm, average rooms per dwelling: 0.695)
  • Strong negative correlations (lstat, % lower-status population: -0.738)
  • Feature distributions and outliers
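The correlation step above can be sketched with pandas. Since `load_boston` was removed from recent scikit-learn releases, this sketch generates synthetic stand-ins for the `rm` and `lstat` columns rather than loading the project's actual dataframe; the numbers it prints are illustrative, not the notebook's:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Boston Housing dataframe: two features with
# known positive/negative relationships to the target (rm and lstat analogues).
rng = np.random.default_rng(0)
n = 506
rm = rng.normal(6.3, 0.7, n)      # average rooms per dwelling
lstat = rng.normal(12.7, 7.0, n)  # % lower-status population
medv = 5.0 * rm - 0.6 * lstat + rng.normal(0, 3.0, n)

df = pd.DataFrame({"rm": rm, "lstat": lstat, "medv": medv})

# Pearson correlation of every feature with the target, ranked by strength.
corr = df.corr()["medv"].drop("medv").sort_values(key=abs, ascending=False)
print(corr)
```

Sorting with `key=abs` ranks features by correlation strength regardless of sign, which is how the notebook surfaces rm as the top positive and lstat as the top negative predictor.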
3. Train models

Train multiple regression models and compare their performance.
jupyter notebook notebooks/train.ipynb
The training notebook implements:
  • Linear Regression (univariate and multivariate)
  • Polynomial Regression (degree 2 and 3)
  • Gradient Descent (SGDRegressor)
  • Decision Tree and Neural Network models
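The model lineup above can be sketched with scikit-learn. This is a minimal stand-in, not the notebook's code: the data is synthetic (since `load_boston` is no longer shipped with scikit-learn) and the hyperparameters are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the Boston Housing data (506 samples, 13 features).
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Linear Regression (Multivariate)": LinearRegression(),
    "Polynomial Regression (degree 2)": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()),
    "SGD (adaptive)": make_pipeline(
        StandardScaler(), SGDRegressor(learning_rate="adaptive", random_state=42)),
    "Decision Tree Regression": DecisionTreeRegressor(max_depth=5, random_state=42),
    "Neural Network Regression": make_pipeline(
        StandardScaler(), MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000,
                                       random_state=42)),
}

# Fit each model and score it on the held-out test split.
scores = {name: m.fit(X_train, y_train).score(X_test, y_test)
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name:35s} test R² = {r2:.3f}")
```

SGD and the MLP are wrapped with `StandardScaler` because both are sensitive to feature scale; the tree and plain least-squares models are not.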
4. Compare results

Evaluate and rank all trained models.
python compare_models.py
═══════════════════════════════════════════════════════════════════════════════
                              MODEL COMPARISON OVERVIEW
═══════════════════════════════════════════════════════════════════════════════
   Model Name                              Train R²  Test R²  CV R² Mean  Test RMSE
0  Decision Tree Regression                0.9277    0.8495   0.7239      3.3487
1  Neural Network Regression               0.8456    0.8058   0.7853      3.8044
2  SGD (adaptive)                          0.7421    0.7102   0.6901      4.6466
3  Linear Regression (Multivariate)        0.7432    0.7099   0.6880      4.6496
4  SGD (constant)                          0.7347    0.6940   0.6657      4.7747
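A script like compare_models.py plausibly builds that overview by collecting, for each fitted model, the train/test R², a 5-fold cross-validated R² mean, and the test RMSE, then sorting by test R². The sketch below shows that pattern on synthetic data with two of the models; it is an assumption about the script's shape, not its actual source:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

rows = []
for name, model in [
    ("Linear Regression (Multivariate)", LinearRegression()),
    ("Decision Tree Regression", DecisionTreeRegressor(max_depth=5, random_state=42)),
]:
    model.fit(X_tr, y_tr)
    rows.append({
        "Model Name": name,
        "Train R²": model.score(X_tr, y_tr),
        "Test R²": model.score(X_te, y_te),
        "CV R² Mean": cross_val_score(model, X_tr, y_tr, cv=5, scoring="r2").mean(),
        # RMSE as the square root of MSE.
        "Test RMSE": mean_squared_error(y_te, model.predict(X_te)) ** 0.5,
    })

# Rank models best-first by held-out R², as in the overview table.
table = pd.DataFrame(rows).sort_values("Test R²", ascending=False).reset_index(drop=True)
print(table.round(4))
```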

Explore by Topic

Deep dive into machine learning concepts and model implementations

Dataset Overview

Understand the Boston Housing dataset with 506 samples and 13 features

Feature Engineering

Learn about feature selection and correlation analysis techniques

Evaluation Metrics

Master MSE, RMSE, MAE, R², and cross-validation scoring
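On a toy set of predictions, all four point metrics come straight from `sklearn.metrics` (the numbers here are illustrative, not the project's):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Toy target values and predictions, in the same units (e.g. $1000s).
y_true = np.array([24.0, 21.6, 34.7, 33.4, 36.2])
y_pred = np.array([25.0, 20.0, 33.0, 35.0, 34.0])

mse = mean_squared_error(y_true, y_pred)   # mean of squared errors
rmse = mse ** 0.5                          # back in the target's units
mae = mean_absolute_error(y_true, y_pred)  # mean of |errors|, robust to outliers
r2 = r2_score(y_true, y_pred)              # 1 - SS_res / SS_tot, best is 1.0
print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R²={r2:.3f}")
```

RMSE and MAE share the target's units (which is why the overview reports RMSE), while R² is unitless: the fraction of target variance the model explains. Cross-validation averages any of these scores over several train/validation splits, as in the CV R² Mean column.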

Model Training

Step-by-step guide to training regression models

Available Models

Choose from multiple regression algorithms optimized for different scenarios

Linear Regression

Univariate and multivariate regression with feature selection. Solid baseline performance with R² = 0.710.
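The univariate-versus-multivariate distinction is just a matter of how many columns the model is fit on. A minimal sketch on synthetic stand-in data (not the project's features):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# 13 independent features, like the Boston data; y depends on all of them.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))
y = X @ rng.normal(size=13) + rng.normal(scale=0.5, size=100)

uni = LinearRegression().fit(X[:, [0]], y)   # univariate: a single feature
multi = LinearRegression().fit(X, y)         # multivariate: all 13 features
print(f"univariate R²={uni.score(X[:, [0]], y):.3f}  "
      f"multivariate R²={multi.score(X, y):.3f}")
```

Because the multivariate fit nests the univariate one, its training R² can only be equal or higher; feature selection then asks which of those extra columns actually earn their keep on held-out data.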


Polynomial Regression

Capture non-linear relationships with degree 2 and degree 3 polynomial features.
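Polynomial regression stays a linear model; it just expands the inputs with powers and interaction terms first. A minimal sketch on a synthetic quadratic target (data and hyperparameters are illustrative):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# A quadratic target that a straight line cannot capture well.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(200, 1))
y = (0.5 * x**2 - x + 2).ravel() + rng.normal(scale=0.2, size=200)

linear = LinearRegression().fit(x, y)
# PolynomialFeatures(degree=2) adds x² (and would add cross terms for >1 feature).
poly2 = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)
print(f"linear R²={linear.score(x, y):.3f}  degree-2 R²={poly2.score(x, y):.3f}")
```

Degree 3 works the same way with `PolynomialFeatures(degree=3)`; higher degrees fit training data ever more closely, so the cross-validation column is what guards against overfitting.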


Gradient Descent

SGDRegressor with constant and adaptive learning rates for iterative optimization.
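The two learning-rate schedules differ in how the step size evolves: `constant` keeps `eta0` fixed, while `adaptive` divides it by 5 whenever the loss stops improving. A minimal sketch on synthetic data (the `eta0` value is illustrative, not the project's setting):

```python
from sklearn.datasets import make_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=42)

# SGD is scale-sensitive, so standardize features before fitting.
constant = make_pipeline(
    StandardScaler(),
    SGDRegressor(learning_rate="constant", eta0=0.01, random_state=42))
adaptive = make_pipeline(
    StandardScaler(),
    SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=42))

const_r2 = constant.fit(X, y).score(X, y)
adapt_r2 = adaptive.fit(X, y).score(X, y)
print(f"constant R²={const_r2:.3f}  adaptive R²={adapt_r2:.3f}")
```

The adaptive schedule's edge in the comparison table is typical: shrinking the step near convergence reduces the oscillation that a fixed rate produces around the minimum.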


Advanced Models

Explore Decision Tree and Neural Network (MLP) regression approaches.
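The train-versus-test gap in the comparison table (Decision Tree: 0.9277 train vs 0.8495 test) is characteristic of trees, which can memorize training data outright if left unconstrained. A minimal sketch on synthetic data showing that behavior alongside an MLP (hyperparameters are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.compose import TransformedTargetRegressor

X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A fully grown tree memorizes the training set (train R² = 1.0) but
# generalizes worse; max_depth is the usual brake on this.
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# MLP regression; standardizing both X and the target helps the optimizer.
mlp = TransformedTargetRegressor(
    regressor=make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)),
    transformer=StandardScaler(),
).fit(X_tr, y_tr)

print(f"tree:  train R²={tree.score(X_tr, y_tr):.3f}  test R²={tree.score(X_te, y_te):.3f}")
print(f"mlp:   train R²={mlp.score(X_tr, y_tr):.3f}  test R²={mlp.score(X_te, y_te):.3f}")
```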


Ready to start predicting?

Follow the quickstart guide to train multiple models and reach a test R² of 0.850 with Decision Tree Regression.

Get Started Now
