Welcome to the Lead Scoring Model
The Lead Scoring Model is a machine learning solution that predicts the probability of lead conversion for Event Management SaaS applications. Built with Gradient Boosting, this model achieves 90.4% accuracy in classifying leads into conversion outcomes.
Key Capabilities
High Accuracy Predictions
Gradient Boosting model achieving 90.4% accuracy with 0.91 cross-validation score
Multi-Model Comparison
Compares 12 classification algorithms including Random Forest, AdaBoost, SVM, and Neural Networks
Advanced Data Preprocessing
Automated data fusion, missing value imputation, label encoding, and feature scaling
Production Ready
Integrated with Shimoku API for dashboard visualization and real-time predictions
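The preprocessing steps named above (missing value imputation, label encoding, and feature scaling) can be sketched roughly as follows. This is a minimal illustration with pandas and scikit-learn, not the project's actual pipeline: the column names `Source` and `Price` and the toy values are assumptions, as are the choices of mode and median for imputation.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data; the real schema may differ.
df = pd.DataFrame({
    "Source": ["Inbound", None, "Outbound", "Inbound"],
    "Price": [300.0, 150.0, None, 450.0],
})

# Imputation: most frequent value for categories, median for numbers.
df["Source"] = df["Source"].fillna(df["Source"].mode()[0])
df["Price"] = df["Price"].fillna(df["Price"].median())

# Label encoding: map each category to an integer code.
df["Source"] = LabelEncoder().fit_transform(df["Source"])

# Feature scaling: zero mean and unit variance for numeric columns.
df[["Price"]] = StandardScaler().fit_transform(df[["Price"]])

print(df)
```

In a real training run, the imputation values, encoder, and scaler would normally be fitted on the training split only and then reused on validation and prediction data.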
How It Works
The model processes two primary datasets:
- Leads Data - Information about all potential clients including source, use case, acquisition campaign, and demographic details
- Offers Data - Details about clients who reached the demo meeting stage, including pricing, discount codes, pain points, and conversion status
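As a rough sketch of how the two datasets described above might be fused, the example below joins them with pandas. It assumes both tables share a lead identifier column, here called `Id`; that name, the other column names, and the toy rows are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions, not the real schema.
leads = pd.DataFrame({
    "Id": ["a1", "a2", "a3"],
    "Source": ["Inbound", "Outbound", "Inbound"],
    "Use Case": ["Corporate Events", "Conferences", "Sports Events"],
})
offers = pd.DataFrame({
    "Id": ["a1", "a3"],
    "Price": [300.0, 150.0],
    "Status": ["Closed Won", "Closed Lost"],
})

# Offers exist only for leads that reached the demo stage, so an inner join
# keeps exactly the leads with a known outcome. validate= raises an error
# if an Id is duplicated on either side.
merged = leads.merge(offers, on="Id", how="inner", validate="one_to_one")

print(merged)
```

An inner join drops leads that never reached a demo; a left join would keep them with missing offer fields, which then need imputation before training.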
Model Architecture
The system evaluates 12 different classification algorithms:
- Ensemble Methods: Random Forest, AdaBoost, Extra Trees, Bagging Classifier, Gradient Boosting
- Tree-based: Decision Tree
- Probabilistic: Naive Bayes
- Instance-based: K-Nearest Neighbors
- Linear Models: Logistic Regression, SGD Classifier
- Neural Networks: Multi-Layer Perceptron
- Support Vector Machines: SVM
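A comparison across the algorithms listed above could look roughly like the sketch below, which scores each scikit-learn classifier with 5-fold cross-validation. It runs on synthetic data from `make_classification` rather than the real leads and offers data, and the hyperparameters, fold count, and choice of `GaussianNB` as the Naive Bayes variant are assumptions, so the scores it prints say nothing about the figures reported on this page.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class stand-in for the preprocessed lead features.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SGD Classifier": SGDClassifier(random_state=0),
    "Multi-Layer Perceptron": MLPClassifier(max_iter=500, random_state=0),
    "SVM": SVC(),
}

# Mean accuracy over 5 folds for each candidate model.
scores = {name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
          for name, model in models.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {score:.3f}")

# Per-class probabilities from one fitted model, one row per lead.
best = GradientBoostingClassifier(random_state=0).fit(X, y)
proba = best.predict_proba(X[:1])
print(proba)
```

Accuracy is used here because it is the metric quoted on this page; with imbalanced classes, a metric such as macro-averaged F1 may be worth tracking alongside it.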
Target Classification
The model classifies leads into three categories:
- Closed Won - Leads that converted to paying customers
- Closed Lost - Leads that did not convert
- Other - Leads in intermediate states (grouped from minority classes to handle class imbalance)
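The grouping of minority classes into Other can be sketched as below. This assumes the raw outcome is stored as a text status; the intermediate status names `Proposal Sent` and `Negotiation` are hypothetical examples, not values taken from the real data.

```python
import pandas as pd

# Hypothetical raw statuses; only the two "Closed" labels come from this page.
status = pd.Series([
    "Closed Won", "Closed Lost", "Closed Lost", "Proposal Sent",
    "Closed Won", "Negotiation", "Closed Lost",
])

main_classes = ["Closed Won", "Closed Lost"]

# Keep the two main outcomes and relabel every other status as "Other".
target = status.where(status.isin(main_classes), "Other")

print(target.value_counts())
```

Collapsing rare statuses into a single class gives the classifier enough examples per class to learn from, at the cost of no longer distinguishing between the intermediate states.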
Get Started
Installation
Set up the project and install dependencies
Quickstart
Train your first model and make predictions