The GradientBoosting model achieved excellent performance on the test set with 90.4% accuracy, demonstrating strong predictive capability for lead conversion.

Performance Summary

Test Accuracy

90.4% accuracy on the held-out test set (20% of data)

Cross-Validation Score

0.91 mean 5-fold CV score on the training data
The close alignment between cross-validation score (0.91) and test accuracy (90.4%) indicates excellent model generalization without overfitting.
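A quick way to sanity-check this in code is to compare the two numbers directly (a minimal sketch, reusing the accuracy and mean_score variables computed in the snippets further down this page):

# Generalization gap between 5-fold CV and the held-out test set
gap = mean_score - accuracy  # 0.91 - 0.904 = 0.006
print(f"CV/test gap: {gap:.3f}")  # a small gap suggests no overfitting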

Evaluation Metrics

The model is evaluated along several dimensions after training:
from sklearn.metrics import accuracy_score, classification_report

# Evaluate model performance on the test set
self.logger.info("Accuracy Score: %s", accuracy_score(y_test, y_pred))
self.logger.info("\nClassification Report:\n%s", classification_report(y_test, y_pred))

Accuracy Score

Accuracy measures the proportion of correct predictions across all classes:
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_test, y_pred)
# Result: 0.904 (90.4%)
The model correctly predicts the lead status (Closed Won, Closed Lost, or Other) for 90.4% of test cases.

Classification Report

The classification report provides detailed metrics for each class:

Metrics Breakdown

Precision measures how many predicted positive cases are actually positive. Formula: Precision = True Positives / (True Positives + False Positives). High precision means fewer false alarms when predicting “Closed Won” leads.
Recall measures how many actual positive cases are correctly identified. Formula: Recall = True Positives / (True Positives + False Negatives). High recall ensures the model doesn’t miss many “Closed Won” opportunities.
F1-Score is the harmonic mean of precision and recall. Formula: F1 = 2 × (Precision × Recall) / (Precision + Recall). It provides a balanced measure that is especially important for imbalanced classes.
Support is the number of actual occurrences of each class in the test set. It helps gauge the class distribution and how reliable each class’s metrics are. A worked example of these formulas follows below.
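To make the formulas concrete, here is a small worked example with hypothetical counts for the “Closed Won” class (the numbers are illustrative, not taken from the actual test set):

# Hypothetical confusion-matrix counts for the "Closed Won" class
tp, fp, fn = 40, 5, 8  # illustrative values only

precision = tp / (tp + fp)  # 40 / 45 ≈ 0.889
recall = tp / (tp + fn)     # 40 / 48 ≈ 0.833
f1 = 2 * (precision * recall) / (precision + recall)  # ≈ 0.860

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")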

Classification Report Generation

from sklearn.metrics import classification_report

report = classification_report(y_test, y_pred)
print(report)
The report provides precision, recall, F1-score, and support for each class:
  • Closed Lost (Class 0)
  • Closed Won (Class 1)
  • Other (Class 2)
Detailed classification metrics for each class are logged to reports/model_training.log.
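When the per-class numbers are needed programmatically rather than as a printed table, classification_report can also return a dictionary; a minimal sketch, assuming the same y_test and y_pred as above:

from sklearn.metrics import classification_report

# output_dict=True returns nested dicts instead of a formatted string
report_dict = classification_report(y_test, y_pred, output_dict=True)

# Per-class keys are the string form of each label; '1' is Closed Won
closed_won = report_dict['1']
print(closed_won['precision'], closed_won['recall'], closed_won['f1-score'])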

Cross-Validation Methodology

The model uses 5-fold cross-validation during the selection phase:
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Chain the shared column transformer (ct) with each candidate model
pipeline = Pipeline([('transformer', ct), (name, model)])
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5)
mean_score = np.mean(cv_scores)

Cross-Validation Process

  1. Data Splitting: Training data is divided into 5 equal folds (subsets).
  2. Iterative Training: For each of the 5 iterations:
     • Train on 4 folds (80% of the training data)
     • Validate on the remaining fold (20% of the training data)
  3. Score Calculation: Record the accuracy score for each fold.
  4. Mean Score: Average the 5 scores to get the final cross-validation score: 0.91.
GradientBoosting’s CV score of 0.91 was the highest among all 12 models tested, leading to its selection.
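For intuition, the same four steps can be written out by hand. The following is a simplified sketch of what cross_val_score does internally (note that for classifiers it actually defaults to stratified folds; pipeline is the object built in the snippet above, and X_train/y_train are assumed to be pandas objects):

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

kf = KFold(n_splits=5)
fold_scores = []

for train_idx, val_idx in kf.split(X_train):
    # Steps 1-2: fit a fresh copy of the pipeline on the 4 training folds
    fold_model = clone(pipeline)
    fold_model.fit(X_train.iloc[train_idx], y_train.iloc[train_idx])
    # Step 3: record accuracy on the held-out validation fold
    fold_scores.append(fold_model.score(X_train.iloc[val_idx], y_train.iloc[val_idx]))

# Step 4: average the five fold scores
mean_score = np.mean(fold_scores)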

Probability Predictions

Beyond class labels, the model produces per-class probabilities that are used for lead scoring:
# Probabilities for each class
y_probabilities = best_model.predict_proba(X_test)
# Shape: (n_samples, 3) - probabilities for Closed Lost, Closed Won, Other

# Get the predicted class (highest probability)
y_predicted = np.argmax(y_probabilities, axis=1)
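Note that the column order of predict_proba follows the fitted model's classes_ attribute, so it is worth confirming it before indexing into the probability array; a short check (assuming the numeric labels 0, 1, 2 used in the class mapping below):

# Columns of y_probabilities line up with best_model.classes_
print(best_model.classes_)  # expected: [0 1 2]

# Probability of "Closed Won" (label 1) for every test lead
closed_won_idx = list(best_model.classes_).index(1)
closed_won_proba = y_probabilities[:, closed_won_idx]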

Probability Distribution

Leads are categorized into probability ranges:
import pandas as pd

# Bin each lead by its Closed-Won probability; each label marks the
# upper bound of its range (e.g. '25%' covers 0-25%)
probability_bins = pd.cut(
    impact_df['Probability Closed-Won'],
    bins=[0, 0.25, 0.5, 0.75, 1.0],
    labels=['25%', '50%', '75%', '100%']
)
probability_distribution = probability_bins.value_counts().sort_index()
  • 0-25%: Low conversion probability
  • 25-50%: Medium-low conversion probability
  • 50-75%: Medium-high conversion probability
  • 75-100%: High conversion probability

Prediction Output

The model generates detailed predictions for each lead:
impact_df = pd.DataFrame({
    'Observation': range(1, len(X_test_original) + 1),
    'Use Case': X_test_original['Use Case'],
    'Discount code': X_test_original['Discount code'],
    'Loss Reason': X_test_original['Loss Reason'],
    'Source': X_test_original['Source'],
    'City': X_test_original['City'],
    'Predicted Class': y_predicted_mapped,
    # Column 1 of predict_proba corresponds to class label 1 (Closed Won)
    'Probability Closed-Won': y_probabilities[:, 1],
})

Class Mapping

mapping = {0: 'Closed Lost', 1: 'Closed Won', 2: 'Other'}
y_predicted_mapped = np.vectorize(mapping.get)(y_predicted)
Numeric predictions are converted to readable labels for business interpretation.

Evaluation Results Data

The training pipeline returns comprehensive evaluation data:
data = {
    "predictions_df": predictions_df,
    "probability_distribution": probability_distribution,
    "accuracy_score": accuracy_score(y_test, y_pred),
}
return data

Predictions DataFrame

Individual predictions with features and probabilities

Probability Distribution

Count of leads in each probability range

Accuracy Score

Overall model accuracy: 90.4%
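A downstream caller might consume this dictionary as follows (a hypothetical sketch: results stands in for the dict returned by the pipeline, and the output path is illustrative):

# 'results' is the evaluation dict returned by the training pipeline
print(f"Test accuracy: {results['accuracy_score']:.1%}")
print(results['probability_distribution'])

# Persist per-lead predictions for business review (illustrative path)
results['predictions_df'].to_csv('reports/lead_predictions.csv', index=False)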

Model Performance Highlights

Excellent Generalization: CV score (0.91) and test accuracy (90.4%) are closely aligned
Multi-Class Performance: Effective prediction across all three status categories
Probability Scoring: Generates per-class probability scores for lead prioritization
Reproducible Results: All metrics logged to model_training.log for tracking

Running Evaluation

To train and evaluate the model:
python3 -m src.models.train_model
This command executes the complete training pipeline:
  1. Loads processed data from data/processed/full_dataset.csv
  2. Splits data into train/test sets
  3. Compares all 12 classification models
  4. Selects the best model (GradientBoosting)
  5. Trains on full training set
  6. Evaluates on test set
  7. Logs all metrics to reports/model_training.log

Next Steps

Model Selection

Learn how GradientBoosting was selected

Training Overview

Review the complete training pipeline
