Overview
The training pipeline orchestrates the end-to-end model training process, including data loading, preprocessing, model initialization, training loop execution, evaluation, and artifact logging with MLflow.

Core Functions
train()
Main training orchestration function that coordinates the entire training workflow.

Parameters: parsed command-line arguments containing:
- config: Path to the YAML configuration file

Location: training/training.py:51
Workflow Steps

1. Configuration Loading
   - Loads YAML configuration using `load_config()`
   - Extracts configuration name from file path
   - Sets up MLflow experiment tracking

2. Data Pipeline
   - Loads training dataset from CSV
   - Applies preprocessing transformations
   - Saves preprocessor for inference
   - Splits into train/test sets
   - Converts to PyTorch tensors

3. Model Initialization
   - Creates `ModelConfig` from YAML parameters
   - Initializes `CreditScoreModel` with configuration
   - Sets up `BCEWithLogitsLoss` criterion
   - Configures AdamW optimizer

4. Training Loop
   - Iterates for specified epochs
   - Processes data in batches via DataLoader
   - Performs forward pass, loss calculation, backward pass
   - Updates model weights with optimizer
   - Logs metrics to MLflow at each epoch

5. Evaluation & Artifacts
   - Generates predictions on test set
   - Computes evaluation metrics
   - Creates visualization artifacts
   - Saves model weights
   - Logs all artifacts to MLflow
load_config()
Loads and parses YAML configuration files for training experiments.

Parameters: absolute or relative path to the YAML configuration file
Returns: dict - parsed configuration dictionary
Location: training/training.py:40

Configuration files must be valid YAML. Invalid files will raise parsing exceptions.
Training Loop Details
Epoch Iteration
The training loop runs for the number of epochs specified in configuration. Two metrics are computed per epoch:
- train_loss: Average loss across all batches
- train_accuracy: Proportion of correct predictions
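The per-epoch bookkeeping behind these two metrics can be sketched in plain Python. The actual loop operates on PyTorch tensors from a DataLoader; the helper name and signature here are illustrative only:

```python
def epoch_metrics(batch_losses, batch_correct, batch_sizes):
    """Aggregate per-batch statistics into the two per-epoch metrics.

    batch_losses  - loss value for each batch
    batch_correct - number of correct predictions in each batch
    batch_sizes   - number of samples in each batch
    """
    # train_loss: average loss across all batches
    train_loss = sum(batch_losses) / len(batch_losses)
    # train_accuracy: proportion of correct predictions over all samples
    train_accuracy = sum(batch_correct) / sum(batch_sizes)
    return train_loss, train_accuracy
```

In the real loop, values like these would be logged once per epoch with `mlflow.log_metric("train_loss", ...)` and `mlflow.log_metric("train_accuracy", ...)`.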
Evaluation Metrics
After training completes, the model is evaluated on the test set with the following metrics:
- Accuracy: Overall classification accuracy
- ROC AUC: Area Under the ROC Curve; measures discriminative ability
- Precision: Proportion of positive predictions that are correct
- Recall: Proportion of actual positives correctly identified
- F1 Score: Harmonic mean of precision and recall
Evaluation Process
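A pure-Python sketch of how most of these metrics follow from test-set predictions. The actual pipeline most likely uses `sklearn.metrics`; ROC AUC additionally requires predicted probabilities rather than hard labels, so it is omitted here:

```python
def binary_classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Precision: proportion of positive predictions that are correct
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: proportion of actual positives correctly identified
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```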
Artifacts Generated
The training pipeline generates and logs the following artifacts to MLflow:

Visualizations
- Confusion Matrix (`confusion_matrix.png`)
  - Heatmap showing true vs predicted labels
  - Annotated with counts
- ROC Curve (`roc_curve.png`)
  - True Positive Rate vs False Positive Rate
  - Includes AUC score in legend
- Precision-Recall Curve (`precision_recall_curve.png`)
  - Trade-off between precision and recall

Reports
- Classification Report (`classification_report.txt`)
  - Per-class precision, recall, F1-score
  - Support counts for each class

Model Files
- Model Weights (`model_weights_001.pth`)
  - PyTorch state dictionary
  - Saved to `model/` directory
  - Name derived from configuration file
- Preprocessor (`preprocessor.joblib`)
  - Fitted preprocessing pipeline
  - Saved to `processing/` directory
  - Required for inference
All artifacts are automatically logged to the active MLflow run and can be retrieved for model deployment.
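As an illustration of the confusion-matrix artifact, the underlying counts can be computed as below. The pipeline itself presumably renders these counts as an annotated heatmap (e.g. with matplotlib/seaborn) and logs the image file via `mlflow.log_artifact`; this helper is a hypothetical sketch, not the pipeline's actual code:

```python
def confusion_counts(y_true, y_pred):
    """2x2 confusion matrix for binary labels: [[tn, fp], [fn, tp]].

    Row index = true label, column index = predicted label, so each
    cell holds the count that the heatmap would be annotated with.
    """
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix
```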
MLflow Integration
The pipeline uses MLflow for comprehensive experiment tracking.

Experiment Setup
Tracked Parameters
All configuration parameters are logged:
- hidden_layers
- activation_functions
- dropout_rate
- learning_rate
- epochs
- batch_size
- config_file
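A sketch of collecting these parameters from the loaded config before handing them to `mlflow.log_params()`. The flat config layout and the helper name are assumptions for illustration:

```python
def collect_tracked_params(config: dict, config_file: str) -> dict:
    """Build the flat dict of tracked parameters for mlflow.log_params."""
    keys = ["hidden_layers", "activation_functions", "dropout_rate",
            "learning_rate", "epochs", "batch_size"]
    params = {k: config.get(k) for k in keys}
    params["config_file"] = config_file  # path passed on the command line
    return params
```

With MLflow available, `mlflow.log_params(collect_tracked_params(config, args.config))` would record all of them on the active run.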
Dependencies
Required Imports
Internal Modules
- `config.logs_configs.logging_config`: Logging setup
- `model.model`: CreditScoreModel and ModelConfig
- `processing.preprocessor`: Data loading and preprocessing
Error Handling
The training pipeline logs all steps using the configured logger; check the logs for detailed error messages if training fails. Common failure causes include:
- Invalid configuration file path
- Missing dataset file
- Insufficient memory for batch size
- MLflow connection errors
- Invalid model configuration parameters
