
Overview

This quickstart guide will help you set up the Credit Score AI Engine, train your first deep learning model for credit risk prediction, and deploy it as a production-ready API. The Credit Score AI Engine is built with PyTorch and demonstrates MLOps best practices for the complete AI lifecycle: from data preprocessing to model deployment.
What you’ll build: A credit risk prediction system that evaluates loan applicants and returns risk scores (good/bad) with confidence probabilities.

Prerequisites

Before you begin, ensure you have:
  • Python 3.10+ installed
  • Git for version control
  • UV package manager (recommended) or pip
  • Docker (optional, for containerized deployment)

Installation

1

Clone the repository

First, clone the data science services repository:
git clone <repository-url>
cd data_science_services/python-projects/credit-score
2

Install dependencies

Using UV (recommended for faster installation):
uv sync
Or using pip:
pip install -r requirements.txt
The project uses modern Python packaging with pyproject.toml and uv.lock for deterministic builds.
Key dependencies installed:
  • torch==2.10.0 - Deep learning framework
  • fastapi==0.128.0 - High-performance API framework
  • scikit-learn==1.7.2 - Data preprocessing
  • pydantic==2.12.5 - Data validation
  • mlflow - Experiment tracking
3

Start MLflow tracking server

Launch MLflow UI to monitor training metrics in real-time:
uv run mlflow ui
Access the dashboard at: http://127.0.0.1:5000
(Screenshot: MLflow Dashboard)

Environment Setup

Project Structure

Understand the key directories:
credit-score/
├── config/
│   └── models-configs/          # Model hyperparameters (YAML)
├── model/
│   └── model.py                 # PyTorch neural network architecture
├── processing/
│   └── preprocessor.py          # Data transformation pipeline
├── training/
│   └── training.py              # Training orchestration
├── inference/
│   └── inference.py             # Inference engine (singleton)
├── server/
│   ├── api.py                   # FastAPI REST endpoints
│   └── schemas.py               # Pydantic data contracts
└── examples/
    └── client_web/              # Demo web interface

Configuration Files

Model configurations are defined in YAML for experiment flexibility:
config/models-configs/model_config_001.yaml
hyperparameters:
  learning_rate: 0.001
  batch_size: 64
  epochs: 20
  dropout_rate: 0.2
  hidden_layers: [128, 64, 32]
  activation_functions: ["relu", "relu", "relu"]
Never hardcode hyperparameters in training code. Always use configuration files for reproducibility.

Training Your First Model

1

Prepare the dataset

The German Credit Risk dataset is managed with DVC (Data Version Control):
# Navigate to datasets directory
cd ../../../datasets/credit_score_dataset/

# Pull the latest dataset version
dvc pull german_credit_risk_v1.0.0_training_23012026.csv.dvc
The dataset includes features like:
  • Age, Sex, Job level
  • Housing status
  • Saving/Checking accounts
  • Credit amount and duration
  • Loan purpose
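These features mix numeric and categorical columns, so they must be scaled and encoded before training. A hedged sketch of a typical scikit-learn pipeline; the real logic lives in processing/preprocessor.py, and the column split below is illustrative:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative column split (the project's preprocessor may differ)
NUMERIC = ["Age", "Job", "Credit amount", "Duration"]
CATEGORICAL = ["Sex", "Housing", "Saving accounts", "Checking account", "Purpose"]

preprocessor = ColumnTransformer([
    # Scale numeric features to zero mean / unit variance
    ("num", StandardScaler(), NUMERIC),
    # One-hot encode categoricals; ignore unseen categories at inference time
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])

df = pd.DataFrame([
    {"Age": 35, "Sex": "male", "Job": 2, "Housing": "own",
     "Saving accounts": "little", "Checking account": "little",
     "Credit amount": 9055.0, "Duration": 36, "Purpose": "education"},
    {"Age": 22, "Sex": "female", "Job": 1, "Housing": "rent",
     "Saving accounts": "rich", "Checking account": "moderate",
     "Credit amount": 1200.0, "Duration": 12, "Purpose": "radio/TV"},
])
X = preprocessor.fit_transform(df)
print(X.shape)  # one row per applicant, one column per scaled/encoded feature
```

The fitted transformer is what gets persisted as preprocessor.joblib, so inference applies exactly the same encoding.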
2

Run the training pipeline

Navigate back to the project and execute training:
cd ../python-projects/credit-score/
uv run training/training.py --config config/models-configs/model_config_001.yaml
What happens during training:
# Load and preprocess data
df = load_data(dataset_path)
X_train, X_test, y_train, y_test = preprocess_data(df, save_path=preprocessor_path)

# Convert to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)

# Initialize model
model = CreditScoreModel(model_config)

# Training loop
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.AdamW(model.parameters(), lr=config["learning_rate"])

for epoch in range(epochs):
    model.train()
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
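Because training uses `BCEWithLogitsLoss`, the model emits raw logits, and evaluation needs an explicit sigmoid to recover probabilities. A minimal sketch of the evaluation step, with a plain `nn.Linear` standing in for `CreditScoreModel`:

```python
import torch
import torch.nn as nn

# Stand-in for the trained CreditScoreModel (4 input features, 1 logit out)
model = nn.Linear(4, 1)
model.eval()
with torch.no_grad():                      # no gradient tracking during evaluation
    logits = model(torch.randn(8, 4))      # raw scores, unbounded
    probs = torch.sigmoid(logits)          # P(good), squashed into [0, 1]
    preds = (probs >= 0.5).float()         # threshold at 0.5 for the class label
print(probs.shape)
```

Keeping the sigmoid out of the model's forward pass is intentional: `BCEWithLogitsLoss` combines sigmoid and loss in one numerically stable operation.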
3

Monitor training progress

Watch metrics in real-time via MLflow UI:
  • Train Loss & Accuracy - Per epoch
  • Test Metrics - ROC AUC, Precision, Recall, F1
  • Visualizations - Confusion matrix, ROC curve, Precision-Recall curve
All artifacts are automatically logged:
mlflow.log_metric("train_loss", epoch_loss, step=epoch)
mlflow.log_metric("test_roc_auc", roc_auc)
mlflow.log_figure(plt.gcf(), "confusion_matrix.png")
mlflow.log_artifact(model_save_path)
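The test metrics themselves can be computed with scikit-learn before being handed to MLflow. An illustrative sketch with made-up labels and probabilities:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_score, recall_score, f1_score

# Dummy ground truth and predicted P(good), for illustration only
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.45, 0.6, 0.4, 0.8, 0.3, 0.55])
y_pred = (y_prob >= 0.5).astype(int)   # same 0.5 threshold used at inference

metrics = {
    "test_roc_auc": roc_auc_score(y_true, y_prob),    # ranking metric: uses probabilities
    "test_precision": precision_score(y_true, y_pred),
    "test_recall": recall_score(y_true, y_pred),
    "test_f1": f1_score(y_true, y_pred),
}
# Each value could then be logged via mlflow.log_metric(name, value)
print(metrics)
```

Note that ROC AUC is computed from the probabilities, not the thresholded labels; thresholding first would discard the ranking information the metric measures.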
4

Verify model artifacts

After training completes, verify saved artifacts:
# Model weights
ls model/model_weights_001.pth

# Preprocessor (for inference)
ls processing/preprocessor.joblib

# MLflow experiments
ls mlruns/

Running Inference

Start the API Server

Launch the FastAPI inference server:
uv run uvicorn server.api:app --reload --port 8000
On startup, the API loads the trained model and preprocessor once via a singleton pattern, so every request reuses the same in-memory artifacts instead of paying the loading cost repeatedly.
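A minimal sketch of the singleton idea; the real implementation in inference/inference.py will differ, and the artifact loading is stubbed out here:

```python
class InferenceEngine:
    """Loads model artifacts once per process and reuses them across requests."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._load_artifacts()  # runs exactly once
        return cls._instance

    def _load_artifacts(self):
        # The real engine would load model_weights_001.pth and
        # preprocessor.joblib here; stubbed for illustration.
        self.model = object()


engine_a = InferenceEngine()
engine_b = InferenceEngine()
print(engine_a is engine_b)  # True: both names refer to the same loaded engine
```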

API Documentation

Access interactive Swagger docs at: http://localhost:8000/docs

Make Predictions

curl -X POST "http://localhost:8000/credit_score_prediction" \
  -H "Content-Type: application/json" \
  -d '{
    "Age": 35,
    "Sex": "male",
    "Job": "skilled",
    "Housing": "own",
    "Saving accounts": "NA",
    "Checking account": "little",
    "Credit amount": 9055.0,
    "Duration": 36,
    "Purpose": "education"
  }'
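The same request can be made from Python with the third-party `requests` library. The `predict` helper below is illustrative, and it assumes the server from the previous step is running on port 8000:

```python
import requests

API_URL = "http://localhost:8000/credit_score_prediction"


def predict(applicant: dict, url: str = API_URL) -> dict:
    """POST one applicant to the prediction endpoint and return the JSON body."""
    resp = requests.post(url, json=applicant, timeout=10)
    resp.raise_for_status()  # surface 4xx/5xx errors instead of parsing them
    return resp.json()


applicant = {
    "Age": 35, "Sex": "male", "Job": "skilled", "Housing": "own",
    "Saving accounts": "NA", "Checking account": "little",
    "Credit amount": 9055.0, "Duration": 36, "Purpose": "education",
}

# With the server running:
# result = predict(applicant)
# print(result["prediction"], result["probability"])
```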

API Response

{
  "prediction": "good",
  "probability": 0.8523
}
Response fields:
  • prediction: Credit risk assessment ("good" or "bad")
  • probability: Confidence score for "good" risk (0.0 to 1.0)
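Note that the JSON keys contain spaces ("Saving accounts", "Credit amount"), which Pydantic can map onto valid Python attribute names via field aliases. A hedged sketch of what the contracts in server/schemas.py might look like; the field names and types here are assumptions:

```python
from pydantic import BaseModel, Field


class CreditApplication(BaseModel):
    """Illustrative request contract; the real one lives in server/schemas.py."""
    age: int = Field(alias="Age")
    sex: str = Field(alias="Sex")
    job: str = Field(alias="Job")
    housing: str = Field(alias="Housing")
    saving_accounts: str = Field(alias="Saving accounts")     # alias handles the space
    checking_account: str = Field(alias="Checking account")
    credit_amount: float = Field(alias="Credit amount")
    duration: int = Field(alias="Duration")
    purpose: str = Field(alias="Purpose")


class PredictionResponse(BaseModel):
    prediction: str      # "good" or "bad"
    probability: float   # confidence for "good", 0.0 to 1.0


app = CreditApplication.model_validate({
    "Age": 35, "Sex": "male", "Job": "skilled", "Housing": "own",
    "Saving accounts": "NA", "Checking account": "little",
    "Credit amount": 9055.0, "Duration": 36, "Purpose": "education",
})
print(app.credit_amount)
```

FastAPI validates incoming JSON against the request model automatically, so malformed payloads are rejected with a 422 before they ever reach the model.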

Web Interface Demo

Launch the interactive web client for testing:
uv run uvicorn examples.client_web.main:app --reload --port 3000
Access at: http://localhost:3000
The web interface provides:
  • Form-based input for credit applications
  • Real-time predictions
  • Visual risk indicators

Docker Deployment (Production)

For production deployment with Docker:
1

Build and start services

docker-compose up --build
This orchestrates:
  • API Service (port 8000)
  • Web Client (port 3000)
2

Verify services

# Check running containers
docker-compose ps

# View API logs
docker-compose logs api
3

Access services

  • API: http://localhost:8000/docs
  • Web App: http://localhost:3000
Docker images use multi-stage builds for optimized production sizes.

Model Architecture Overview

The PyTorch model implements a deep neural network:
model/model.py (excerpt)
class CreditScoreModel(nn.Module):
    def __init__(self, config: ModelConfig):
        super().__init__()
        layers = []
        in_dim = config.input_dim  # number of preprocessed input features

        # Build hidden layers
        for hidden_dim, act_fn in zip(config.hidden_layers, config.activation_functions):
            layers.append(nn.Linear(in_dim, hidden_dim))
            layers.append(nn.BatchNorm1d(hidden_dim))  # Stabilizes training
            layers.append(get_activation(act_fn))      # ReLU/GELU/etc.
            layers.append(nn.Dropout(config.dropout_rate))  # Prevents overfitting
            in_dim = hidden_dim

        # Output layer (raw logit; paired with BCEWithLogitsLoss during training)
        layers.append(nn.Linear(in_dim, 1))
        self.model = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.model(x)
Key features:
  • Batch Normalization - Accelerates convergence
  • Dropout Regularization - Reduces overfitting
  • Configurable Activations - ReLU, GELU, LeakyReLU support
  • He/Xavier Initialization - Optimal weight initialization per activation type
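The activation-aware initialization can be sketched as follows: He (Kaiming) init for ReLU-family activations, Xavier (Glorot) otherwise. The helper name and dispatch rule below are illustrative, not the project's exact code:

```python
import torch.nn as nn


def init_weights(module: nn.Module, activation: str) -> None:
    """He init suits ReLU-family activations; Xavier suits tanh/sigmoid-like ones."""
    if not isinstance(module, nn.Linear):
        return
    if activation in ("relu", "leaky_relu", "gelu"):
        # Variance scaled for the dying-half behaviour of ReLU-like units
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
    else:
        # Variance balanced between fan-in and fan-out for symmetric activations
        nn.init.xavier_normal_(module.weight)
    nn.init.zeros_(module.bias)


layer = nn.Linear(128, 64)
init_weights(layer, "relu")
print(layer.bias.abs().sum().item())  # 0.0: biases start at zero
```

Matching the initialization scheme to the activation keeps pre-activation variance roughly constant across layers, which is what makes deeper stacks trainable from the first epoch.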

Next Steps

Model Configuration

Learn how to tune hyperparameters and experiment with different architectures

API Reference

Explore all available API endpoints and schemas

Production Deployment

Deploy to cloud platforms with best practices

MLOps Architecture

Understand the MLOps principles and architecture patterns

Troubleshooting

Import or dependency errors: ensure you’re in the correct directory and all dependencies are installed:
cd python-projects/credit-score/
uv sync

Dataset not found: pull the dataset using DVC:
cd datasets/credit_score_dataset/
dvc pull

Slow training or inference: the model runs on CPU by default. For GPU acceleration, install a CUDA build of PyTorch:
pip install torch --index-url https://download.pytorch.org/whl/cu118
Then modify inference/inference.py:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Model fails to load: check that the trained artifacts exist:
ls model/model_weights_001.pth
ls processing/preprocessor.joblib
If they are missing, run the training pipeline first.

Congratulations! You’ve successfully set up, trained, and deployed a production-ready credit scoring AI system.
