
Overview

This quickstart guide will help you set up the Credit Score AI Engine, train your first deep learning model for credit risk prediction, and deploy it as a production-ready API. The Credit Score AI Engine is built with PyTorch and demonstrates MLOps best practices for the complete AI lifecycle: from data preprocessing to model deployment.
What you’ll build: A credit risk prediction system that evaluates loan applicants and returns risk scores (good/bad) with confidence probabilities.

Prerequisites

Before you begin, ensure you have:
  • Python 3.10+ installed
  • Git for version control
  • UV package manager (recommended) or pip
  • Docker (optional, for containerized deployment)

Installation

1

Clone the repository

First, clone the data science services repository:
git clone <repository-url>
cd data_science_services/python-projects/credit-score
2

Install dependencies

Using UV (recommended for faster installation):
uv sync
Or using pip:
pip install -r requirements.txt
The project uses modern Python packaging with pyproject.toml and uv.lock for deterministic builds.
Key dependencies installed:
  • torch==2.10.0 - Deep learning framework
  • fastapi==0.128.0 - High-performance API framework
  • scikit-learn==1.7.2 - Data preprocessing
  • pydantic==2.12.5 - Data validation
  • mlflow - Experiment tracking
3

Start MLflow tracking server

Launch MLflow UI to monitor training metrics in real-time:
uv run mlflow ui
Access the dashboard at: http://127.0.0.1:5000
(Screenshot: MLflow Dashboard)

Environment Setup

Project Structure

Understand the key directories:
credit-score/
├── config/
│   └── models-configs/          # Model hyperparameters (YAML)
├── model/
│   └── model.py                 # PyTorch neural network architecture
├── processing/
│   └── preprocessor.py          # Data transformation pipeline
├── training/
│   └── training.py              # Training orchestration
├── inference/
│   └── inference.py             # Inference engine (singleton)
├── server/
│   ├── api.py                   # FastAPI REST endpoints
│   └── schemas.py               # Pydantic data contracts
└── examples/
    └── client_web/              # Demo web interface

Configuration Files

Model configurations are defined in YAML for experiment flexibility:
config/models-configs/model_config_001.yaml
hyperparameters:
  learning_rate: 0.001
  batch_size: 64
  epochs: 20
  dropout_rate: 0.2
  hidden_layers: [128, 64, 32]
  activation_functions: ["relu", "relu", "relu"]
Never hardcode hyperparameters in training code. Always use configuration files for reproducibility.

Training Your First Model

1

Prepare the dataset

The German Credit Risk dataset is managed with DVC (Data Version Control):
# Navigate to datasets directory
cd ../../../datasets/credit_score_dataset/

# Pull the latest dataset version
dvc pull german_credit_risk_v1.0.0_training_23012026.csv.dvc
The dataset includes features like:
  • Age, Sex, Job level
  • Housing status
  • Saving/Checking accounts
  • Credit amount and duration
  • Loan purpose
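These features mix numeric and categorical columns, so they must be scaled and encoded before training. A hedged sketch of a typical scikit-learn pipeline; the real logic lives in processing/preprocessor.py, and the column split below is illustrative:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative column split (the project's preprocessor may differ)
NUMERIC = ["Age", "Job", "Credit amount", "Duration"]
CATEGORICAL = ["Sex", "Housing", "Saving accounts", "Checking account", "Purpose"]

preprocessor = ColumnTransformer([
    # Scale numeric features to zero mean / unit variance
    ("num", StandardScaler(), NUMERIC),
    # One-hot encode categoricals; ignore unseen categories at inference time
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])

df = pd.DataFrame([
    {"Age": 35, "Sex": "male", "Job": 2, "Housing": "own",
     "Saving accounts": "little", "Checking account": "little",
     "Credit amount": 9055.0, "Duration": 36, "Purpose": "education"},
    {"Age": 22, "Sex": "female", "Job": 1, "Housing": "rent",
     "Saving accounts": "rich", "Checking account": "moderate",
     "Credit amount": 1200.0, "Duration": 12, "Purpose": "radio/TV"},
])
X = preprocessor.fit_transform(df)
print(X.shape)  # one row per applicant, one column per scaled/encoded feature
```

The fitted transformer is what gets persisted as preprocessor.joblib, so inference applies exactly the same encoding.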
2

Run the training pipeline

Navigate back to the project and execute training:
cd ../python-projects/credit-score/
uv run training/training.py --config config/models-configs/model_config_001.yaml
What happens during training:
# Load and preprocess data
df = load_data(dataset_path)
X_train, X_test, y_train, y_test = preprocess_data(df, save_path=preprocessor_path)

# Convert to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)

# Initialize model
model = CreditScoreModel(model_config)

# Training loop
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.AdamW(model.parameters(), lr=config["learning_rate"])

for epoch in range(epochs):
    model.train()
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
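Because training uses `BCEWithLogitsLoss`, the model emits raw logits, and evaluation needs an explicit sigmoid to recover probabilities. A minimal sketch of the evaluation step, with a plain `nn.Linear` standing in for `CreditScoreModel`:

```python
import torch
import torch.nn as nn

# Stand-in for the trained CreditScoreModel (4 input features, 1 logit out)
model = nn.Linear(4, 1)
model.eval()
with torch.no_grad():                      # no gradient tracking during evaluation
    logits = model(torch.randn(8, 4))      # raw scores, unbounded
    probs = torch.sigmoid(logits)          # P(good), squashed into [0, 1]
    preds = (probs >= 0.5).float()         # threshold at 0.5 for the class label
print(probs.shape)
```

Keeping the sigmoid out of the model's forward pass is intentional: `BCEWithLogitsLoss` combines sigmoid and loss in one numerically stable operation.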
3

Monitor training progress

Watch metrics in real-time via MLflow UI:
  • Train Loss & Accuracy - Per epoch
  • Test Metrics - ROC AUC, Precision, Recall, F1
  • Visualizations - Confusion matrix, ROC curve, Precision-Recall curve
All artifacts are automatically logged:
mlflow.log_metric("train_loss", epoch_loss, step=epoch)
mlflow.log_metric("test_roc_auc", roc_auc)
mlflow.log_figure(plt.gcf(), "confusion_matrix.png")
mlflow.log_artifact(model_save_path)
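The test metrics themselves can be computed with scikit-learn before being handed to MLflow. An illustrative sketch with made-up labels and probabilities:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_score, recall_score, f1_score

# Dummy ground truth and predicted P(good), for illustration only
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.45, 0.6, 0.4, 0.8, 0.3, 0.55])
y_pred = (y_prob >= 0.5).astype(int)   # same 0.5 threshold used at inference

metrics = {
    "test_roc_auc": roc_auc_score(y_true, y_prob),    # ranking metric: uses probabilities
    "test_precision": precision_score(y_true, y_pred),
    "test_recall": recall_score(y_true, y_pred),
    "test_f1": f1_score(y_true, y_pred),
}
# Each value could then be logged via mlflow.log_metric(name, value)
print(metrics)
```

Note that ROC AUC is computed from the probabilities, not the thresholded labels; thresholding first would discard the ranking information the metric measures.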
4

Verify model artifacts

After training completes, verify saved artifacts:
# Model weights
ls model/model_weights_001.pth

# Preprocessor (for inference)
ls processing/preprocessor.joblib

# MLflow experiments
ls mlruns/

Running Inference

Start the API Server

Launch the FastAPI inference server:
uv run uvicorn server.api:app --reload --port 8000
On startup, the API loads the trained model and preprocessor once via a singleton pattern, so every request reuses the same in-memory artifacts instead of paying the loading cost repeatedly.
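A minimal sketch of the singleton idea; the real implementation in inference/inference.py will differ, and the artifact loading is stubbed out here:

```python
class InferenceEngine:
    """Loads model artifacts once per process and reuses them across requests."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._load_artifacts()  # runs exactly once
        return cls._instance

    def _load_artifacts(self):
        # The real engine would load model_weights_001.pth and
        # preprocessor.joblib here; stubbed for illustration.
        self.model = object()


engine_a = InferenceEngine()
engine_b = InferenceEngine()
print(engine_a is engine_b)  # True: both names refer to the same loaded engine
```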

API Documentation

Access interactive Swagger docs at: http://localhost:8000/docs

Make Predictions

curl -X POST "http://localhost:8000/credit_score_prediction" \
  -H "Content-Type: application/json" \
  -d '{
    "Age": 35,
    "Sex": "male",
    "Job": "skilled",
    "Housing": "own",
    "Saving accounts": "NA",
    "Checking account": "little",
    "Credit amount": 9055.0,
    "Duration": 36,
    "Purpose": "education"
  }'
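The same request can be made from Python with the third-party `requests` library. The `predict` helper below is illustrative, and it assumes the server from the previous step is running on port 8000:

```python
import requests

API_URL = "http://localhost:8000/credit_score_prediction"


def predict(applicant: dict, url: str = API_URL) -> dict:
    """POST one applicant to the prediction endpoint and return the JSON body."""
    resp = requests.post(url, json=applicant, timeout=10)
    resp.raise_for_status()  # surface 4xx/5xx errors instead of parsing them
    return resp.json()


applicant = {
    "Age": 35, "Sex": "male", "Job": "skilled", "Housing": "own",
    "Saving accounts": "NA", "Checking account": "little",
    "Credit amount": 9055.0, "Duration": 36, "Purpose": "education",
}

# With the server running:
# result = predict(applicant)
# print(result["prediction"], result["probability"])
```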

API Response

{
  "prediction": "good",
  "probability": 0.8523
}
Response fields:
  • prediction: Credit risk assessment ("good" or "bad")
  • probability: Confidence score for "good" risk (0.0 to 1.0)
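Note that the JSON keys contain spaces ("Saving accounts", "Credit amount"), which Pydantic can map onto valid Python attribute names via field aliases. A hedged sketch of what the contracts in server/schemas.py might look like; the field names and types here are assumptions:

```python
from pydantic import BaseModel, Field


class CreditApplication(BaseModel):
    """Illustrative request contract; the real one lives in server/schemas.py."""
    age: int = Field(alias="Age")
    sex: str = Field(alias="Sex")
    job: str = Field(alias="Job")
    housing: str = Field(alias="Housing")
    saving_accounts: str = Field(alias="Saving accounts")     # alias handles the space
    checking_account: str = Field(alias="Checking account")
    credit_amount: float = Field(alias="Credit amount")
    duration: int = Field(alias="Duration")
    purpose: str = Field(alias="Purpose")


class PredictionResponse(BaseModel):
    prediction: str      # "good" or "bad"
    probability: float   # confidence for "good", 0.0 to 1.0


app = CreditApplication.model_validate({
    "Age": 35, "Sex": "male", "Job": "skilled", "Housing": "own",
    "Saving accounts": "NA", "Checking account": "little",
    "Credit amount": 9055.0, "Duration": 36, "Purpose": "education",
})
print(app.credit_amount)
```

FastAPI validates incoming JSON against the request model automatically, so malformed payloads are rejected with a 422 before they ever reach the model.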

Web Interface Demo

Launch the interactive web client for testing:
uv run uvicorn examples.client_web.main:app --reload --port 3000
Access at: http://localhost:3000
The web interface provides:
  • Form-based input for credit applications
  • Real-time predictions
  • Visual risk indicators

Docker Deployment (Production)

For production deployment with Docker:
1

Build and start services

docker-compose up --build
This orchestrates:
  • API Service (port 8000)
  • Web Client (port 3000)
2

Verify services

# Check running containers
docker-compose ps

# View API logs
docker-compose logs api
3

Access services

  • API: http://localhost:8000/docs
  • Web App: http://localhost:3000
Docker images use multi-stage builds for optimized production sizes.

Model Architecture Overview

The PyTorch model implements a deep neural network:
model/model.py (excerpt)
class CreditScoreModel(nn.Module):
    def __init__(self, config: ModelConfig):
        super().__init__()
        layers = []
        in_dim = config.input_dim  # number of preprocessed input features

        # Build hidden layers
        for hidden_dim, act_fn in zip(config.hidden_layers, config.activation_functions):
            layers.append(nn.Linear(in_dim, hidden_dim))
            layers.append(nn.BatchNorm1d(hidden_dim))  # Stabilizes training
            layers.append(get_activation(act_fn))      # ReLU/GELU/etc.
            layers.append(nn.Dropout(config.dropout_rate))  # Prevents overfitting
            in_dim = hidden_dim

        # Output layer (raw logit; paired with BCEWithLogitsLoss during training)
        layers.append(nn.Linear(in_dim, 1))
        self.model = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.model(x)
Key features:
  • Batch Normalization - Accelerates convergence
  • Dropout Regularization - Reduces overfitting
  • Configurable Activations - ReLU, GELU, LeakyReLU support
  • He/Xavier Initialization - Optimal weight initialization per activation type
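The activation-aware initialization can be sketched as follows: He (Kaiming) init for ReLU-family activations, Xavier (Glorot) otherwise. The helper name and dispatch rule below are illustrative, not the project's exact code:

```python
import torch.nn as nn


def init_weights(module: nn.Module, activation: str) -> None:
    """He init suits ReLU-family activations; Xavier suits tanh/sigmoid-like ones."""
    if not isinstance(module, nn.Linear):
        return
    if activation in ("relu", "leaky_relu", "gelu"):
        # Variance scaled for the dying-half behaviour of ReLU-like units
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
    else:
        # Variance balanced between fan-in and fan-out for symmetric activations
        nn.init.xavier_normal_(module.weight)
    nn.init.zeros_(module.bias)


layer = nn.Linear(128, 64)
init_weights(layer, "relu")
print(layer.bias.abs().sum().item())  # 0.0: biases start at zero
```

Matching the initialization scheme to the activation keeps pre-activation variance roughly constant across layers, which is what makes deeper stacks trainable from the first epoch.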

Next Steps

Model Configuration

Learn how to tune hyperparameters and experiment with different architectures

API Reference

Explore all available API endpoints and schemas

Production Deployment

Deploy to cloud platforms with best practices

MLOps Architecture

Understand the MLOps principles and architecture patterns

Troubleshooting

Import or dependency errors: ensure you’re in the correct directory and all dependencies are installed:
cd python-projects/credit-score/
uv sync

Dataset not found: pull the dataset using DVC:
cd datasets/credit_score_dataset/
dvc pull

Slow training or inference: the model runs on CPU by default. For GPU acceleration, install a CUDA build of PyTorch:
pip install torch --index-url https://download.pytorch.org/whl/cu118
Then modify inference/inference.py:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Model fails to load: check that the trained artifacts exist:
ls model/model_weights_001.pth
ls processing/preprocessor.joblib
If they are missing, run the training pipeline first.

Congratulations! You’ve successfully set up, trained, and deployed a production-ready credit scoring AI system.
