Best Model Performance
The best-performing model uses a GraphSAGE-LSTM architecture with TopK pooling and focal loss, achieving the following results on the ADNI test set:

| Metric | Score |
|---|---|
| Test Accuracy | 82.9% |
| Balanced Accuracy | 77.1% |
| AUC-ROC | 85.4% |
| Minority F1 (Converters) | Reported in detailed metrics |
Architecture Configuration
The best model configuration is as follows.

Graph Neural Network
| Component | Value |
|---|---|
| Layer Type | GraphSAGE |
| Hidden Dimension | 256 |
| Number of Layers | 2 |
| Activation Function | ELU |
| Pooling Strategy | TopK (ratio=0.3) |
| Output Dimension | 256 |
| Dropout | 0.2 |
Temporal Model
| Component | Value |
|---|---|
| Model Type | LSTM |
| Hidden Dimension | 64 |
| Number of Layers | 1 |
| Bidirectional | Yes |
| Dropout | 0.45 |
| Max Visits | 10 |
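The two tables above can be tied together in a minimal PyTorch sketch. This is illustrative only, not the project's actual code: the SAGE layer below uses simple mean aggregation over a dense adjacency matrix (in practice PyTorch Geometric's `SAGEConv` and `TopKPooling` would be used), and the readout and two-class classifier head are assumptions.

```python
import torch
import torch.nn as nn

class SAGELayer(nn.Module):
    """Minimal GraphSAGE layer with mean aggregation (stand-in for SAGEConv)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, out_dim)
        self.lin_neigh = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # adj: dense [N, N] adjacency; mean-aggregate neighbour features
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ x / deg
        return self.lin_self(x) + self.lin_neigh(neigh)

class GraphSAGELSTM(nn.Module):
    """Sketch: per-visit GraphSAGE encoder + bidirectional LSTM over visits."""
    def __init__(self, in_dim, gnn_dim=256, lstm_dim=64, topk_ratio=0.3):
        super().__init__()
        self.sage1 = SAGELayer(in_dim, gnn_dim)
        self.sage2 = SAGELayer(gnn_dim, gnn_dim)
        self.act = nn.ELU()
        self.drop = nn.Dropout(0.2)
        self.topk_ratio = topk_ratio
        self.score = nn.Linear(gnn_dim, 1)       # learned TopK pooling scores
        self.lstm = nn.LSTM(gnn_dim, lstm_dim, num_layers=1,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * lstm_dim, 2)   # bidirectional -> 2 * hidden

    def encode_graph(self, x, adj):
        h = self.drop(self.act(self.sage1(x, adj)))
        h = self.drop(self.act(self.sage2(h, adj)))
        # TopK pooling: keep the top 30% of nodes by score, then mean-readout
        k = max(1, int(self.topk_ratio * h.size(0)))
        scores = self.score(h).squeeze(-1)
        idx = scores.topk(k).indices
        return (h[idx] * torch.sigmoid(scores[idx]).unsqueeze(-1)).mean(dim=0)

    def forward(self, visit_graphs):
        # visit_graphs: list of (x [N, F], adj [N, N]), up to max_visits = 10
        seq = torch.stack([self.encode_graph(x, a) for x, a in visit_graphs])
        out, _ = self.lstm(seq.unsqueeze(0))     # [1, T, 2 * lstm_dim]
        return self.head(out[0, -1])             # logits from the last visit
```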
Training Configuration
| Parameter | Value |
|---|---|
| Loss Function | Focal Loss |
| Focal Alpha | 0.90 |
| Focal Gamma | 3.0 |
| Label Smoothing | 0.05 |
| Learning Rate | 0.001 |
| Batch Size | 16 subjects |
| Epochs | 100 |
| Optimizer | Adam (β₁=0.9, β₂=0.999) |
| Weight Decay | 1e-4 |
| Gradient Clipping | max_norm=1.0 |
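The optimizer and gradient-clipping settings in the table can be wired together as below; `make_optimizer` and `train_step` are hypothetical helper names, not from the source.

```python
import torch

def make_optimizer(model):
    # Adam with the table's betas, learning rate, and weight decay
    return torch.optim.Adam(model.parameters(), lr=1e-3,
                            betas=(0.9, 0.999), weight_decay=1e-4)

def train_step(model, optimizer, loss_fn, batch, targets):
    """One training step with gradient clipping at max_norm=1.0."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch), targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```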
Training Strategy
Minority Class Handling
To address severe class imbalance (converters are the minority class), the model employs multiple strategies:
- Focal Loss: down-weights easy examples and focuses on hard cases
- Minority Class Forcing: an additional loss term applied for the first 20 epochs
- Stratified Cross-Validation: ensures a balanced class distribution across folds
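The focal loss with label smoothing can be sketched as follows, using the hyper-parameters from the training table (α=0.90 on the converter class, γ=3.0, smoothing 0.05). This is an illustrative implementation, not the project's exact code.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.90, gamma=3.0, smoothing=0.05):
    """Focal loss with label smoothing for binary conversion prediction.

    alpha weights the converter class (label 1); gamma down-weights easy
    examples so training focuses on hard cases.
    """
    n_classes = logits.size(-1)
    # label smoothing: soften the one-hot targets
    soft = torch.full_like(logits, smoothing / (n_classes - 1))
    soft.scatter_(1, targets.unsqueeze(1), 1.0 - smoothing)
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp()
    # per-example alpha: converters get alpha, stable subjects 1 - alpha
    t = targets.float().unsqueeze(1)
    alpha_t = t * alpha + (1.0 - t) * (1.0 - alpha)
    loss = soft * alpha_t * (1.0 - p) ** gamma * (-log_p)
    return loss.sum(dim=1).mean()
```

An easy, confidently correct example should contribute much less loss than a confidently wrong one, which is the point of the focusing term.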
Learning Rate Scheduling
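The scheduler reduces the learning rate when validation balanced accuracy stops improving. A minimal sketch; the `factor` and `patience` values are assumptions, not taken from the source.

```python
import torch

model = torch.nn.Linear(4, 2)   # placeholder model for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# mode="max" because balanced accuracy should increase;
# factor=0.5 and patience=5 are assumed values
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=5)

# inside the validation loop:
# scheduler.step(val_balanced_acc)
```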
A ReduceLROnPlateau scheduler monitors validation balanced accuracy.

Model Selection
The best model is selected based on validation AUC; an epoch must produce more than one unique predicted class (`unique_preds > 1`) to be considered valid.
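The selection rule can be sketched as a small update helper; `update_best` is a hypothetical name, and the validity check mirrors the `unique_preds > 1` condition (epochs where the model collapses to a single class are skipped).

```python
import numpy as np

def update_best(best, val_auc, val_preds, model_state):
    """Keep the checkpoint with the highest validation AUC, but only when
    the epoch is valid: predictions must contain more than one unique class."""
    if np.unique(val_preds).size <= 1:   # degenerate epoch, skip
        return best
    best_auc, _ = best
    if val_auc > best_auc:
        return (val_auc, model_state)
    return best
```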
Cross-Validation Results
Results come from 5-fold stratified cross-validation with an 80/20 train-val split.

Tracked Metrics
Across all folds, the following metrics are collected:

| Metric | Description |
|---|---|
| test_acc | Test set overall accuracy |
| balanced_acc | Test set balanced accuracy |
| minority_f1 | Test set F1 for converter class |
| test_auc | Test set AUC-ROC |
| train_acc | Training set overall accuracy |
| balanced_train_acc | Training set balanced accuracy |
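The test-set metrics above map directly onto standard scikit-learn calls; a sketch, assuming the converter class is encoded as label 1.

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, roc_auc_score)

def fold_metrics(y_true, y_pred, y_prob):
    """Compute the tracked test-set metrics for one fold."""
    return {
        "test_acc": accuracy_score(y_true, y_pred),
        "balanced_acc": balanced_accuracy_score(y_true, y_pred),
        # converter class assumed to be the positive label (1)
        "minority_f1": f1_score(y_true, y_pred, pos_label=1),
        "test_auc": roc_auc_score(y_true, y_prob),
    }
```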
Data Split Strategy
Subject-Level Splitting
Data is split at the subject level (not the visit level) to prevent data leakage.

Stratification
Subjects are stratified by their conversion label to maintain class balance.

Train-Val-Test Split
Each fold:
- 64% training
- 16% validation
- 20% test
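The subject-level, stratified 64/16/20 split can be sketched with scikit-learn: each of the 5 test folds holds 20% of subjects, and the remaining 80% is split 80/20 into train and validation. Function name and seed are assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def subject_splits(subject_ids, labels, n_folds=5, seed=42):
    """Yield (train, val, test) subject IDs per fold; splitting at the
    subject level keeps all visits of a subject in the same partition."""
    subject_ids = np.asarray(subject_ids)
    labels = np.asarray(labels)
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for trainval_idx, test_idx in skf.split(subject_ids, labels):
        # 80/20 split of the non-test subjects -> 64/16 overall
        train_idx, val_idx = train_test_split(
            trainval_idx, test_size=0.2, random_state=seed,
            stratify=labels[trainval_idx])
        yield subject_ids[train_idx], subject_ids[val_idx], subject_ids[test_idx]
```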
Reproducibility
All random seeds are set for reproducibility:
- Before model creation
- Before each fold
- Before data loaders
- Before classifier initialization
- Before optimizer creation
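A typical seeding helper that would be called at each of the points above; the function name and default seed are assumptions.

```python
import random
import numpy as np
import torch

def set_seed(seed=42):
    """Seed every RNG the pipeline touches (Python, NumPy, PyTorch)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable
```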
Early Stopping
Patience-based early stopping monitors the validation AUC.

Model Checkpointing
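Early stopping and per-fold checkpointing naturally share one helper, since the best state is captured whenever validation AUC improves. A sketch; the patience value is an assumption.

```python
import copy

class EarlyStopping:
    """Stop after `patience` epochs without a validation-AUC improvement,
    keeping a copy of the best model state for the fold."""
    def __init__(self, patience=15):
        self.patience = patience
        self.best_auc = -float("inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, val_auc, model_state):
        if val_auc > self.best_auc:
            self.best_auc = val_auc
            self.best_state = copy.deepcopy(model_state)
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience   # True -> stop training
```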
The best model state is saved for each fold.

Per-Fold Evaluation
Each fold reports the tracked metrics listed above.

Conversion-Specific Performance
After all folds, an aggregated conversion analysis shows per-group accuracy.

Command to Reproduce
To reproduce the best results:

Comparison with Baselines
The spatiotemporal approach outperforms:
- Static GNN: Using only baseline visit (no temporal information)
- Simple RNN on tabular features: Without graph structure
- Traditional ML: SVM, Random Forest on hand-crafted features