Quick start
This guide will help you set up your data and run your first STGNN training session.
Prerequisites
Before you begin, ensure you have:
- Completed the installation steps
- Access to ADNI fMRI data or similar functional connectivity data
- At least 16GB RAM (32GB recommended for larger datasets)
Data setup
STGNN requires functional connectivity matrices and patient labels organized in a specific structure.
Organize FC matrices
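As a sketch of the expected on-disk format: the file name and matrix size below are illustrative only (the project's exact naming convention is not shown here), but each .npz file must expose its matrix under the key fc_matrix.

```python
import os
import numpy as np

# Hypothetical 116-ROI parcellation (e.g., AAL-sized); use your atlas's size.
rng = np.random.default_rng(0)
fc = np.corrcoef(rng.random((116, 200)))  # placeholder ROI-by-ROI FC matrix

os.makedirs("data/FC_Matrices", exist_ok=True)
# File name is illustrative, not the project's required naming scheme.
np.savez("data/FC_Matrices/sub-123456_visit-1.npz", fc_matrix=fc)
```

Loading the file back with np.load and indexing ["fc_matrix"] should return the square connectivity matrix.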
Place your functional connectivity (FC) matrices in the data/FC_Matrices/ directory. Each .npz file should contain a key named fc_matrix holding the connectivity matrix.
Prepare patient labels
Create the temporal labels file: data/TADPOLE_TEMPORAL.csv
Required columns:
- Subject: Patient identifier (e.g., “sub-123456”)
- Visit: Visit identifier or code
- Label_CS_Num: Binary label (0=Stable, 1=Converter)
- Visit_Order: Sequential visit number (1, 2, 3, …)
- Months_From_Baseline: Months since first visit
- Months_To_Next_Original: Months until next visit (-1 if last visit)
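A minimal sketch of a conforming labels file, written with the standard library (the subject ID, visit codes, and values below are made up for illustration):

```python
import csv
import os

# Illustrative rows for one patient with two visits; values are placeholders.
rows = [
    {"Subject": "sub-123456", "Visit": "bl", "Label_CS_Num": 0,
     "Visit_Order": 1, "Months_From_Baseline": 0, "Months_To_Next_Original": 6},
    {"Subject": "sub-123456", "Visit": "m06", "Label_CS_Num": 1,
     "Visit_Order": 2, "Months_From_Baseline": 6, "Months_To_Next_Original": -1},
]

os.makedirs("data", exist_ok=True)
with open("data/TADPOLE_TEMPORAL.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
```

Note that the last visit of each subject carries Months_To_Next_Original = -1.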
Generate temporal data (optional)
If you have the original TADPOLE data files, generate the temporal dataset. The generation step:
- Loads TADPOLE_COMPLETE.csv and TADPOLE_Simplified.csv
- Calculates temporal gaps between visits
- Creates sequential visit orders
- Generates TADPOLE_TEMPORAL.csv with all required columns
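The gap and visit-order logic above can be sketched in plain Python (a simplified stand-in for the real generation step, assuming each subject's visit times in months have already been extracted from the TADPOLE CSVs):

```python
def build_temporal_fields(visit_months):
    """Given one subject's visit times in months, return per-visit
    Visit_Order, Months_From_Baseline, and Months_To_Next_Original."""
    months = sorted(visit_months)
    baseline = months[0]
    records = []
    for i, m in enumerate(months):
        # -1 marks the subject's last visit (no next visit exists).
        gap = months[i + 1] - m if i + 1 < len(months) else -1
        records.append({
            "Visit_Order": i + 1,
            "Months_From_Baseline": m - baseline,
            "Months_To_Next_Original": gap,
        })
    return records

# Visits at months 0, 6, and 18 yield orders 1-3 and gaps 6, 12, -1.
print(build_temporal_fields([0, 6, 18]))
```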
Run your first training
Now you’re ready to train your first model!
Basic training
Run with default settings (GraphSAGE-LSTM with TopK pooling and focal loss). The default configuration uses:
- GNN: GraphSAGE with 2 layers, 256 hidden dimensions
- Temporal model: Bidirectional LSTM with 64 hidden dimensions
- Pooling: TopK pooling (ratio=0.3)
- Training: 100 epochs, batch size 16, 5-fold cross-validation
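TopK pooling with ratio=0.3 keeps only the top ⌈0.3·N⌉ nodes of each graph, ranked by a learned score. A minimal sketch of the selection step (the scores here are placeholders for the learned projection the real layer computes):

```python
import math

def topk_select(scores, ratio=0.3):
    """Return the indices of the top ceil(ratio * N) nodes by score."""
    k = max(1, math.ceil(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])  # keep original node order in the pooled graph

# With 10 nodes and ratio=0.3, 3 nodes survive pooling.
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.05, 0.7, 0.4, 0.15, 0.6]
print(topk_select(scores))  # [1, 3, 6]
```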
Training with custom parameters
Customize the architecture and training parameters as needed.
Time-aware prediction (experimental)
Enable temporal gap features for time-aware prediction.
Understanding the output
During training, you’ll see progress output reporting several evaluation metrics.
Key metrics explained
- Test Accuracy: Overall classification accuracy
- Balanced Accuracy: Average of per-class recall (accounts for class imbalance)
- Minority F1: F1 score for the converter class (minority class)
- AUC: Area under the ROC curve (model’s ability to discriminate between classes)
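To make these definitions concrete, here is how the first three metrics work out for a toy set of binary predictions (pure-Python stand-ins for the usual library calls; AUC is omitted because it needs predicted probabilities rather than hard labels):

```python
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]   # 1 = converter (minority class)
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

accuracy = (tp + tn) / len(y_true)                  # 0.80
recall_pos = tp / (tp + fn)                         # converter recall: 0.75
recall_neg = tn / (tn + fp)                         # stable recall: ~0.833
balanced_accuracy = (recall_pos + recall_neg) / 2   # ~0.792
precision_pos = tp / (tp + fp)
minority_f1 = 2 * precision_pos * recall_pos / (precision_pos + recall_pos)  # 0.75
```

Note how balanced accuracy (≈0.79) sits below plain accuracy (0.80) here because the minority class is recalled less perfectly, which is exactly why it is reported for imbalanced converter/stable data.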
Final cross-validation results
After all folds complete, you’ll see the aggregated results across folds.
Saved models
Trained models are saved in the model/ directory:
- Trained encoder (GNN) state dict
- Trained classifier (LSTM/GRU/RNN) state dict
- Best epoch number
- Validation metrics
Common configuration examples
Best performance (default)
GraphSAGE-LSTM configuration (82.9% test accuracy).
Fast training
Quick experimentation with a smaller model.
Memory-efficient
A reduced configuration for systems with limited GPU memory.
Next steps
Core concepts
Learn about the architecture and methodology
Configuration reference
Explore all available configuration options
Model architecture
Understand the GNN and temporal components
Advanced usage
Pretraining, transfer learning, and hyperparameter tuning