Overview
STGNN supports two pretraining strategies for the GNN encoder:

- Supervised pretraining (recommended) - Uses graph classification with labels
- Self-supervised pretraining (GraphCL-style) - Uses contrastive learning without labels
Supervised Pretraining (Recommended)
Supervised pretraining trains the GNN encoder on graph-level classification, which is more stable and avoids representation collapse compared to contrastive methods.

Running Supervised Pretraining

The script will:
- Load the FC graph dataset
- Split into train/validation (80/20 stratified split)
- Train a GNN encoder with a classification head for 100 epochs
- Save the pretrained encoder to ./model/pretrained_gnn_encoder.pth

See supervised_pretrain.py.
Architecture
The supervised pretraining pairs the GNN encoder with a graph-level classification head; see supervised_pretrain.py:157-177 for the configuration.
Training Configuration
supervised_pretrain.py:56-62
Training Process
supervised_pretrain.py:33-138
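As a minimal sketch of the training process above, with a stand-in MLP encoder and toy data in place of the project's GraphNeuralNetwork and FC-graph dataset (the hyperparameters, shapes, and output path here are illustrative, not the values in supervised_pretrain.py):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in encoder; the real project uses the GraphNeuralNetwork from model.py.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
head = nn.Linear(32, 2)  # graph-level classification head (2 classes)

opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy graph-level features and labels standing in for pooled FC-graph embeddings.
x = torch.randn(64, 16)
y = torch.randint(0, 2, (64,))

best_acc = 0.0
for epoch in range(20):
    opt.zero_grad()
    logits = head(encoder(x))
    loss = loss_fn(logits, y)
    loss.backward()
    opt.step()
    acc = (logits.argmax(dim=1) == y).float().mean().item()
    if acc > best_acc:  # keep the best encoder checkpoint
        best_acc = acc
        torch.save(encoder.state_dict(), "pretrained_gnn_encoder.pth")
```

Only the encoder's weights are saved; the classification head is discarded after pretraining, since the downstream temporal model attaches its own classifier.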
Expected Performance
Typical supervised pretraining results:

- Training accuracy: 70-85%
- Validation accuracy: 65-80%
- Converges in 30-50 epochs
- Best model selected by validation accuracy
Saved Model
The pretrained encoder is saved to ./model/pretrained_gnn_encoder.pth.

Self-Supervised Pretraining (GraphCL)
GraphCL-style contrastive pretraining creates augmented views of graphs and learns representations by maximizing agreement between augmentations.

Enable in Training

See dfc_main.py:40-43 for the arguments.
Graph Augmentation
Two augmentation strategies are implemented:

1. Drop Node
- Randomly drops 20% of nodes
- Updates edge indices accordingly
- Preserves the structure among the remaining nodes
2. Drop Edge
- Randomly drops 20% of edges
- Simpler than node dropping
- Creates sparser connectivity
dfc_main.py:92-123
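A minimal sketch of the two augmentations on a PyG-style `edge_index` tensor (the function names, drop rate, and reindexing details here are illustrative; the project's versions live in dfc_main.py:92-123):

```python
import torch

torch.manual_seed(0)

def drop_node(x, edge_index, drop_rate=0.2):
    """Randomly drop nodes, keep edges whose endpoints both survive, reindex."""
    num_nodes = x.size(0)
    keep = torch.rand(num_nodes) >= drop_rate           # mask of surviving nodes
    remap = torch.full((num_nodes,), -1, dtype=torch.long)
    remap[keep] = torch.arange(int(keep.sum()))         # old id -> new id
    src, dst = edge_index
    edge_keep = keep[src] & keep[dst]                   # both endpoints survive
    return x[keep], remap[edge_index[:, edge_keep]]

def drop_edge(x, edge_index, drop_rate=0.2):
    """Randomly drop edges; node features are untouched."""
    edge_keep = torch.rand(edge_index.size(1)) >= drop_rate
    return x, edge_index[:, edge_keep]

x = torch.randn(10, 4)                                  # 10 nodes, 4 features
edge_index = torch.randint(0, 10, (2, 30))              # 30 random edges
x1, ei1 = drop_node(x, edge_index)
x2, ei2 = drop_edge(x, edge_index)
```

Note that drop-node must also remove edges touching a dropped node and renumber the survivors, which is why it is the more involved of the two.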
Contrastive Loss (NT-Xent)
- Temperature parameter: 0.5 (controls softness of softmax)
- Maximizes similarity between augmented views of same graph
- Minimizes similarity between different graphs
dfc_main.py:125-138
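The NT-Xent loss can be sketched as follows (a generic formulation under the stated temperature of 0.5, not the project's exact implementation):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent: the two augmented views of each graph are positives;
    all other graphs in the batch serve as negatives."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, d), unit norm
    sim = z @ z.t() / temperature                         # scaled cosine similarity
    sim.fill_diagonal_(float('-inf'))                     # exclude self-similarity
    # the positive for sample i is i+n (and vice versa)
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1 = torch.randn(8, 32)   # embeddings of augmented view 1
z2 = torch.randn(8, 32)   # embeddings of augmented view 2
loss = nt_xent(z1, z2)
```

Lowering the temperature sharpens the softmax, penalizing hard negatives more heavily; 0.5 is the common GraphCL default.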
Pretraining Loop
dfc_main.py:141-179
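The shape of the pretraining loop, reduced to its essentials (feature dropout stands in for the node/edge augmentations, and the loss is a simplified InfoNCE with only cross-view negatives; all names and values are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def augment(x):
    # stand-in augmentation: feature dropout instead of node/edge dropping
    return F.dropout(x, p=0.2, training=True)

x = torch.randn(32, 16)  # toy batch of pooled graph features

for epoch in range(5):
    z1 = F.normalize(encoder(augment(x)), dim=1)   # view 1 embeddings
    z2 = F.normalize(encoder(augment(x)), dim=1)   # view 2 embeddings
    sim = z1 @ z2.t() / 0.5                        # temperature 0.5
    targets = torch.arange(x.size(0))              # matched views are positives
    loss = F.cross_entropy(sim, targets)           # simplified InfoNCE
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Each step thus runs two forward passes (one per augmented view), which is the source of the roughly 2x training cost noted in the comparison table below.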
Using Pretrained Encoders
Loading in Training Scripts
Both main.py and dfc_main.py automatically load pretrained encoders: see main.py:184-188 for static FC and dfc_main.py:267-283 for dynamic FC.
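The loading pattern looks roughly like this (the `Model` class and checkpoint path are illustrative stand-ins for the project's models and ./model/pretrained_gnn_encoder.pth):

```python
import torch
import torch.nn as nn

# Stand-in model with an `encoder` submodule, mirroring the project's pattern.
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
        self.classifier = nn.Linear(32, 2)

model = Model()
# Pretend this file was produced by the pretraining step.
torch.save(model.encoder.state_dict(), "pretrained_gnn_encoder.pth")

state = torch.load("pretrained_gnn_encoder.pth", map_location="cpu")
# strict=False tolerates head/architecture differences and reports them.
missing, unexpected = model.encoder.load_state_dict(state, strict=False)
```

`map_location="cpu"` keeps the load device-agnostic, and the returned `missing`/`unexpected` key lists are worth logging to confirm the encoder weights actually matched.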
Freezing Encoder Weights
Optionally freeze the encoder during temporal training:

- Only the temporal classifier (LSTM/GRU/RNN) is trained
- Encoder serves as fixed feature extractor
- Faster training, but may sacrifice adaptability
main.py:230-232 and dfc_main.py:335-338.
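Freezing follows the standard PyTorch idiom (the model below is a toy stand-in for the project's encoder-plus-temporal-classifier architecture):

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
        self.temporal = nn.LSTM(32, 16, batch_first=True)
        self.head = nn.Linear(16, 2)

model = Model()

# Freeze the encoder so it acts as a fixed feature extractor.
for p in model.encoder.parameters():
    p.requires_grad = False

# Pass only the still-trainable parameters to the optimizer.
opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
```

Filtering the optimizer's parameter list also saves the memory that Adam would otherwise spend on moment estimates for the frozen weights.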
Flexible State Dict Loading
The load_state_dict_flexible() method handles architecture mismatches such as:
- Loading pretrained weights into models with different heads
- Partial weight initialization
- Architecture modifications after pretraining
main.py:592 (encoder method)
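A sketch of the flexible-loading idea, copying only parameters whose name and shape match (this is an illustration of the technique, not the project's exact method at main.py:592):

```python
import torch
import torch.nn as nn

def load_state_dict_flexible(model, state):
    """Copy only parameters whose name and shape match; return skipped keys."""
    own = model.state_dict()
    compatible = {k: v for k, v in state.items()
                  if k in own and own[k].shape == v.shape}
    own.update(compatible)
    model.load_state_dict(own)
    return sorted(set(state) - set(compatible))

# Pretrained model with a 2-class head; new model with a 3-class head.
old = nn.Sequential(nn.Linear(16, 32), nn.Linear(32, 3 - 1))
new = nn.Sequential(nn.Linear(16, 32), nn.Linear(32, 3))
skipped = load_state_dict_flexible(new, old.state_dict())
```

Here the first layer's weights transfer while the mismatched head is left at its fresh initialization, which is exactly the "different heads" case listed above.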
Pretraining Comparison
| Aspect | Supervised | Self-Supervised (GraphCL) |
|---|---|---|
| Requires labels | Yes | No |
| Stability | High | Medium (can collapse) |
| Training time | Faster | Slower (2x augmentations) |
| Performance | Better for classification | Better for general features |
| Recommended for | STGNN (we have labels) | Unlabeled graph data |
Best Practices
- Use supervised pretraining for STGNN - it’s more stable and performs better
- Run pretraining once using supervised_pretrain.py, then load the weights for all experiments
- Don’t freeze the encoder unless you have very limited training data
- Use same architecture for pretraining and downstream tasks (hidden_dim, num_layers, etc.)
- Monitor validation accuracy during supervised pretraining - stop early if overfitting
Reproducibility
All pretraining methods set random seeds for reproducibility:

- Model initialization
- Data loading
- Training loops
supervised_pretrain.py:23-31 and dfc_main.py:58-66.
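The seeding boilerplate typically looks like this (the helper name and default seed are illustrative):

```python
import random

import numpy as np
import torch

def set_seed(seed=42):
    """Seed the Python, NumPy, and PyTorch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines

# Re-seeding reproduces the same random draws.
set_seed(42)
a = torch.randn(3)
set_seed(42)
b = torch.randn(3)
```

For bit-exact runs on GPU you may additionally need deterministic cuDNN settings, at some cost in speed.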
Implementation Files
- supervised_pretrain.py: Standalone supervised pretraining script
- dfc_main.py:89-179: GraphCL self-supervised pretraining
- main.py:184-188: Loading pretrained weights for static FC
- dfc_main.py:267-283: Loading pretrained weights for DFC
- model.py: GraphNeuralNetwork encoder architecture