Overview
GRUPredictor is a GRU-based model for processing temporal sequences of graph embeddings. It provides a lighter alternative to LSTM for temporal classification tasks.
Class Signature
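The signature block itself is not shown on this page; below is a hedged reconstruction based on the parameters documented here (names are inferred from the shapes and notes elsewhere on this page, and all defaults are assumptions):

```python
import torch.nn as nn

class GRUPredictor(nn.Module):
    # Sketch of the constructor; defaults are assumptions, not the
    # project's actual values.
    def __init__(self, graph_emb_dim: int, tab_emb_dim: int = 0,
                 hidden_dim: int = 64, num_layers: int = 1,
                 dropout: float = 0.0, bidirectional: bool = False,
                 num_classes: int = 2):
        super().__init__()
        input_dim = graph_emb_dim + tab_emb_dim  # fused input size
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers=num_layers,
                          batch_first=True,
                          dropout=dropout if num_layers > 1 else 0.0,
                          bidirectional=bidirectional)
        self.classifier = nn.Linear(
            hidden_dim * (2 if bidirectional else 1), num_classes)
```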
Parameters
- graph_emb_dim: Dimension of graph embeddings from the GNN encoder
- tab_emb_dim: Dimension of tabular embeddings. Set to 0 for graph-only models
- hidden_dim: GRU hidden state dimension
- num_layers: Number of stacked GRU layers
- dropout: Dropout probability (only applied if num_layers > 1)
- bidirectional: Whether to use a bidirectional GRU
- num_classes: Number of output classes
Forward Method
Parameters
- Graph embedding sequence of shape [batch_size, max_seq_len, graph_emb_dim]
- Optional tabular embedding sequence of shape [batch_size, max_seq_len, tab_emb_dim]
- True sequence lengths of shape [batch_size]; used for packed sequences
- Attention mask of shape [batch_size, max_seq_len]

Returns

Classification logits of shape [batch_size, num_classes]

Architecture Details
Input Fusion
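The fusion code is not reproduced here; a minimal sketch, assuming the graph and tabular embeddings are concatenated along the feature dimension (tensor names and sizes are illustrative):

```python
import torch

# Illustrative shapes: [batch_size, max_seq_len, emb_dim]
graph_seq = torch.randn(4, 6, 64)   # graph embeddings from the GNN encoder
tab_seq = torch.randn(4, 6, 16)     # optional tabular embeddings

# Concatenate along the feature dimension; with tab_emb_dim == 0
# (graph-only models) the GRU input is just graph_seq.
fused = torch.cat([graph_seq, tab_seq], dim=-1)
print(fused.shape)  # torch.Size([4, 6, 80])
```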
GRU Processing
With packed sequences, variable-length inputs are packed (using the true lengths) before the GRU so that padded time steps are skipped.

Hidden State Extraction
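A sketch of the packed-sequence GRU pass and final-hidden-state extraction described above, assuming batch_first=True (variable names and dimensions are illustrative):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

batch, max_len, in_dim, hidden_dim = 4, 6, 80, 32
x = torch.randn(batch, max_len, in_dim)   # fused input sequence
lengths = torch.tensor([6, 5, 3, 2])      # true sequence lengths

gru = nn.GRU(in_dim, hidden_dim, num_layers=1, batch_first=True)

# Pack so padded steps are skipped by the GRU.
packed = pack_padded_sequence(x, lengths, batch_first=True,
                              enforce_sorted=False)
_, h_n = gru(packed)        # h_n: [num_layers, batch, hidden_dim]

# Final hidden state of the last layer, one vector per sequence.
last_hidden = h_n[-1]       # [batch, hidden_dim]
print(last_hidden.shape)    # torch.Size([4, 32])
```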
Bidirectional: the final forward and backward hidden states of the last layer are concatenated, giving an output dimension of hidden_dim * 2.

Classification Head
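A hedged sketch of the classification head, including the bidirectional concatenation described above (sizes are illustrative):

```python
import torch
import torch.nn as nn

batch, hidden_dim, num_classes = 4, 32, 3
bidirectional = True

# h_n from a single-layer bidirectional GRU: [num_layers * 2, batch, hidden_dim]
h_n = torch.randn(2, batch, hidden_dim)

if bidirectional:
    # Concatenate last-layer forward (h_n[-2]) and backward (h_n[-1]) states.
    last_hidden = torch.cat([h_n[-2], h_n[-1]], dim=-1)  # [batch, hidden_dim * 2]
else:
    last_hidden = h_n[-1]

classifier = nn.Linear(hidden_dim * (2 if bidirectional else 1), num_classes)
logits = classifier(last_hidden)  # [batch, num_classes]
print(logits.shape)               # torch.Size([4, 3])
```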
Example Usage
From main.py:267:

Training Example
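The snippets from main.py are not reproduced here; below is a representative sketch of instantiation and one training step, with GRUPredictor stubbed to match the interface documented above (all names, defaults, and dimensions are illustrative):

```python
import torch
import torch.nn as nn

# Minimal stand-in with the documented interface; the real class lives in
# the project source.
class GRUPredictor(nn.Module):
    def __init__(self, graph_emb_dim, tab_emb_dim=0, hidden_dim=32,
                 num_layers=1, dropout=0.0, bidirectional=False,
                 num_classes=2):
        super().__init__()
        self.gru = nn.GRU(graph_emb_dim + tab_emb_dim, hidden_dim,
                          num_layers=num_layers, batch_first=True,
                          dropout=dropout if num_layers > 1 else 0.0,
                          bidirectional=bidirectional)
        self.head = nn.Linear(hidden_dim * (2 if bidirectional else 1),
                              num_classes)

    def forward(self, graph_seq, tab_seq=None):
        x = graph_seq if tab_seq is None else torch.cat([graph_seq, tab_seq], dim=-1)
        _, h_n = self.gru(x)
        if self.gru.bidirectional:
            h = torch.cat([h_n[-2], h_n[-1]], dim=-1)  # fwd + bwd final states
        else:
            h = h_n[-1]                                # last layer's final state
        return self.head(h)

model = GRUPredictor(graph_emb_dim=64, num_classes=3)
graph_seq = torch.randn(8, 6, 64)   # [batch_size, max_seq_len, graph_emb_dim]
labels = torch.randint(0, 3, (8,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
logits = model(graph_seq)           # [batch_size, num_classes]
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```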
GRU vs LSTM
Advantages of GRU
- Fewer parameters (no separate cell state)
- Faster training and inference
- Less prone to overfitting on small datasets
- Simpler architecture
When to Use GRU
- Smaller datasets with limited temporal patterns
- When training speed is critical
- When LSTM overfits
- Shorter sequences (< 10 time steps)
Notes
- GRU has no cell state, only hidden state (unlike LSTM)
- GRU output dimension: hidden_dim * (2 if bidirectional else 1)
- Compatible with packed sequences for variable-length inputs
- Dropout only applied between layers when num_layers > 1
- Typically trained with bidirectional=False in the STGNN framework