## Overview
RNNPredictor is a vanilla RNN-based model for temporal sequence processing. It provides the simplest recurrent architecture for baseline comparisons.
## Class Signature
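The signature itself is not reproduced in this excerpt; the following is a sketch reconstructed from the parameter table below. Parameter names are taken from the rest of the document where possible (`graph_emb_dim`, `hidden_dim`, `num_layers`, `bidirectional`, `num_classes`); the default values are illustrative assumptions, not the project's actual defaults.

```python
import torch.nn as nn

class RNNPredictor(nn.Module):
    # Constructor sketch inferred from the parameter table; defaults are illustrative.
    def __init__(self, graph_emb_dim: int, tab_emb_dim: int = 0,
                 hidden_dim: int = 64, num_layers: int = 1,
                 dropout: float = 0.0, bidirectional: bool = False,
                 num_classes: int = 2):
        super().__init__()
        input_dim = graph_emb_dim + tab_emb_dim  # graph + tabular features fused
        self.rnn = nn.RNN(input_dim, hidden_dim, num_layers=num_layers,
                          batch_first=True,
                          dropout=dropout if num_layers > 1 else 0.0,
                          bidirectional=bidirectional)
        out_dim = hidden_dim * (2 if bidirectional else 1)
        self.classifier = nn.Linear(out_dim, num_classes)
```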
### Parameters

| Parameter | Description |
|---|---|
| `graph_emb_dim` | Dimension of graph embeddings from the GNN encoder |
| `tab_emb_dim` | Dimension of tabular embeddings; set to 0 for graph-only models |
| `hidden_dim` | RNN hidden state dimension |
| `num_layers` | Number of stacked RNN layers |
| `dropout` | Dropout probability (only applied if `num_layers > 1`) |
| `bidirectional` | Whether to use a bidirectional RNN |
| `num_classes` | Number of output classes |
## Forward Method

### Parameters

- Graph embedding sequence of shape `[batch_size, max_seq_len, graph_emb_dim]`
- Optional tabular embedding sequence of shape `[batch_size, max_seq_len, tab_emb_dim]`
- True sequence lengths of shape `[batch_size]`, used for packed sequences
- Attention mask of shape `[batch_size, max_seq_len]`

### Returns

Classification logits of shape `[batch_size, num_classes]`

## Architecture Details
### RNN Configuration
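The configuration block is elided in this excerpt. Assuming a standard `torch.nn.RNN`, the settings implied by the parameter table would look like this (the concrete sizes are placeholders):

```python
import torch.nn as nn

graph_emb_dim, tab_emb_dim = 16, 8   # illustrative sizes
hidden_dim, num_layers = 32, 2
dropout, bidirectional = 0.3, False

rnn = nn.RNN(
    input_size=graph_emb_dim + tab_emb_dim,      # fused input dimension
    hidden_size=hidden_dim,
    num_layers=num_layers,
    batch_first=True,                            # inputs are [batch, seq, feature]
    dropout=dropout if num_layers > 1 else 0.0,  # dropout only between stacked layers
    bidirectional=bidirectional,
)
```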
### Input Fusion
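The fusion code is not shown here; a minimal sketch, assuming the graph and tabular embedding sequences are concatenated along the feature axis, with the tabular input optional (the graph-only case corresponds to `tab_emb_dim = 0`):

```python
import torch

def fuse_inputs(graph_seq, tab_seq=None):
    # graph_seq: [batch, seq, graph_emb_dim]
    # tab_seq:   [batch, seq, tab_emb_dim] or None for graph-only models
    if tab_seq is None:
        return graph_seq
    return torch.cat([graph_seq, tab_seq], dim=-1)

fused = fuse_inputs(torch.randn(4, 10, 16), torch.randn(4, 10, 8))
```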
### RNN Processing

With packed sequences:

### Hidden State Extraction

Bidirectional:

### Classification Head
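The code for these three steps is elided in this excerpt. One hedged sketch of the full pipeline, assuming standard PyTorch utilities: packed-sequence processing so the RNN skips padded steps, extraction of the last layer's final hidden state (concatenating forward and backward directions in the bidirectional case), and a linear classification head. All variable names here are assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

batch, max_len, input_dim, hidden_dim, num_classes = 4, 10, 24, 32, 2
rnn = nn.RNN(input_dim, hidden_dim, batch_first=True, bidirectional=True)
classifier = nn.Linear(hidden_dim * 2, num_classes)  # *2 for bidirectional

x = torch.randn(batch, max_len, input_dim)
lengths = torch.tensor([10, 7, 5, 3])  # true sequence lengths

# Pack so padded time steps do not contribute to the hidden state.
packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
_, h_n = rnn(packed)          # h_n: [num_layers * num_directions, batch, hidden_dim]

# Bidirectional: concatenate the last layer's forward and backward final states.
final = torch.cat([h_n[-2], h_n[-1]], dim=-1)  # [batch, hidden_dim * 2]
logits = classifier(final)                     # [batch, num_classes]
```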
## Example Usage

From `main.py:277`:

### Training Example
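The snippet from `main.py:277` is not reproduced in this excerpt. Below is a hypothetical end-to-end sketch instead: the compact `RNNPredictor` defined here is a stand-in with the documented forward interface, not the project's actual class, and the training step is purely illustrative.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

class RNNPredictor(nn.Module):
    # Minimal stand-in with the documented interface; not the project's code.
    def __init__(self, graph_emb_dim, tab_emb_dim, hidden_dim, num_classes):
        super().__init__()
        self.rnn = nn.RNN(graph_emb_dim + tab_emb_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, graph_seq, tab_seq=None, lengths=None, mask=None):
        x = graph_seq if tab_seq is None else torch.cat([graph_seq, tab_seq], dim=-1)
        if lengths is not None:
            x = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
        _, h_n = self.rnn(x)
        return self.classifier(h_n[-1])  # logits from the last layer's final state

model = RNNPredictor(graph_emb_dim=16, tab_emb_dim=8, hidden_dim=32, num_classes=2)
graph_seq = torch.randn(4, 10, 16)
tab_seq = torch.randn(4, 10, 8)
lengths = torch.tensor([10, 8, 6, 3])

# One illustrative training step.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
logits = model(graph_seq, tab_seq, lengths)
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 0, 1]))
loss.backward()
optimizer.step()
```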
## Vanilla RNN Characteristics
### Advantages
- Simplest recurrent architecture
- Fewest parameters
- Good baseline for comparison
- Fast training
### Limitations
- Suffers from vanishing gradients
- Struggles with long-term dependencies
- Less expressive than LSTM/GRU
- Rarely outperforms gated variants
## When to Use RNN
- As a baseline to validate that temporal modeling helps
- Very short sequences (< 5 time steps)
- When extreme computational efficiency is required
- When other models overfit
## Comparison: RNN vs GRU vs LSTM

| Model | Parameters | Memory | Long-term dependencies | Speed |
|---|---|---|---|---|
| RNN | Fewest | Low | Poor | Fastest |
| GRU | Medium | Medium | Good | Fast |
| LSTM | Most | High | Best | Slower |
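The parameter ordering in the table follows directly from the gate counts: for the same input and hidden sizes, GRU triples and LSTM quadruples the vanilla RNN's weight matrices. A quick check with PyTorch (sizes are arbitrary):

```python
import torch.nn as nn

def n_params(module):
    # Total number of trainable parameters.
    return sum(p.numel() for p in module.parameters())

rnn = n_params(nn.RNN(32, 64, batch_first=True))    # 1 weight set
gru = n_params(nn.GRU(32, 64, batch_first=True))    # 3 gates
lstm = n_params(nn.LSTM(32, 64, batch_first=True))  # 4 gates
```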
## Notes

- Uses `tanh` nonlinearity by default (can be changed to `relu`)
- No gating mechanisms (unlike GRU/LSTM)
- RNN output dimension: `hidden_dim * (2 if bidirectional else 1)`
- Typically trained with `bidirectional=False` in the STGNN framework
- Best suited for simple temporal patterns or as a baseline
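Both the nonlinearity option and the output-dimension formula can be verified directly on `torch.nn.RNN` (sizes here are arbitrary):

```python
import torch
import torch.nn as nn

# nonlinearity='relu' swaps the default tanh; bidirectional doubles the output width.
rnn = nn.RNN(16, 32, batch_first=True, bidirectional=True, nonlinearity='relu')
out, _ = rnn(torch.randn(4, 10, 16))
# Output feature size: hidden_dim * (2 if bidirectional else 1) = 32 * 2 = 64
```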