Overview

The encoder network converts game observations into stimulation parameters (frequency and amplitude) for biological neurons. The decoder network reads spike features from neurons and outputs action logits. Together they form the biological neural interface.
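The round trip can be sketched as a single control step. This is an illustrative dataflow only; the function and argument names here are hypothetical, not the project's actual API.

```python
def bio_interface_step(observation, encoder, neurons, decoder):
    """One control step through the interface (hypothetical names for illustration)."""
    freq, amp = encoder(observation)   # game observation -> stimulation parameters
    spikes = neurons(freq, amp)        # stimulate the culture, read back spike features
    action_logits = decoder(spikes)    # spike features -> action logits
    return action_logits
```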

Encoder Configuration

Trainability

encoder_trainable
bool
default:"True"
Whether the encoder weights are trainable via backpropagation. When True, the encoder learns to generate optimal stimulation parameters using Beta distributions. When False, a fixed sigmoid-based mapping is used.
Code comment: “Can try turning it False but I would say True is needed for reasonable PPO policy gradients especially if decoder_use_mlp: False”
config = PPOConfig(
    encoder_trainable=True  # Enable encoder learning
)

Entropy Coefficient

encoder_entropy_coef
float
default:"-0.10"
Entropy penalty coefficient for encoder Beta distributions. A negative value acts as an entropy penalty (encourages more deterministic stimulation). Positive values would encourage exploration in stimulation space.
config = PPOConfig(
    encoder_entropy_coef=-0.10  # Penalty for encoder randomness
)
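To see why a negative coefficient acts as a penalty, consider the usual PPO convention of subtracting `coef * entropy` from the total loss. This is a sketch of that sign logic, assuming the standard convention; it is not the project's exact loss code.

```python
def encoder_entropy_term(coef, entropy):
    """Entropy contribution to the total loss, assuming the common PPO
    convention: loss -= coef * entropy (a sketch, not the exact implementation)."""
    return -coef * entropy

# With the default coef of -0.10, higher entropy *increases* the loss,
# pushing the encoder toward more deterministic stimulation parameters.
```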

CNN Visual Processing

encoder_use_cnn
bool
default:"True"
Enable CNN processing of the visual screen buffer. When enabled, adds a convolutional neural network that processes the downsampled game screen before the encoder MLP.
Code comment: “With my testing it seems like the CNN does not overfit/learn on its own, seems useful to keep True”
encoder_cnn_channels
int
default:"16"
Base number of CNN channels in the first convolutional layer. The CNN architecture uses progressive channel expansion:
  • Layer 1: encoder_cnn_channels (default 16)
  • Layer 2: encoder_cnn_channels * 2 (default 32)
  • Layer 3: encoder_cnn_channels * 4 (default 64)
In training_server.py, this is increased to 64 channels per the DOOM Initial Report for better visual feature extraction.
encoder_cnn_downsample
int
default:"4"
Downsampling factor applied to the screen buffer before CNN processing. The original resolution is divided by this factor. For example, with 320×240 resolution and downsample=4, the CNN processes 80×60 images.
config = PPOConfig(
    encoder_use_cnn=True,
    encoder_cnn_channels=64,    # Increased capacity
    encoder_cnn_downsample=4    # 4x downsampling
)

CNN Architecture Details

The encoder CNN uses the following architecture:
nn.Sequential(
    # Layer 1: base_channels filters
    nn.Conv2d(1, base_channels, kernel_size=3, stride=1, padding=1),
    nn.SiLU(),
    nn.MaxPool2d(2),
    
    # Layer 2: base_channels * 2 filters
    nn.Conv2d(base_channels, base_channels * 2, kernel_size=3, stride=1, padding=1),
    nn.SiLU(),
    nn.MaxPool2d(2),
    
    # Layer 3: base_channels * 4 filters
    nn.Conv2d(base_channels * 2, base_channels * 4, kernel_size=3, stride=1, padding=1),
    nn.SiLU(),
    nn.AdaptiveAvgPool2d((1, 1))  # Global pooling to single vector
)
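One consequence of the final `AdaptiveAvgPool2d((1, 1))` is that the feature vector length depends only on the channel count, not on the input resolution or downsample factor. A small sketch of that arithmetic (the helper name is ours, for illustration):

```python
def encoder_cnn_feature_dim(base_channels):
    """Length of the CNN's output feature vector (illustrative sketch).
    The conv layers are shape-preserving (stride 1, padding 1), each
    MaxPool2d(2) halves the spatial dims, and AdaptiveAvgPool2d((1, 1))
    collapses whatever spatial extent remains, so only the final channel
    count (base_channels * 4) determines the output size."""
    return base_channels * 4
```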

Decoder Configuration

Architecture Type

decoder_use_mlp
bool
default:"False"
Use MLP decoder instead of linear readout heads.
  • False: Direct linear readout from spike features (recommended)
  • True: 2-layer MLP processes spikes before action heads
Code comment: “Prefer to be false, causes decoder to learn how to play the game but was tested on random spikes, could be different in prod”
decoder_mlp_hidden
int
default:"32"
Hidden layer size when decoder_use_mlp=True. In ppo_doom.py the default is 32; in training_server.py it is increased to 256.
# Linear decoder (recommended)
config = PPOConfig(
    decoder_use_mlp=False
)

# MLP decoder (experimental)
config = PPOConfig(
    decoder_use_mlp=True,
    decoder_mlp_hidden=256
)

Weight Constraints

decoder_enforce_nonnegative
bool
default:"False"
Enforce non-negative weights in the decoder's linear readout heads. When True, applies a softplus activation to the weights: weight = softplus(raw_weight). This ensures all spike contributions are positive, which can be biologically interpretable.
decoder_freeze_weights
bool
default:"False"
Freeze all decoder parameters (no gradient updates). Useful for testing whether the encoder alone can learn, or for transfer learning scenarios.
decoder_zero_bias
bool
default:"True"
Force decoder bias terms to zero and disable bias gradients.
Code comment: “Prefer to be true, needs testing, bias tends to cause the decoder to generate its own predictions for movement”
Setting bias to zero ensures actions are driven entirely by spike activity, not learned biases.
config = PPOConfig(
    decoder_enforce_nonnegative=False,  # Allow negative weights
    decoder_freeze_weights=False,       # Train decoder
    decoder_zero_bias=True              # Force zero bias
)
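The two constraints can be illustrated with a toy linear readout. This is a pure-Python sketch with hypothetical helper names, not the project's decoder: softplus maps raw weights to strictly positive values, and zero bias means silent neurons produce zero logits.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def linear_readout(spikes, raw_weights, bias, enforce_nonnegative=False):
    """Toy linear readout head under the two weight constraints (sketch)."""
    rows = [[softplus(w) if enforce_nonnegative else w for w in row]
            for row in raw_weights]
    return [sum(w * s for w, s in zip(row, spikes)) + b
            for row, b in zip(rows, bias)]

# With zero bias, zero spike input yields zero logits: actions are driven
# entirely by neural activity. With enforce_nonnegative, even negative raw
# weights contribute positively after softplus.
```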

L2 Regularization

decoder_weight_l2_coef
float
default:"0.0"
L2 regularization coefficient for decoder weights. Penalizes large weights to encourage simpler linear readouts. Currently untuned (set to 0.0).
decoder_bias_l2_coef
float
default:"0.0"
L2 regularization coefficient for decoder biases. Currently untuned (set to 0.0).
config = PPOConfig(
    decoder_weight_l2_coef=0.001,  # Add weight regularization
    decoder_bias_l2_coef=0.0
)
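Assuming the standard form of L2 regularization (a sum of squared parameters scaled by the coefficient), the penalty added to the loss looks like the following sketch. The helper name is ours; the actual implementation may aggregate parameters differently.

```python
def decoder_l2_penalty(weights, biases, weight_coef, bias_coef):
    """L2 regularization term added to the loss (sketch of the standard form)."""
    return (weight_coef * sum(w * w for w in weights)
            + bias_coef * sum(b * b for b in biases))

# With both coefficients at their 0.0 defaults, the penalty vanishes entirely.
```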

Ablation Testing

decoder_ablation_mode
str
default:"'none'"
Ablation mode for testing decoder learning.
  • 'none': Normal operation, use real spike features
  • 'zero': Replace spike features with zeros
  • 'random': Replace spike features with random values
Used to test whether the decoder is learning on its own versus relying on neural activity.
config = PPOConfig(
    decoder_ablation_mode='zero'  # Test with no neural input
)
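The three modes amount to a simple substitution on the spike features before they reach the decoder. A minimal sketch, with a hypothetical helper name mirroring the option semantics:

```python
import random

def apply_ablation(spike_features, mode, rng=None):
    """Substitute spike features per decoder_ablation_mode (illustrative sketch)."""
    if mode == "none":
        return list(spike_features)          # pass real spikes through
    if mode == "zero":
        return [0.0] * len(spike_features)   # decoder sees no neural input
    if mode == "random":
        rng = rng or random.Random(0)
        return [rng.gauss(0.0, 1.0) for _ in spike_features]  # noise input
    raise ValueError(f"unknown ablation mode: {mode}")
```

If the policy still improves under 'zero' or 'random', the decoder is learning to play on its own rather than reading out neural activity.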

Network Architecture

Hidden Layer Size

hidden_size
int
default:"128"
Hidden layer size for the encoder, decoder MLP, and value network. Used across all network components for consistency.
config = PPOConfig(
    hidden_size=256  # Increase model capacity
)

Example Configurations

Minimal Linear Decoder

# Pure linear readout from spikes - maximum biological interpretability
minimal_config = PPOConfig(
    encoder_trainable=True,
    encoder_use_cnn=False,
    encoder_entropy_coef=-0.10,
    decoder_use_mlp=False,
    decoder_enforce_nonnegative=True,   # Positive weights only
    decoder_zero_bias=True,             # No bias
    decoder_freeze_weights=False,
    hidden_size=128
)

CNN-Based Encoder

# Visual processing with CNN encoder
visual_config = PPOConfig(
    encoder_trainable=True,
    encoder_use_cnn=True,
    encoder_cnn_channels=64,            # High capacity
    encoder_cnn_downsample=4,
    encoder_entropy_coef=-0.10,
    decoder_use_mlp=False,
    decoder_zero_bias=True,
    hidden_size=256
)

MLP Decoder (Experimental)

# Non-linear decoder with MLP
mlp_config = PPOConfig(
    encoder_trainable=True,
    encoder_use_cnn=True,
    encoder_cnn_channels=32,
    decoder_use_mlp=True,
    decoder_mlp_hidden=256,
    decoder_zero_bias=False,            # MLP can use bias
    decoder_enforce_nonnegative=False,
    hidden_size=128
)

Frozen Decoder Testing

# Test encoder learning with fixed decoder
frozen_config = PPOConfig(
    encoder_trainable=True,
    encoder_use_cnn=True,
    decoder_use_mlp=False,
    decoder_freeze_weights=True,        # Freeze decoder
    decoder_zero_bias=True,
    hidden_size=128
)

PPO Hyperparameters

Learning rate, gamma, GAE settings

Feedback Tuning

Stimulation feedback parameters
