Overview
The feedback system delivers stimulation to biological neurons in response to game events (kills, damage, pickups). Each event has configurable base parameters and surprise-based scaling that modulates feedback intensity based on temporal difference (TD) errors.
EventFeedbackConfig
Each event type (enemy kill, took damage, armor pickup, etc.) is configured using the EventFeedbackConfig dataclass.
Base Stimulation Parameters
List of neural channel indices to stimulate for this event. Channels must be in the range 0-63 and must not overlap with other event or action channels.
Base stimulation frequency in Hz before surprise scaling. Typical ranges:
- Positive events: 20-40 Hz
- Negative events: 60-120 Hz
Base stimulation amplitude in microamperes (μA) before surprise scaling. Typical range: 1.8-2.5 μA
Base number of pulses per feedback burst. Typical ranges:
- Quick events: 25-35 pulses
- Important events: 40-50 pulses
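The channel constraints and base parameters above can be checked before a configuration is used. The following is a minimal sketch, not the project's own validation code: the function names `validate_channels` and `burst_duration_s` are hypothetical, and the duration estimate assumes pulses are evenly spaced at the base frequency.

```python
from typing import Iterable, Sequence

NUM_CHANNELS = 64  # channels are indexed 0-63, as stated above


def validate_channels(channels: Sequence[int], reserved: Iterable[int] = ()) -> None:
    """Raise ValueError if a channel is out of range, duplicated, or reserved.

    `reserved` holds channels already claimed by other events or actions
    (hypothetical parameter, for illustration only).
    """
    reserved_set = set(reserved)
    seen = set()
    for ch in channels:
        if not 0 <= ch < NUM_CHANNELS:
            raise ValueError(f"channel {ch} is outside 0-{NUM_CHANNELS - 1}")
        if ch in seen:
            raise ValueError(f"channel {ch} is listed twice")
        if ch in reserved_set:
            raise ValueError(f"channel {ch} overlaps another event or action")
        seen.add(ch)


def burst_duration_s(pulses: int, frequency_hz: float) -> float:
    """Approximate burst length, assuming evenly spaced pulses."""
    return pulses / frequency_hz
```

Under that assumption, a 30-pulse burst at 30 Hz lasts about 1 second, while a 45-pulse burst at 90 Hz lasts about 0.5 seconds.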
Event Metadata
Key name in the environment’s info dict that tracks this event. Examples: 'event_enemy_kill', 'event_took_damage', 'event_armor_pickup'
Expected sign of the temporal difference error for this event:
- 'positive': Event represents reward (kills, pickups)
- 'negative': Event represents punishment (damage, waste)
- 'absolute': Use absolute value of TD error
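The text does not spell out how the expected sign is applied to the TD error. One plausible reading, sketched here purely as an assumption, is that only the part of the TD error matching the expected sign counts as surprise. The function name `surprise_from_td` and the parameter name `td_sign` are hypothetical.

```python
def surprise_from_td(td_error: float, td_sign: str) -> float:
    """Map a TD error to a non-negative surprise magnitude.

    Assumed semantics (not confirmed by the source):
    - 'positive': only positive TD errors count
    - 'negative': only negative TD errors count
    - 'absolute': the magnitude counts regardless of sign
    """
    if td_sign == "positive":
        return max(td_error, 0.0)
    if td_sign == "negative":
        return max(-td_error, 0.0)
    if td_sign == "absolute":
        return abs(td_error)
    raise ValueError(f"unknown td_sign: {td_sign!r}")
```

With this reading, a damage event configured as 'negative' would produce no surprise when the TD error happens to be positive. The actual implementation may treat mismatched signs differently.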
Surprise Scaling Parameters
Feedback intensity scales with TD error magnitude (“surprise”). Larger unexpected rewards or punishments trigger stronger feedback.
Gain coefficient for frequency scaling based on surprise.
scaled_freq = base_freq * (1 + freq_gain * surprise_factor)
Maximum scaling multiplier for frequency. Frequency is clipped to [base_freq, base_freq * freq_max_scale]
Gain coefficient for amplitude scaling based on surprise.
scaled_amp = base_amp * (1 + amp_gain * surprise_factor)
Maximum scaling multiplier for amplitude. Amplitude is clipped to [base_amp, base_amp * amp_max_scale]
Gain coefficient for pulse count scaling based on surprise.
scaled_pulses = base_pulses * (1 + pulse_gain * surprise_factor)
Maximum scaling multiplier for pulse count. Pulse count is clipped to [base_pulses, base_pulses * pulse_max_scale]
Exponential Moving Average
Beta parameter for exponential moving average of surprise magnitude.
surprise_ema = ema_beta * surprise_ema + (1 - ema_beta) * |td_error|
Higher values (closer to 1.0) create slower-moving averages.
Unpredictable Stimulation
Some events (like taking damage) can trigger additional unpredictable background stimulation.
Enable unpredictable background stimulation for this event.
Frequency in Hz for unpredictable stimulation bursts.
Duration in seconds for each unpredictable stimulation burst.
Rest period in seconds between unpredictable bursts.
Channels to use for unpredictable stimulation. If None, uses the same channels as the main event.
Amplitude for unpredictable stimulation. If None, uses base_amplitude.
Default Event Configurations
Enemy Kill (Positive Event)
Armor Pickup (Positive Event)
Took Damage (Negative Event)
Ammo Waste (Negative Event)
Approach Target (Positive Event, ppo_doom.py only)
Retreat from Target (Negative Event, ppo_doom.py only)
Global Feedback Settings
Reward-Based Feedback
Enable continuous feedback based on TD error magnitude.
Channels for positive TD error feedback.
Channels for negative TD error feedback.
TD error threshold for triggering positive feedback.
TD error threshold for triggering negative feedback.
Positive Feedback Parameters
Amplitude in μA for positive reward feedback.
Frequency in Hz for positive reward feedback.
Number of pulses for positive reward feedback.
Negative Feedback Parameters
Amplitude in μA for negative reward feedback.
Frequency in Hz for negative reward feedback (higher than positive).
Number of pulses for negative reward feedback (longer than positive).
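The reward-based settings above amount to a threshold test on the TD error followed by a choice between two parameter sets. The sketch below illustrates that logic under stated assumptions: the `Burst` class, the function name, and every numeric value are placeholders rather than project defaults, and the negative threshold is assumed to be expressed as a negative number.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Burst:
    """One stimulation burst (hypothetical container, for illustration)."""
    channels: Tuple[int, ...]
    amplitude_ua: float
    frequency_hz: float
    pulses: int


# Placeholder values only. Negative feedback uses a higher frequency and
# more pulses than positive feedback, as described above.
POSITIVE = Burst(channels=(8, 9), amplitude_ua=2.0, frequency_hz=30.0, pulses=30)
NEGATIVE = Burst(channels=(16, 17), amplitude_ua=2.0, frequency_hz=90.0, pulses=45)


def select_reward_feedback(
    td_error: float,
    pos_threshold: float = 0.5,   # assumed value
    neg_threshold: float = -0.5,  # assumed value and sign convention
) -> Optional[Burst]:
    """Return the burst to deliver for this TD error, or None if below both thresholds."""
    if td_error >= pos_threshold:
        return POSITIVE
    if td_error <= neg_threshold:
        return NEGATIVE
    return None
```

TD errors between the two thresholds produce no reward-based feedback in this sketch, which keeps small prediction errors from triggering stimulation on every step.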
Episode-Level Feedback
Enable episode-end feedback stimulation. Only available in training_server.py.
If True, disable step-level feedback and only provide episode-end feedback.
Scale episode feedback by surprise magnitude. Only available in training_server.py.
Pulses for positive episode-end feedback.
Frequency for positive episode-end feedback.
Pulses for negative episode-end feedback.
Frequency for negative episode-end feedback.
Global Surprise Scaling
Global gain for surprise-based feedback scaling.
Code comment: “Tune as needed, will depend on neurons”
Maximum global surprise scaling multiplier.
Frequency-specific surprise gain (overrides feedback_surprise_gain for frequency).
Amplitude-specific surprise gain.
Frequency-specific max scaling.
Amplitude-specific max scaling.
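The gain and max-scaling settings follow the same pattern as the per-event formulas given earlier: multiply the base value by (1 + gain * surprise_factor), then clip the result to [base, base * max_scale]. The sketch below shows that pattern together with the EMA update. It takes surprise_factor as an input because the text does not define how it is derived, and how the global gains combine with per-event gains is also not stated. The function names are hypothetical.

```python
def scale_with_surprise(
    base: float, gain: float, max_scale: float, surprise_factor: float
) -> float:
    """Apply base * (1 + gain * surprise_factor), clipped to [base, base * max_scale].

    Assumes base > 0 and max_scale >= 1 so the clipping interval is valid.
    """
    scaled = base * (1.0 + gain * surprise_factor)
    return min(max(scaled, base), base * max_scale)


def update_surprise_ema(surprise_ema: float, td_error: float, ema_beta: float) -> float:
    """One step of the exponential moving average of |td_error|."""
    return ema_beta * surprise_ema + (1.0 - ema_beta) * abs(td_error)
```

For example, with a base frequency of 30 Hz, a gain of 0.5, and a max scale of 2.0, a surprise_factor of 1.0 gives 45 Hz, while a surprise_factor of 4.0 would give 90 Hz before clipping and is capped at 60 Hz. Pulse counts would presumably need rounding to an integer after scaling, which this sketch leaves to the caller.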
Example: Custom Event Feedback
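The original example for this section is not available here. As a rough illustration only, the sketch below defines a stand-in dataclass and configures a hypothetical health pickup event. The class `EventFeedbackConfigSketch` is not the project's EventFeedbackConfig. Field names such as `freq_gain`, `amp_max_scale`, `ema_beta`, and `base_amplitude` appear in the text above, while `channels`, `base_frequency`, `base_pulses`, `info_key`, `td_sign`, and the event key are assumed names, and all values are placeholders chosen from the typical ranges quoted earlier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EventFeedbackConfigSketch:
    """Stand-in for illustration; the real dataclass may differ in names and defaults."""
    channels: List[int]        # assumed field name
    base_frequency: float      # Hz, assumed field name
    base_amplitude: float      # microamperes
    base_pulses: int           # assumed field name
    info_key: str              # assumed field name
    td_sign: str               # assumed field name: 'positive', 'negative', 'absolute'
    freq_gain: float = 0.0
    freq_max_scale: float = 1.0
    amp_gain: float = 0.0
    amp_max_scale: float = 1.0
    pulse_gain: float = 0.0
    pulse_max_scale: float = 1.0
    ema_beta: float = 0.99     # placeholder default


# Hypothetical custom event: a positive event in the 20-40 Hz range.
health_pickup = EventFeedbackConfigSketch(
    channels=[20, 21, 22],
    base_frequency=30.0,
    base_amplitude=2.0,
    base_pulses=30,
    info_key="event_health_pickup",  # hypothetical key
    td_sign="positive",
    freq_gain=0.3,
    freq_max_scale=2.0,
    amp_gain=0.2,
    amp_max_scale=1.5,
    pulse_gain=0.3,
    pulse_max_scale=2.0,
)
```

When adapting this to the real dataclass, check the actual field names in the source and confirm the chosen channels do not overlap with any other event or action channels.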
Related Configuration
- PPO Hyperparameters: Learning rate and training settings
- Encoder/Decoder: Network architecture configuration