## Overview

Pulse uses online learning to continuously improve its relevance predictions. Unlike batch training, online learning updates models incrementally with each new labeled example, adapting to user behavior over time without requiring large datasets or retraining cycles.

## How Online Learning Works
The training loop operates in three stages:

### 1. Activation Recording
When Layer 2 scores an event above threshold and Layer 3 escalates to the kernel, Pulse records an activation:

- `module_id`: Which module triggered
- `window`: The `SignalEvent` objects that contributed
- `timestamp`: When the activation occurred
- `label`: Initially `None`, filled in later
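As a rough sketch, such a record might look like the following dataclass (illustrative only; the actual structure lives in `pulse/training.py` and may differ):

```python
import time
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class ActivationRecord:
    """Illustrative shape of an activation record; field names follow
    the list above, but the real class in pulse/training.py may differ."""
    module_id: str                              # which module triggered
    window: List[Any]                           # contributing SignalEvent objects
    timestamp: float = field(default_factory=time.time)
    label: Optional[float] = None               # filled in by feedback or inference
```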
### 2. Feedback Collection
Pulse waits up to 5 minutes for feedback. Feedback can be explicit (preferred), supplied by the kernel via `pulse.record_feedback()`, or implicit, inferred when the feedback window expires. The implicit labeling heuristic is basic in the current implementation: it assumes activations with non-empty windows are useful (0.8) and those without are not (0.2). Future versions will inspect agent output to determine actual usefulness.
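The implicit heuristic described above reduces to a few lines (a sketch, not the actual implementation):

```python
def infer_implicit_label(window: list) -> float:
    """Implicit labeling fallback: activations with a non-empty window
    are assumed useful (0.8); empty windows are assumed not useful (0.2)."""
    return 0.8 if window else 0.2
```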
### 3. Weight Updates
Periodically (2 seconds after each activation), Pulse drains the training buffer. The `drain()` method:

- Finds all activations with labels (explicit or inferred)
- Calls `limbic.update_weights(module_id, window, label)` for each
- Removes processed activations from the buffer
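The drain pass above can be sketched as follows. This is a minimal illustration, not the code in `pulse/training.py`; `limbic` stands for any object exposing the `update_weights(module_id, window, label)` method, and the expiry fallback uses the implicit heuristic:

```python
import time
from typing import Optional

FEEDBACK_WINDOW_S = 300   # 5-minute feedback window

def drain(buffer: list, limbic, now: Optional[float] = None) -> int:
    """Train on every labeled or expired activation and remove it from
    the buffer; records still waiting for feedback are kept. Sketch only."""
    now = time.time() if now is None else now
    trained = 0
    for act in list(buffer):
        label = act.label
        if label is None and now - act.timestamp >= FEEDBACK_WINDOW_S:
            # Expired without feedback: fall back to the implicit heuristic.
            label = 0.8 if act.window else 0.2
        if label is not None:
            limbic.update_weights(act.module_id, act.window, label)
            buffer.remove(act)
            trained += 1
    return trained
```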
```python
def on_escalation(decision: EscalationDecision):
    # The decision contains the triggering events.
    # You need to track which activation_id corresponds to this decision;
    # this is typically done in the kernel layer.

    # Process the activation
    result = agent.handle(decision.question)

    # Later, provide feedback
    pulse.record_feedback(activation_id, label=calculate_label(result))
```
The current architecture requires the kernel to track activation IDs. The `EscalationDecision` object doesn't include the activation ID directly; this is a design consideration for future enhancement. See `pulse/prefrontal.py` for the `EscalationDecision` structure.

```python
def calculate_label(result):
    """
    Example heuristic based on agent output.
    """
    if result.action_taken:
        # Agent took action (wrote memory, ran tool)
        return 1.0
    elif result.confidence_score > 0.7:
        # Agent was confident but took no action
        return 0.5
    else:
        # Agent was not confident or dismissed the question
        return 0.0
```
Label values follow this scale:

- `1.0`: Highly relevant, exactly what the user wanted
- `0.7–0.9`: Relevant and useful
- `0.3–0.6`: Somewhat relevant but not critical
- `0.0–0.2`: Not relevant, false positive

## Understanding the Training Buffer
The `TrainingBuffer` class (`pulse/training.py`) manages the lifecycle of activation records:
| State | Label | Age | Action |
|---|---|---|---|
| Waiting | None | < 5 min | Keep in buffer |
| Labeled | Set | Any | Train and remove |
| Expired | None | ≥ 5 min | Infer label, train, remove |
See `pulse/training.py:71-95` for the drain logic.
## Model Architecture
Each cluster uses a small LSTM or Temporal Convolutional Network (TCN):

- Input: Sliding window of recent `SignalEvent` feature vectors (16-dimensional)
- Hidden size: 32–64 units
- Output: Single float (relevance score, 0.0–1.0) via sigmoid
- Parameters: ~50k–200k (well under 1M)
- Inference time: < 5ms on CPU
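For intuition, a scorer with this shape can be sketched in pure Python. This toy uses random, untrained weights and is illustrative only; the real models are trained LSTMs/TCNs, not this:

```python
import math
import random

def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTMScorer:
    """Toy single-layer LSTM matching the shape above: 16-d inputs,
    small hidden state, sigmoid output in [0, 1]. Illustrative only."""

    def __init__(self, input_dim: int = 16, hidden: int = 32, seed: int = 0):
        rng = random.Random(seed)
        n = input_dim + hidden
        # Four gates (input, forget, cell, output), each hidden x (input+hidden).
        self.W = [[[rng.gauss(0, 0.1) for _ in range(n)] for _ in range(hidden)]
                  for _ in range(4)]
        self.w_out = [rng.gauss(0, 0.1) for _ in range(hidden)]
        self.hidden = hidden

    def score(self, window) -> float:
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in window:  # one 16-d feature vector per SignalEvent
            xh = list(x) + h
            gates = [[sum(w * v for w, v in zip(row, xh)) for row in gate]
                     for gate in self.W]
            i = [_sigmoid(v) for v in gates[0]]
            f = [_sigmoid(v) for v in gates[1]]
            g = [math.tanh(v) for v in gates[2]]
            o = [_sigmoid(v) for v in gates[3]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        # Final hidden state -> single relevance score via sigmoid.
        return _sigmoid(sum(w * v for w, v in zip(self.w_out, h)))
```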
Models are stored in `~/.macroa/pulse/models/` and automatically saved when `pulse.stop()` is called.
## Cold Start: Synthetic Priors
On day one, before any real data exists, Pulse generates synthetic training examples from the fingerprint to initialize model weights (see `pulse/fingerprint.py:133-179`):
- `1.0`: Feature is directly relevant to this module
- `0.5`: Feature is weakly relevant
- `0.0`: Feature is not relevant
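A sketch of what such generation could look like. The `feature_relevance` mapping and window shape here are assumptions for illustration; the real fingerprint structure and generation logic in `pulse/fingerprint.py` may differ:

```python
def synthetic_examples(feature_relevance: dict, window_len: int = 4) -> list:
    """Cold-start example generation sketch: `feature_relevance` is
    assumed to map fingerprint feature names to the relevance labels
    above (1.0 / 0.5 / 0.0). Illustrative only."""
    examples = []
    for feature, label in feature_relevance.items():
        # One synthetic window exercising this feature, paired with its label.
        window = [{"feature": feature}] * window_len
        examples.append((window, label))
    return examples
```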
## Monitoring Training Progress
While Pulse doesn't expose a training metrics API in the current implementation, you can observe:

- Escalation frequency: If the model learns well, false positives decrease over time
- Relevance scores: Check `decision.confidence` in your escalation handler
- Model files: Watch for updates in `~/.macroa/pulse/models/`
## Advanced: Manual Weight Updates
If you want to train on historical data or batch examples, you can call `limbic.update_weights(module_id, window, label)` directly for each example.

## Troubleshooting
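A batch replay loop might look like this (a sketch; `limbic` is assumed to expose the `update_weights(module_id, window, label)` method described above, and `examples` is hypothetical historical data):

```python
def train_on_history(limbic, examples: list) -> int:
    """Replay historical (module_id, window, label) tuples through the
    same update path the online loop uses. Sketch only."""
    for module_id, window, label in examples:
        if not 0.0 <= label <= 1.0:
            raise ValueError("label must be in [0.0, 1.0]")
        limbic.update_weights(module_id, window, label)
    return len(examples)
```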
### `ValueError: label must be in [0.0, 1.0]`
Labels must be floats between 0.0 and 1.0 inclusive. Check your label calculation logic.
### Feedback seems to have no effect
Check timing. Feedback must arrive within 5 minutes of activation, and the training drain happens 2 seconds after activation. If you submit feedback after the drain, it won't be processed. To force immediate training, call the buffer's `drain()` method directly.
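The ordering pitfall can be reproduced with a toy stand-in (not the real `TrainingBuffer`, just an illustration of why late feedback is dropped):

```python
class DemoBuffer:
    """Toy stand-in for TrainingBuffer, only to show drain/feedback ordering."""

    def __init__(self):
        self.pending = {}   # activation_id -> label (None until feedback)
        self.trained = []

    def record_activation(self, activation_id: str) -> None:
        self.pending[activation_id] = None

    def record_feedback(self, activation_id: str, label: float) -> bool:
        if activation_id not in self.pending:
            return False    # activation already drained: feedback is lost
        self.pending[activation_id] = label
        return True

    def drain(self) -> None:
        # Train on labeled activations and remove them, mirroring drain().
        for aid, label in list(self.pending.items()):
            if label is not None:
                self.trained.append((aid, label))
                del self.pending[aid]

buf = DemoBuffer()
buf.record_activation("a1")
buf.record_feedback("a1", 1.0)          # in time: accepted
buf.drain()                             # trains on ("a1", 1.0)
late = buf.record_feedback("a1", 0.5)   # after the drain: silently dropped
```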
### Where are model weights stored?
Model weights are saved to `model_save_path` when `pulse.stop()` is called. If `model_save_path` is `None`, weights are not persisted.

### Can I retrain a model from scratch?
Yes, delete the model file for that cluster. When the cluster is next registered, it will reinitialize with synthetic priors from the fingerprint.
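If you script the deletion, something like the following works. The per-cluster filename and `.pt` extension are assumptions; check your models directory for the actual naming scheme:

```python
from pathlib import Path

def delete_cluster_model(cluster_name: str,
                         models_dir: Path = Path.home() / ".macroa/pulse/models") -> bool:
    """Remove a cluster's saved weights so it reinitializes from synthetic
    priors on next registration. Returns True if a file was deleted."""
    path = models_dir / f"{cluster_name}.pt"   # filename layout is an assumption
    if path.exists():
        path.unlink()
        return True
    return False
```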
## Next Steps

- **Monitoring Signals**: Subscribe to events and debug signal flow
- **Integration**: Integrate Pulse into your agent system