Learning Architecture
Conversations happen
The agent responds to messages using its current context (SOUL.md, skills, lessons.md).
Self-corrections are captured
When a user corrects the agent, it records the correction in the
self_corrections field. This is written to the family’s lessons.md immediately — you see it in context on the next message.

Review process analyzes behavior
review_loop.py reads conversation logs and checks agent behavior against skill guidance. It produces findings and lesson recommendations.

From SOUL.md: “The LESSONS sections in your context come from that review process. They aren’t rules handed down — they’re corrections earned from real conversations.”
lessons.md Format
Lessons are stored as timestamped bullet points with optional category tags.

Lesson Categories
The system recognizes three categories (optional tags):

- [behavioral]
- [factual]
- [operational]
[behavioral] covers how to communicate, when to ask, and how to handle corrections. Example: “One question per response, always. Never stack multiple questions.”
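The file format itself isn’t reproduced in this document; combining the timestamped-bullet description with the category tags, an entry plausibly looks like this (the timestamp style and tag placement are assumptions):

```
- [2025-01-12 14:03] [behavioral] One question per response, always. Never stack multiple questions.
- [2025-01-13 09:41] [operational] Keep SMS responses under 320 chars (2 segments) when possible.
```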
Eviction Strategy
When lessons exceed the max (default: 20 for review output, 30 for family files), the system uses category-aware eviction:

- Oldest entries are removed first
- Each category is guaranteed at least 1 slot
- Slots are distributed proportionally to how many entries each category has
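A minimal sketch of that eviction rule — the function and entry shapes are assumptions, not the project’s actual code:

```python
from collections import defaultdict

def evict(entries, max_entries):
    """Category-aware eviction. `entries` is a chronological list of
    (category, text) tuples with unique texts; the oldest entries are
    dropped first, but every category keeps at least one slot."""
    if len(entries) <= max_entries:
        return list(entries)

    by_cat = defaultdict(list)
    for cat, text in entries:            # chronological order preserved per category
        by_cat[cat].append(text)

    # Every category gets 1 guaranteed slot; the remainder is shared
    # proportionally to each category's share of all entries.
    cats = list(by_cat)
    slots = {c: 1 for c in cats}
    spare = max_entries - len(cats)
    for c in cats:
        slots[c] += int(spare * len(by_cat[c]) / len(entries))

    # Hand rounding leftovers to the largest categories.
    leftover = max_entries - sum(slots.values())
    for c in sorted(cats, key=lambda c: len(by_cat[c]), reverse=True)[:leftover]:
        slots[c] += 1

    # Keep the newest slots[c] entries per category (oldest evicted first).
    keep = {c: set(texts[-slots[c]:]) for c, texts in by_cat.items()}
    return [(c, t) for c, t in entries if t in keep[c]]
```

Guaranteeing one slot per category is what keeps a lone factual lesson alive even when behavioral lessons dominate the file.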
Self-Corrections
When the agent is corrected in conversation, it uses the self_corrections field in its JSON response:
The correction is written to families/kano/lessons.md immediately. The agent sees it in context on the next message.
Why this matters: The agent doesn’t have to wait for a review cycle to learn from direct corrections. Self-corrections create an instant feedback loop.
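The response schema isn’t reproduced here; the flow implies something like the following, where every field except self_corrections is an assumed placeholder and the field is shown as a list (the actual shape may differ):

```json
{
  "reply": "You're right — I'll ask one question at a time.",
  "self_corrections": [
    "[behavioral] One question per response, always."
  ]
}
```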
Review Loop: Mechanical Tier
review_loop.py performs rule-based analysis on recent conversations:
What It Checks
Multi-question violations
Detects responses with more than one question mark.
Skill violated: social.md says “One question at a time, always.”
Lesson generated: “One question per response, always. Never stack multiple questions in a single message.”
Forbidden phrases
Detects phrases like:
- “before I can proceed”
- “before I save”
- “before I can help”
- “I need to know”
User feedback patterns
Detects corrections in user messages:
- “that’s wrong” / “that’s not right” / “incorrect”
- “don’t do that” / “stop doing” / “never do”
- “I told you” / “I already said”
Response length
Flags responses over 500 characters. Recommendation: “Keep SMS responses under 320 chars (2 segments) when possible.”
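All four checks are mechanical enough to sketch in a few lines. The phrase lists and thresholds come from the descriptions above; the function names are invented here, and the real review_loop.py may differ:

```python
FORBIDDEN = [
    "before i can proceed",
    "before i save",
    "before i can help",
    "i need to know",
]
CORRECTION_PATTERNS = [
    "that's wrong", "that's not right", "incorrect",
    "don't do that", "stop doing", "never do",
    "i told you", "i already said",
]

def _norm(text):
    # Lowercase and straighten curly apostrophes so phrase matching is robust.
    return text.lower().replace("\u2019", "'")

def check_response(text):
    """Mechanical checks on one agent (OUTBOUND) message; returns finding labels."""
    findings = []
    if text.count("?") > 1:                       # multi-question violation
        findings.append("multi-question")
    if any(p in _norm(text) for p in FORBIDDEN):  # gatekeeping phrases
        findings.append("forbidden-phrase")
    if len(text) > 500:                           # over the SMS length flag
        findings.append("too-long")
    return findings

def check_user_message(text):
    """Detect correction patterns in a user (INBOUND) message."""
    return [p for p in CORRECTION_PATTERNS if p in _norm(text)]
```

Substring matching keeps the checks cheap and deterministic; that simplicity is exactly why the contextual tier below is still needed.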
Usage
Without --stage, every run writes lessons to real files. --stage writes to a scratch pad (staging/reviews/) instead. Nothing touches production until explicitly promoted.

Review Loop: Contextual Tier
The mechanical tier catches rule violations. The contextual tier needs a human (or Opus) reading the full transcript:

What Mechanical Analysis Misses
- Agent calling Degitu both “grandmother” and “aunt” in the same response
- Missing member context that caused the confusion
- Agent contradicting itself across messages
- Flows that have no protocol
- Process gaps nobody thought to codify
Workflow
Review the transcript
Read the exchanges. Look for:
- Contradictions
- Missing context that caused errors
- Patterns the agent should follow but doesn’t
Save interesting reviews
staging/saved/ — survives resets, becomes material for deeper analysis.

Staging Workflow
Without staging, every review_loop run writes lessons to real files. Testing = mutating production data. Staging is a scratch pad — nothing touches production until you explicitly promote it.
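A sketch of the branch this implies — the directory names come from the text, but the function name and the staging filename are assumptions:

```python
from pathlib import Path

def lessons_target(family: str, stage: bool) -> Path:
    """Pick where review output lands. --stage redirects to the scratch
    pad (cleared on reset); otherwise lessons hit the family's real file."""
    if stage:
        return Path("staging/reviews") / f"{family}.md"   # scratch pad, safe to discard
    return Path("families") / family / "lessons.md"       # production file
```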
Three Piles
- reviews/
- saved/
- proposals/
reviews/ is disposable test output. It accumulates with each --stage run and is cleared on reset. You don’t care about most of these.

Testing Protocol
Test loop (repeat as needed)
Resist the shiny object. During testing you WILL notice gaps — missing member fields, flows with no protocol, features that seem obvious. DO NOT stop testing to build them. Save the observation to saved/ and keep going. Real interactions reveal what the abstraction needs to be.

Graduation Pipeline
Lessons in lessons.md are temporary — they’re meant to graduate to permanent locations:
- Factual lessons → family.md or members/{name}.md
- Behavioral lessons → skills/social.md or skills/scheduling.md
- Operational lessons → capabilities.md
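That routing is a straight category-to-destination map; a hypothetical helper (the mapping is from this document, the code is not):

```python
GRADUATION_TARGETS = {
    "factual": ["family.md", "members/{name}.md"],
    "behavioral": ["skills/social.md", "skills/scheduling.md"],
    "operational": ["capabilities.md"],
}

def graduation_targets(category):
    """Permanent files a lesson of this category should graduate into."""
    return GRADUATION_TARGETS.get(category, [])
```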
Workflow
Max Entries
- Global lessons.md: 50 entries (configurable in learning/__init__.py)
- Family lessons.md: 30 entries (set in append_lessons() calls)
- Review output: 20 entries (prevents prompt bloat)
Example: Real Correction Flow
System writes to lessons.md
sms_handler.py processes the response and appends the correction to families/kano/lessons.md.

Signal Sources
The review loop ingests signals from multiple sources:

- Conversation logs (runtime/conversations/{phone}/*.log) — INBOUND→OUTBOUND pairs
- PHI audit logs — blocked responses, unknown numbers
- Pending approvals — stale approvals >18h old
- Poller stdout — errors and warnings from the message polling loop
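Only the INBOUND→OUTBOUND pairing is stated above, not the log format. Assuming lines are tagged with INBOUND:/OUTBOUND: prefixes (an assumption), extracting exchange pairs might look like:

```python
def pair_exchanges(lines):
    """Group log lines into (inbound, outbound) exchange pairs.
    Assumes lines shaped like 'INBOUND: text' / 'OUTBOUND: text'."""
    pairs, pending = [], None
    for line in lines:
        if line.startswith("INBOUND:"):
            pending = line[len("INBOUND:"):].strip()       # latest unanswered message
        elif line.startswith("OUTBOUND:") and pending is not None:
            pairs.append((pending, line[len("OUTBOUND:"):].strip()))
            pending = None                                 # pair consumed
    return pairs
```

A trailing INBOUND with no reply is deliberately dropped — the review loop needs agent behavior to grade, so unanswered messages carry no finding on their own.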