
Overview

CareSupport isn’t a chatbot following a script—it’s a learning agent that adapts to your family’s needs. It routes messages to the right AI model based on complexity, assembles context from multiple sources, and writes corrections to its own instruction files when you teach it something new.
The agent has memory. When you correct it (“Don’t say that”, “Remember this”), it writes the correction to lessons.md immediately. You’ll see the fix applied in the next conversation.

AI Backend: Anthropic + OpenRouter

CareSupport uses two AI providers: Anthropic as the primary backend and OpenRouter as the fallback.
Models:
  • Claude Haiku 4.5 (fast tier) — greetings, schedule updates, general coordination
  • Claude Sonnet 4.6 (reason tier) — medication changes, multi-member coordination, onboarding
  • Claude Opus 4.6 (critical tier) — emergencies, escalation triggers
Why Anthropic?
  • Prompt caching reduces cost by 90% for repeated context (family file, skills, lessons)
  • Native structured output (no JSON schema hacks)
  • Better instruction-following than GPT-4o for care coordination tasks
Cost optimization:
# Cached prefix: ~5,000 tokens (SOUL.md + skills + lessons + member context)
# Fresh on first call, cached for 5 minutes after
# Every subsequent message within 5 minutes:
#   - Reads 5,000 tokens from cache (10% of write cost)
#   - Writes only the new family context + current message
Set backend:
export CARESUPPORT_AI_BACKEND=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
Cross-provider resilience: If Anthropic fails after 3 retries, the handler automatically falls back to OpenRouter. You get a response even if one provider is down.
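The retry-then-fallback behavior can be sketched as follows. This is an illustrative outline, not the handler's actual code: `generate_with_fallback`, `primary`, and `fallback` are hypothetical names standing in for the real Anthropic and OpenRouter client wrappers.

```python
import time

def generate_with_fallback(messages, primary, fallback,
                           max_retries=3, backoff=1.0):
    # Sketch of the cross-provider fallback chain (names hypothetical).
    # Try the primary provider with exponential backoff, then switch.
    for attempt in range(max_retries):
        try:
            return primary(messages)
        except Exception:
            time.sleep(backoff * (2 ** attempt))  # back off before retrying
    return fallback(messages)  # all retries failed: use the other provider
```

The caller never sees which provider answered; either way it gets a response.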

Intent Routing: Fast, Reason, Critical

Every message is classified into a tier BEFORE the AI call. This determines which model to use:
# From care_router.py:80-108
def route(message: str, member: dict) -> RouteResult:
    # Priority order — first match wins:

    if _EMERGENCY.search(message):
        return RouteResult("critical", OPUS, "EMERGENCY")
        # Keywords: 911, chest pain, can't breathe, fell, unconscious

    if _ESCALATION.search(message):
        return RouteResult("critical", OPUS, "ESCALATION")
        # Patterns: missed medication, no coverage, gap in care

    if _MEDICATION_CHANGE.search(message):
        return RouteResult("reason", SONNET, "MEDICATION_CHANGE")
        # Requests to start, stop, change, or adjust medications

    if _ONBOARDING.search(message):
        return RouteResult("reason", SONNET, "ONBOARDING")
        # Adding new members, setting up new caregivers

    if _MULTI_MEMBER.search(message):
        return RouteResult("reason", SONNET, "MULTI_MEMBER")
        # "Tell Solan and Roman both...", "Notify everyone..."

    return RouteResult("fast", HAIKU, "GENERAL")
Why pattern matching instead of AI classification?
  • Zero latency (no extra API call)
  • Zero cost (no tokens consumed)
  • Deterministic (same message always routes the same way)
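The pattern side of this can be sketched with compiled regexes. The keyword lists below are illustrative stand-ins, not the real patterns, which are longer and live in `care_router.py`:

```python
import re

# Illustrative stand-ins for the compiled patterns in care_router.py;
# the real keyword lists are longer and live in the source file.
_EMERGENCY = re.compile(
    r"\b(911|chest pain|can'?t breathe|unconscious|fell)\b", re.I)
_MEDICATION_CHANGE = re.compile(
    r"\b(start|stop|change|adjust)\b.*\b(med(ication)?s?|dose)\b", re.I)

def tier_for(message: str) -> str:
    # First match wins, mirroring route() above
    if _EMERGENCY.search(message):
        return "critical"
    if _MEDICATION_CHANGE.search(message):
        return "reason"
    return "fast"
```

Because the patterns are compiled once at import time, classification is a few microseconds of regex matching rather than a round-trip to a model.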

Context Assembly: What the Agent Sees

Every message triggers a multi-source context load:
The agent’s core identity and reasoning framework.
Loaded from: SOUL.md at repository root
Content:
  • Four-step reasoning loop (LISTEN → REASON → ACT → CLOSE THE LOOP)
  • Learning system explanation (“Your corrections become your instructions”)
  • Voice guidelines (“Match the family’s register. Use names, not roles.”)
  • Hard rules (“Never fabricate certainty about your own past actions”)
Why it’s separate from the system prompt: Identity should be version-controlled and human-readable. It changes rarely (once per quarter), so any edits are intentional and reviewed.
Tells the agent which skills and playbooks to load for different message types.
Example:
## Routing
- Schedule/availability requests → tasks/scheduling.md
- Medication questions → tasks/medications.md
- New member setup → onboarding.md
- Urgent/emergency → tasks/escalations.md
Why routing is explicit: The agent shouldn’t guess which protocol to follow. If a message is about scheduling, it loads scheduling.md. If it’s about onboarding, it loads onboarding.md.
Loaded from: runtime/learning/capabilities.md
Format:
## CAN
- Generate SMS responses
- Apply family_file_updates (append/prepend/replace to existing sections)
- Flag needs_outreach (requests to contact other team members)
- Persist self_corrections to lessons.md

## CANNOT
- Directly text people (outreach is queued, not real-time)
- Access external systems (no API calls, no database queries)
- Make medical decisions (always defer to providers)
- See data outside your filtered context (access level enforcement)
Why capabilities are explicit: The agent needs gates, not guidelines. “You cannot make medical decisions” is a constraint, not a suggestion.
Loaded from: runtime/learning/skills/*.md
Examples:
  • onboarding.md — How to welcome new members and explain CareSupport
  • social.md — How to handle greetings, gratitude, apologies
  • scheduling.md — How to coordinate rides, appointments, coverage
Format:
# Skill: Onboarding New Members

## When to use
- Someone says "What is this?" or "Who are you?"
- First message from a new phone number (after routing.json is updated)

## How to respond
1. Welcome them by name
2. Explain what CareSupport does for this family
3. Set expectations ("I coordinate schedules, track medications, and keep the team connected")
4. Ask what they need help with

## Don't
- Don't give a generic pitch ("I'm an AI assistant...")
- Don't over-explain how the system works
Lessons come in two types:
  • Global lessons (runtime/learning/lessons.md) — corrections from all families
  • Family lessons (families/{id}/lessons.md) — corrections specific to this family
Format:
- [behavioral] When someone says "thanks", respond with "You're welcome" not "Happy to help!" (2026-02-27)
- [factual] Degitu's work is Downtown Minneapolis, not St. Paul (2026-02-26)
- [operational] Always populate needs_outreach when saying "I'll message [name]" (2026-02-28)
How lessons are created:
  1. User corrects the agent (“That’s wrong”, “Don’t say that again”)
  2. Agent captures it in self_corrections field of response
  3. System writes to lessons.md immediately
  4. Next message: agent sees the correction in its context
Max entries: 20 global, 30 per family (oldest entries are rotated out)
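The rotation policy can be sketched as a capped append. This is an outline under assumptions, not the real writer (`_persist_lessons` in sms_handler.py); the function name is hypothetical:

```python
def append_lesson(path, entry, max_entries=30):
    # Sketch of capped lesson persistence (name hypothetical; the real
    # writer is _persist_lessons in sms_handler.py). Once the cap is
    # reached, the oldest entries rotate out.
    try:
        with open(path, encoding="utf-8") as f:
            entries = [l for l in f.read().splitlines() if l.startswith("- ")]
    except FileNotFoundError:
        entries = []  # first lesson for this family
    entries.append(entry)
    del entries[:-max_entries]  # keep only the newest max_entries
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(entries) + "\n")
```

Rotating rather than truncating keeps the most recent corrections, which are the ones most likely to still be relevant.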
Loaded from: families/{id}/family.md, schedule.md, medications.md
Pre-filtered by role_filter.py (see Enforcement & Safety page)
What the agent sees:
  • Full-access members: Everything
  • Schedule+meds members: Schedule, medications, urgent notes (no insurance, no family-only discussions)
  • Schedule-only members: Schedule and urgent notes only
  • Limited members: Care recipient name and care team roster only
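The access tiers amount to an allow-list per level. A minimal sketch, assuming hypothetical section and level names (real enforcement lives in role_filter.py):

```python
# Hypothetical section names and access map mirroring the tiers above;
# the real enforcement lives in role_filter.py.
ACCESS_LEVELS = {
    "full": {"family", "schedule", "medications", "urgent", "insurance"},
    "schedule_meds": {"schedule", "medications", "urgent"},
    "schedule_only": {"schedule", "urgent"},
    "limited": {"roster"},
}

def filter_context(sections: dict, level: str) -> dict:
    # Drop any section the member's access level does not allow
    allowed = ACCESS_LEVELS[level]
    return {name: body for name, body in sections.items() if name in allowed}
```

The key property: filtering happens before the prompt is assembled, so a restricted member's context never contains the sensitive sections in the first place.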
Loaded from: families/{id}/members/{first_name}.md
Content:
  • Communication preferences (“Prefers texts over calls”)
  • Care responsibilities (“Primary driver for Tuesday appointments”)
  • Personal context (“Works downtown, flexible schedule”)
  • Interaction history (“2026-02-27: Requested Yada be added to team”)
Why member profiles matter: The agent can adapt its tone, reference past conversations, and know who to contact for what.
Loaded from: conversations/{phone}/{YYYY-MM}.log
Last 50 lines:
[2026-02-28 14:30:22 UTC] [INBOUND] Can someone take auntie to work tomorrow?
[2026-02-28 14:30:25 UTC] [OUTBOUND] I'll check with Solan and Roman about the 8am ride.
[2026-02-28 14:45:10 UTC] [INBOUND] Thanks
[2026-02-28 14:45:12 UTC] [OUTBOUND] You're welcome
Why 50 lines? Enough to maintain conversation continuity (“What did I ask about earlier?”) without overwhelming the context window.
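Loading the tail of the log is a one-liner in Python. A sketch with a hypothetical function name (the real loader sits in sms_handler.py's context assembly):

```python
def recent_history(path, n=50):
    # Sketch of the history loader (name hypothetical): return the most
    # recent n lines of the conversation log, or nothing for a new thread.
    try:
        with open(path, encoding="utf-8") as f:
            return f.read().splitlines()[-n:]
    except FileNotFoundError:
        return []  # first message from this number: no history yet
```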

Prompt Caching: 90% Cost Reduction

Anthropic’s prompt caching lets you mark sections of the system prompt as “cacheable”. If the cached prefix hasn’t changed, you pay 1/10th the cost to reload it.
Cached prefix (lasts 5 minutes):
  1. SOUL.md (identity) — never changes
  2. Routing + Capabilities + Skills — changes monthly
  3. Response format + channel guidance — never changes
  4. Lessons (global + family) — changes weekly
  5. Member identity + member profile — changes per member
Cache breakpoint: After block 5 (~5,000 tokens)
Dynamic suffix (fresh every call):
  6. Family context (filtered) — changes daily
  7. Current datetime — changes every call
Result: First message pays full cost. Every subsequent message within 5 minutes pays 10% of the cached portion.
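In Anthropic's Messages API, the breakpoint is expressed by attaching `cache_control` to the last stable block of the `system` parameter. A minimal sketch (the real assembly is build_system_blocks in prompt_builder.py; the function name here is illustrative):

```python
def cache_aware_system(cached_prefix: str, dynamic_suffix: str) -> list:
    # Sketch of cache-aware assembly (the real version is
    # build_system_blocks in prompt_builder.py): the stable prefix
    # carries a cache_control breakpoint, the suffix stays uncached.
    return [
        {"type": "text", "text": cached_prefix,
         "cache_control": {"type": "ephemeral"}},  # reused for ~5 minutes
        {"type": "text", "text": dynamic_suffix},  # fresh on every call
    ]
```

The returned list is passed as the `system` parameter of `client.messages.create(...)`; everything up to and including the `cache_control` block is eligible for cache reads.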

Learning System: Self-Corrections

When you correct the agent, it writes the correction to its own instruction files:
Your correction:
You: Don't call her "grandmother"—she's my aunt.
Agent response (internal structure):
{
  "sms_response": "Got it — I'll make sure to call her your aunt.",
  "self_corrections": [
    "[factual] Degitu is Liban's aunt, not grandmother (2026-02-28)"
  ]
}
What happens next:
  1. System writes correction to families/kano/lessons.md
  2. Correction is loaded into agent context on the next message
  3. Agent sees: “Degitu is Liban’s aunt, not grandmother”
  4. Agent never makes that mistake again (for this family)
[behavioral] — How to reason or respond
  • Example: “When someone says ‘thanks’, respond with ‘You’re welcome’ not ‘Happy to help!’”
[factual] — Care facts about this family
  • Example: “Degitu’s work is Downtown Minneapolis, not St. Paul”
[operational] — System behavior
  • Example: “Always populate needs_outreach when saying ‘I’ll message [name]’”
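The entry format shown above is simple enough to sketch as a formatter. The function name is hypothetical; the real writer is _persist_lessons in sms_handler.py:

```python
from datetime import date

VALID_CATEGORIES = ("behavioral", "factual", "operational")

def format_lesson(category, text, day=None):
    # Sketch of the lesson-entry format shown above (name hypothetical)
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unknown lesson category: {category}")
    day = day or date.today()
    return f"- [{category}] {text} ({day.isoformat()})"
```

Rejecting unknown categories keeps the lessons file machine-scannable: every line parses the same way.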
CareSupport has a staging system for testing corrections without mutating production data:
# 1. Lock baseline (your safety net)
python runtime/scripts/review_staging.py snapshot --family kano

# 2. Run tests (writes to staging/reviews/, doesn't touch live files)
python runtime/scripts/review_loop.py --since 3h --family kano --full --stage

# 3. Save interesting findings
python runtime/scripts/review_staging.py save --family kano --review {timestamp} --name "family-tree-confusion"

# 4. Reset to baseline + clear test output (saved items survive)
python runtime/scripts/review_staging.py reset --family kano

# 5. Promote approved corrections to production
python runtime/scripts/review_staging.py promote --family kano --review {timestamp} --items 0,1
Why staging exists: Without it, every test run writes to real files. Staging is a scratch pad—nothing touches production until you explicitly promote it.

Response Structure

The agent always responds with structured JSON (enforced by Anthropic’s response_format parameter):
{
  "sms_response": "I'll check with Solan and Roman about the 8am ride.",
  "internal_notes": "Requesting ride for Mon 8am. Solan typically available mornings. Roman has flexible schedule.",
  "needs_outreach": [
    {
      "phone": "+16514109390",
      "name": "Solan",
      "message": "Hi Solan, can you drive Degitu to work Monday at 8am?"
    },
    {
      "phone": "+16516214824",
      "name": "Roman",
      "message": "Hi Roman, can you drive Degitu to work Monday at 8am if Solan can't?"
    }
  ],
  "family_file_updates": [
    {
      "section": "schedule",
      "operation": "replace",
      "old_content": "- Mon: TBD",
      "content": "- Mon 8:00am: Pending confirmation (Solan or Roman)"
    }
  ],
  "self_corrections": [],
  "member_updates": [],
  "routing_updates": []
}
Why JSON? The handler needs to parse the response programmatically. Plain text responses would require regex parsing (brittle) or additional AI calls (expensive).
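A minimal sketch of that parse step, assuming the key names from the example above (the real handler does more, e.g. dispatching outreach and file updates):

```python
import json

# Key names taken from the example response above
REQUIRED_KEYS = {"sms_response", "internal_notes", "needs_outreach",
                 "family_file_updates", "self_corrections"}

def parse_agent_response(raw: str) -> dict:
    # Validate the structured reply before the handler acts on it
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"agent response missing keys: {sorted(missing)}")
    return data
```

Failing loudly on a missing key is deliberate: a half-parsed response that silently drops `needs_outreach` would look like the agent ignored a request.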

Source Reference

  • Intent routing: runtime/scripts/care_router.py (route function, fallback_chain)
  • Prompt builder: runtime/scripts/prompt_builder.py (build_system_blocks, cache-aware assembly)
  • AI generation: sms_handler.py:487-625 (generate_response for OpenRouter, _generate_response_anthropic for Anthropic)
  • Context assembly: sms_handler.py:289-397 (build_system_context, _channel_guidance)
  • Learning persistence: sms_handler.py:628-683 (_persist_lessons, _stage_corrections)
  • Skills directory: runtime/learning/skills/ (onboarding.md, social.md, scheduling.md)
  • Lessons: runtime/learning/lessons.md (global), families/{id}/lessons.md (per-family)
Want to see how the agent learns? Read sms_handler.py:628-683 (_persist_lessons) to see how corrections flow from self_corrections → lessons.md → next message context.