ClinicalPilot’s Human-in-the-Loop (HITL) workflow allows clinicians to provide feedback on AI-generated SOAP notes and trigger re-analysis with up to 1 iteration of the debate pipeline.

Overview

Iterative Refinement

Doctors edit SOAP notes and add feedback → system re-runs analysis

Max 1 Re-Debate

Prevents infinite loops — only 1 re-analysis per case allowed

Full Context Preservation

Original patient data + doctor feedback merged for re-analysis

Audit Trail

All edits and feedback logged for compliance and learning

Workflow

1

Initial AI Analysis

Submit case via /api/analyze → receive SOAP note
import requests

response = requests.post(
    "https://api.clinicalpilot.ai/api/analyze",
    json={"text": "68yo male with chest pain, troponin 1.2..."}
)
response.raise_for_status()
soap = response.json()["soap"]
2

Doctor Reviews Output

Clinician reads the SOAP note and identifies:
  • Missing differentials
  • Incorrect assessments
  • Overlooked safety concerns
  • Desired plan modifications
3

Doctor Provides Feedback

Two input methods, which can be combined: editing the SOAP text directly, or adding free-text feedback (shown in the next step).
Editing the SOAP text directly:
{
  "edited_soap": "Assessment: Top differential is NSTEMI, not unstable angina. Patient has elevated troponin (1.2) and dynamic ECG changes. Risk factors: HTN, DM2, smoking.",
  "original_text": "68yo male with chest pain, troponin 1.2...",
  "feedback": ""
}
4

Submit Feedback

POST to /api/human-feedback:
curl -X POST https://api.clinicalpilot.ai/api/human-feedback \
  -H "Content-Type: application/json" \
  -d '{
    "edited_soap": "...",
    "original_text": "68yo male with chest pain, troponin 1.2...",
    "feedback": "Consider NSTEMI over unstable angina..."
  }'
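The same request from a Python client (a sketch; `build_feedback_payload` is a hypothetical helper that mirrors the server-side validation, and the actual POST is shown commented out since it requires a live endpoint):

```python
import json

def build_feedback_payload(original_text, edited_soap="", feedback=""):
    """Mirror the server-side check: at least one of edited_soap/feedback."""
    if not edited_soap and not feedback:
        raise ValueError("Must provide edited_soap or feedback")
    return {
        "edited_soap": edited_soap,
        "original_text": original_text,
        "feedback": feedback,
    }

payload = build_feedback_payload(
    original_text="68yo male with chest pain, troponin 1.2...",
    feedback="Consider NSTEMI over unstable angina...",
)
# POST it exactly like the curl example:
# requests.post("https://api.clinicalpilot.ai/api/human-feedback", json=payload)
print(json.dumps(payload, indent=2))
```

Validating the payload client-side avoids a round trip just to receive the 400 the server would return for an empty submission.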
5

System Re-Analyzes

ClinicalPilot merges original case + feedback:
backend/main.py:278
# Re-run with feedback as additional context
enhanced_text = f"{original_text}\n\n---\nDoctor Feedback: {feedback}\nEdited SOAP: {edited_soap}"
request = AnalysisRequest(text=enhanced_text)
soap, debate_state = await full_pipeline(request)
The full debate pipeline runs again:
  • Clinical Agent sees doctor’s corrections
  • Literature Agent searches based on feedback keywords
  • Safety Agent re-checks with new considerations
  • Critic Agent reconciles AI + human input
6

Return Updated SOAP

Response includes:
  • New SOAP note (incorporating feedback)
  • Updated debate summary
  • Note indicating re-analysis completed
{
  "soap": { ... },
  "debate": { ... },
  "note": "Re-analysis with human feedback completed"
}

API Reference

Endpoint

POST /api/human-feedback
Submit doctor feedback and trigger re-analysis

Request Body

{
  "edited_soap": "string",      // Optional: Doctor's edited SOAP text
  "original_text": "string",    // Required: Original patient data submitted to /api/analyze
  "feedback": "string"          // Optional: Doctor's commentary/corrections
}
edited_soap (string, optional)
The clinician’s manually edited SOAP note. If provided, the re-analysis will prioritize this version.
original_text (string, required)
The original clinical input text from the initial /api/analyze request. This ensures the system has full patient context.
feedback (string, optional)
Structured feedback from the clinician (e.g., “Missing differential: aortic dissection. Add CT angio to plan.”). This is appended to the prompt for re-analysis.

Response

{
  "soap": {
    "subjective": "...",
    "objective": "...",
    "assessment": "...",
    "plan": "...",
    "differentials": [...],
    "safety_flags": [...]
  },
  "debate": {
    "round_number": 3,
    "final_consensus": true,
    "clinical_outputs": [...],
    "critic_outputs": [...]
  },
  "note": "Re-analysis with human feedback completed"
}
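A client might surface the re-analysis note and any safety flags like this (a sketch against the response shape above; `summarize_response` is an illustrative helper, not part of the API):

```python
def summarize_response(resp: dict) -> str:
    """Build a one-line summary from the /api/human-feedback response."""
    flags = resp.get("soap", {}).get("safety_flags", [])
    note = resp.get("note", "")
    return f"{note} | {len(flags)} safety flag(s)"

# Example response trimmed to the fields the helper reads:
example = {
    "soap": {"safety_flags": ["Hold anticoagulation until dissection ruled out"]},
    "debate": {"final_consensus": True},
    "note": "Re-analysis with human feedback completed",
}
print(summarize_response(example))
# → Re-analysis with human feedback completed | 1 safety flag(s)
```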

Code Implementation

backend/main.py:262
@app.post("/api/human-feedback", response_model=dict)
async def human_feedback(payload: dict):
    """
    Human-in-the-loop: doctor edits SOAP → triggers re-debate (max 1 iteration).
    """
    edited_soap = payload.get("edited_soap", "")
    original_text = payload.get("original_text", "")
    feedback = payload.get("feedback", "")

    if not edited_soap and not feedback:
        raise HTTPException(400, "Must provide edited_soap or feedback")

    try:
        from backend.agents.orchestrator import full_pipeline
        from backend.models.patient import AnalysisRequest

        # Re-run with feedback as additional context
        enhanced_text = f"{original_text}\n\n---\nDoctor Feedback: {feedback}\nEdited SOAP: {edited_soap}"
        request = AnalysisRequest(text=enhanced_text)
        soap, debate_state = await full_pipeline(request)

        return {
            "soap": soap.model_dump(),
            "debate": debate_state.model_dump(),
            "note": "Re-analysis with human feedback completed",
        }
    except Exception as e:
        logger.exception("Human feedback re-analysis failed")
        raise HTTPException(500, str(e))
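The endpoint currently accepts a raw `dict`; a typed request model would push that validation into the schema layer. A sketch using a stdlib dataclass (in FastAPI the idiomatic choice would be a Pydantic model with the same fields; this is not the shipped code):

```python
from dataclasses import dataclass

@dataclass
class HumanFeedbackRequest:
    original_text: str      # required: original /api/analyze input
    edited_soap: str = ""   # optional: doctor's edited SOAP
    feedback: str = ""      # optional: free-text corrections

    def __post_init__(self):
        # Same rules the endpoint enforces by hand today.
        if not self.original_text:
            raise ValueError("original_text is required")
        if not self.edited_soap and not self.feedback:
            raise ValueError("Must provide edited_soap or feedback")

req = HumanFeedbackRequest(
    original_text="68yo male with chest pain, troponin 1.2...",
    feedback="Consider NSTEMI over unstable angina.",
)
print(req.feedback)
```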

Use Cases

Scenario: AI outputs differential for chest pain but misses aortic dissection.
Doctor Action:
{
  "feedback": "Patient has back pain radiating to shoulders — consider aortic dissection. Add CT angio chest to rule out before starting anticoagulation."
}
Re-Analysis Output:
  • Aortic dissection added to differentials
  • Plan updated: “CT angio chest to r/o dissection before heparin”
  • Safety flag: “Hold anticoagulation until dissection ruled out”
Scenario: AI classifies NSTEMI as low-risk (HEART score 3), but patient has 3-vessel CAD.
Doctor Action:
{
  "feedback": "Patient has known 3-vessel CAD from prior cath (2024-01-15). This is high-risk ACS, not low-risk. Recommend cath, not stress test."
}
Re-Analysis Output:
  • Risk stratification corrected to high-risk
  • Plan changed from “Outpatient stress test” to “Admit, cardiology consult, likely cath within 24h”
Scenario: AI plan includes metformin, but patient has acute kidney injury (Cr 3.2).
Doctor Action:
{
  "feedback": "Patient has AKI (Cr 3.2, baseline 1.1). Metformin is contraindicated. Hold metformin, start insulin sliding scale instead."
}
Re-Analysis Output:
  • Safety Agent flags metformin contraindication
  • Plan updated: “Hold metformin. Start insulin SSI. Recheck Cr in 24h.”
Scenario: AI recommends older treatment protocol (e.g., 2019 sepsis guidelines), but institution uses updated 2023 protocol.
Doctor Action:
{
  "feedback": "Use 2023 Surviving Sepsis Campaign guidelines: 1-hour bundle, lactate-guided resuscitation, early vasopressors if MAP <65 after 30mL/kg bolus."
}
Re-Analysis Output:
  • Plan updated to reflect 2023 guidelines
  • Citations updated to reference 2023 Surviving Sepsis Campaign

Design Rationale

Why Max 1 Re-Debate?

The system limits re-analysis to 1 iteration to:
  1. Prevent Infinite Loops: Without a cap, doctors could repeatedly re-submit feedback, causing exponential LLM calls.
  2. Encourage Finalization: After 1 re-analysis, the doctor should finalize the SOAP manually if still unsatisfied.
  3. Cost Control: Each full pipeline run costs ~$0.50-$1.00 in LLM API calls. Unlimited iterations would be prohibitively expensive.
Future Enhancement: Allow configurable max_iterations per user role (e.g., attending = 2, resident = 1).

How Feedback Is Merged

The system appends feedback to the original text:
enhanced_text = f"{original_text}\n\n---\nDoctor Feedback: {feedback}\nEdited SOAP: {edited_soap}"
This ensures:
  • Full Context: Agents see original patient data + doctor’s corrections
  • Clear Delineation: Separator (---) marks where human input begins
  • Prompt Engineering: Agents are trained to weigh human feedback heavily in debate rounds
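Because the separator is fixed, downstream tooling can recover the two halves deterministically (a small sketch; splitting on the separator is an illustration, not something the agents are documented to do):

```python
# Rebuild an enhanced prompt in the documented format, then split it back.
enhanced_text = (
    "68yo male with chest pain, troponin 1.2..."
    "\n\n---\n"
    "Doctor Feedback: Consider NSTEMI.\nEdited SOAP: Assessment: NSTEMI."
)
original, _, human_input = enhanced_text.partition("\n\n---\n")
print(original)      # patient data only
print(human_input)   # doctor feedback + edited SOAP
```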

Frontend Integration

The ClinicalPilot frontend provides a SOAP Editor for HITL:
frontend/index.html (excerpt)
function submitFeedback() {
  const editedSOAP = document.getElementById('soap-editor').value;
  const feedback = document.getElementById('feedback-textarea').value;
  const originalText = sessionStorage.getItem('original_case_text');

  fetch('/api/human-feedback', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      edited_soap: editedSOAP,
      original_text: originalText,
      feedback: feedback
    })
  })
  .then(res => res.json())
  .then(data => {
    // Display updated SOAP
    renderSOAP(data.soap);
    showNotification('Re-analysis complete — review updated SOAP note');
  });
}

UI Flow

1

Initial SOAP Display

After analysis completes, SOAP is shown with an “Edit SOAP” button.
2

Doctor Clicks Edit

SOAP text becomes editable in a <textarea>. A separate “Feedback” field appears for commentary.
3

Submit Feedback

Doctor clicks “Re-Analyze with Feedback” → triggers /api/human-feedback.
4

Loading State

UI shows spinner: “Re-running analysis with your feedback… (~100s)”
5

Updated SOAP Displayed

New SOAP replaces old version. Banner indicates: “Updated based on your feedback.”

Audit Trail

For compliance (HIPAA, medico-legal), all HITL interactions should be logged:
# Pseudocode for future audit logging
import logging

audit_logger = logging.getLogger("audit")

@app.post("/api/human-feedback")
async def human_feedback(payload: dict, user: User = Depends(get_current_user)):
    case_id = payload.get("case_id", "unknown")  # assumed client-supplied
    audit_logger.info(
        f"HITL Re-Analysis | User: {user.email} | "
        f"Case ID: {case_id} | Feedback: {(payload.get('feedback') or '')[:100]}..."
    )
    # ... re-run pipeline ...
    audit_logger.info(
        f"HITL Complete | Case ID: {case_id} | Updated SOAP saved"
    )
Production Requirement: Implement audit logging before deploying to clinical environments. Logs should capture:
  • User ID
  • Timestamp
  • Original SOAP
  • Edited SOAP
  • Feedback text
  • Re-analysis output
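A minimal audit record covering those fields might look like this (a sketch; the field names and JSON-lines format are assumptions, not a shipped schema):

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class HITLAuditRecord:
    user_id: str
    case_id: str
    original_soap: str
    edited_soap: str
    feedback: str
    reanalysis_output: str
    timestamp: str = ""

    def to_json_line(self) -> str:
        d = asdict(self)
        # Stamp at write time if the caller didn't supply one.
        d["timestamp"] = d["timestamp"] or datetime.now(timezone.utc).isoformat()
        return json.dumps(d)

record = HITLAuditRecord(
    user_id="dr.smith@hospital.org",
    case_id="case-123",
    original_soap="Assessment: unstable angina...",
    edited_soap="Assessment: NSTEMI...",
    feedback="Troponin 1.2, dynamic ECG changes",
    reanalysis_output="Re-analysis with human feedback completed",
)
print(record.to_json_line())
```

One JSON line per event keeps the log append-only and easy to ship to an external audit store.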

Performance Considerations

Latency

Step | Duration | Notes
Submit feedback | ~50ms | API call
Re-run full pipeline | ~100s | Same as initial analysis
Update frontend | ~200ms | Render new SOAP
Total | ~100s | Same cost as initial run
Cost: Each HITL re-analysis costs the same as the initial analysis (~$0.50-$1.00 in LLM API fees).

Optimization Ideas

Delta Re-Analysis

Only re-run agents affected by feedback (e.g., if feedback is about differentials, skip Safety Agent)

Caching

Cache Literature Agent PubMed results if feedback doesn’t change search queries

Async Notification

Return immediately, send email/SMS when re-analysis completes (for long cases)

Partial SOAP Update

Allow doctors to flag specific sections for re-generation (“Re-generate Plan only”)
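The caching idea, for instance, could start as a simple memo keyed by the normalized search query (a sketch; `search_pubmed` is a hypothetical stand-in for the Literature Agent's real lookup):

```python
from functools import lru_cache

CALLS = 0  # count upstream lookups to demonstrate the cache

@lru_cache(maxsize=256)
def search_pubmed(query: str) -> tuple:
    """Hypothetical stand-in for the Literature Agent's PubMed lookup."""
    global CALLS
    CALLS += 1
    return (f"results for: {query}",)  # placeholder payload

def literature_search(query: str) -> tuple:
    # Normalize so trivial feedback wording changes still hit the cache.
    return search_pubmed(" ".join(query.lower().split()))

literature_search("NSTEMI  management")
literature_search("nstemi management")  # cache hit: no second lookup
print(CALLS)  # → 1
```

Normalizing before caching matters here: doctor feedback often rephrases the same clinical question, and an exact-string cache would miss those repeats.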

Best Practices

1

Be Specific in Feedback

✅ Good: “Add CT angio to rule out aortic dissection — patient has back pain radiating to shoulders”
❌ Bad: “Plan is incomplete”
2

Reference Clinical Data

Cite labs, vitals, or exam findings:
✅ “Troponin is 1.2 (not 0.12) — this is NSTEMI, not unstable angina”
3

Suggest Evidence-Based Changes

Reference guidelines when correcting:
✅ “Per 2023 AHA STEMI guidelines, door-to-balloon should be <90 min, not <120 min”
4

Use Edited SOAP for Major Rewrites

If >50% of SOAP needs changes, edit the SOAP directly rather than writing long feedback.

Limitations

Known Constraints:
  1. Max 1 Iteration: After 1 re-analysis, further changes require manual SOAP editing (no more AI re-runs).
  2. No Session Persistence: If user refreshes the page, original case text must be re-entered.
  3. Feedback Format: Currently free text. Future: structured feedback fields (“Add differential”, “Correct plan”).
  4. No Multi-User Collaboration: If 2 doctors edit the same case, last submission wins (no merge conflict resolution).

Future Enhancements

Structured Feedback Forms

Guided UI: “Add differential”, “Flag safety issue”, “Correct lab value”

Version History

Track all SOAP versions (v1 = AI, v2 = after feedback, v3 = manual edits)

Multi-User Review

Allow attending + resident to both provide feedback → system reconciles

Reinforcement Learning

Use doctor corrections to fine-tune Clinical Agent (RLHF)
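The version-history idea above, for example, could be tracked as an append-only list per case (a sketch of the v1/v2/v3 scheme; class and field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class SOAPVersion:
    version: int
    source: str     # "ai", "feedback", or "manual"
    soap_text: str

@dataclass
class CaseHistory:
    versions: list = field(default_factory=list)

    def add(self, source: str, soap_text: str) -> SOAPVersion:
        v = SOAPVersion(len(self.versions) + 1, source, soap_text)
        self.versions.append(v)
        return v

history = CaseHistory()
history.add("ai", "Assessment: unstable angina...")       # v1 = AI output
history.add("feedback", "Assessment: NSTEMI...")          # v2 = after feedback
print([(v.version, v.source) for v in history.versions])
# → [(1, 'ai'), (2, 'feedback')]
```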

Next Steps

Full Analysis API

Learn how the multi-agent debate pipeline works

Emergency Mode

Fast-path triage for time-critical cases
