Why ClinicalPilot Exists

Most clinical AI tools are a single LLM call with a long prompt. That works for demos, not for patients. ClinicalPilot runs three specialized agents — Clinical, Literature, Safety — through a multi-round debate with an adversarial Critic. They argue. They cite evidence. They disagree. After 2-3 rounds, a Synthesizer merges their output into a structured SOAP note, and a Medical Error Prevention Panel runs in parallel catching drug interactions, dosing issues, and contraindications.
The result: fewer hallucinations, broader differential diagnoses, real PubMed citations, and safety alerts that a single model call is likely to miss.

Core Features

Multi-Agent Debate

Three specialist agents and a Critic engage in 2-3 adversarial rounds, ending in either consensus or a flag for human review

Emergency Triage

Bypasses debate entirely — ESI scoring in under 5 seconds with immediate action cards

Medical Error Prevention

Drug-drug interactions, contraindications, renal/hepatic dosing alerts, pregnancy/pediatric/elderly flags

FHIR R4 + EHR Upload

Drop in FHIR bundles, PDFs, CSVs, or type free-text clinical notes

PHI Anonymization

Microsoft Presidio scrubs protected health information before anything hits an LLM

RAG Pipeline

LanceDB vector store for medical literature, PubMed E-utilities for live citations

AI Chat

Groq-powered conversational assistant (Llama 3.3 70B) for quick clinical Q&A — sub-second responses

Human-in-the-Loop

Doctor edits feed back into the debate engine for re-analysis

How It Works

The full analysis pipeline makes ~14 LLM calls and takes roughly 100-120 seconds. For urgent cases, use Emergency Mode, which targets responses in under 5 seconds.
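
The debate loop described above can be sketched roughly as follows. This is a minimal illustration, not ClinicalPilot's actual code: `run_debate`, `DebateResult`, and the agent and critic callables are hypothetical names, and real agents would wrap LLM calls rather than plain functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signatures (assumptions, not the project's API):
# an agent maps (case text, critiques from the previous round) -> opinion,
# a critic maps the agents' opinions -> list of objections (empty = consensus).
Agent = Callable[[str, List[str]], str]
Critic = Callable[[Dict[str, str]], List[str]]


@dataclass
class DebateResult:
    opinions: Dict[str, str]
    rounds_run: int
    needs_human_review: bool
    llm_calls: int


def run_debate(case: str, agents: Dict[str, Agent], critic: Critic,
               max_rounds: int = 3) -> DebateResult:
    critiques: List[str] = []
    opinions: Dict[str, str] = {}
    calls = 0
    for round_no in range(1, max_rounds + 1):
        # Each specialist answers, seeing the Critic's last objections.
        opinions = {name: agent(case, critiques) for name, agent in agents.items()}
        calls += len(agents)
        # The Critic reviews all opinions; no objections means consensus.
        critiques = critic(opinions)
        calls += 1
        if not critiques:
            return DebateResult(opinions, round_no, False, calls)
    # Objections remain after the last round: flag for human review.
    return DebateResult(opinions, max_rounds, True, calls)
```

With three specialists and one Critic, each round costs four calls in this sketch, so three full rounds plus one synthesis call come to 13, in line with the ~14 figure quoted above.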

Tech Stack

Backend

  • Python 3.10+
  • FastAPI + uvicorn
  • Async architecture

AI & Agents

  • GPT-4o / GPT-4o-mini
  • Groq Llama 3.3 70B
  • MedGemma (optional)

Data & Storage

  • LanceDB vector store
  • PubMed E-utilities
  • DrugBank + RxNorm + openFDA

Safety & Privacy

  • Microsoft Presidio
  • spaCy en_core_web_lg
  • PHI anonymization

Frontend

  • React 18 (CDN)
  • Tailwind CSS
  • Zero build step

Observability

  • LangSmith tracing
  • Token tracking
  • Latency breakdown

What You Get

  • Free-text or voice input
  • FHIR/CSV upload with sample data buttons
  • Real-time WebSocket pipeline stages
  • Full SOAP report with PDF export
  • Doctor feedback loop

Quick Example

Here’s what happens when you analyze a clinical case:

1. Input Patient Data

45-year-old male presenting with acute chest pain radiating to left arm,
diaphoresis, and shortness of breath. History of hypertension and type 2
diabetes. Current medications: metformin 1000mg BID, lisinopril 20mg daily.

2. Debate Process

  • Clinical Agent: Proposes differential diagnoses (AMI, unstable angina, PE)
  • Literature Agent: Searches PubMed for STEMI guidelines and diabetes cardiac risk
  • Safety Agent: Flags metformin in acute coronary syndrome
  • Critic: Challenges differential completeness, requests risk stratification
  • Round 2: Agents refine with TIMI score, troponin interpretation
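
As an illustration of the Literature Agent's PubMed step, here is a minimal sketch that builds an E-utilities ESearch request URL. The endpoint and the `db`, `term`, `retmode`, and `retmax` parameters belong to NCBI's public E-utilities API; the helper name `build_pubmed_search_url` and the way terms are combined are assumptions, not ClinicalPilot's actual query logic.

```python
from typing import List
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(terms: List[str], max_results: int = 5) -> str:
    """Build an ESearch URL that ANDs the given terms (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": " AND ".join(terms),
        "retmode": "json",
        "retmax": max_results,
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_pubmed_search_url(["STEMI", "diabetes"], max_results=3)
```

Fetching this URL returns a JSON payload of matching PubMed IDs, which a later step could resolve into full citations.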

3. Output SOAP Note

{
  "subjective": "45M presenting with acute chest pain...",
  "objective": "Vital signs pending. Clinical exam suggests ACS...",
  "assessment": [
    {"diagnosis": "STEMI", "confidence": "high", "icd10": "I21.3"},
    {"diagnosis": "Unstable Angina", "confidence": "medium", "icd10": "I20.0"},
    {"diagnosis": "Pulmonary Embolism", "confidence": "low", "icd10": "I26.9"}
  ],
  "plan": "Immediate: STEMI protocol activation, ASA 325mg, O2..."
}
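
A client consuming this output might load it into typed objects. The sketch below assumes the JSON shape shown above; the class names `Diagnosis` and `SoapNote`, the function `parse_soap_note`, and the confidence ordering are hypothetical and not part of a documented ClinicalPilot SDK.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Diagnosis:
    diagnosis: str
    confidence: str
    icd10: str


@dataclass
class SoapNote:
    subjective: str
    objective: str
    assessment: List[Diagnosis]
    plan: str


# Assumed ordering of the confidence labels seen in the example output.
CONFIDENCE_ORDER = {"high": 0, "medium": 1, "low": 2}


def parse_soap_note(raw: str) -> SoapNote:
    """Parse a SOAP note JSON string and sort diagnoses by confidence."""
    data = json.loads(raw)
    assessment = [Diagnosis(**item) for item in data["assessment"]]
    # Unknown labels sort after all known ones.
    assessment.sort(
        key=lambda d: CONFIDENCE_ORDER.get(d.confidence, len(CONFIDENCE_ORDER))
    )
    return SoapNote(data["subjective"], data["objective"], assessment, data["plan"])
```

Sorting by confidence keeps the most likely diagnosis first regardless of the order in which the Synthesizer emitted the entries.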

4. Safety Alerts

  • Drug-Drug: Lisinopril + high-dose potassium → hyperkalemia risk
  • Dosing: Metformin — hold in acute illness (lactic acidosis risk)
  • Population: Diabetes increases cardiac event mortality 2-3x
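
To show how a drug-drug alert like the first one could be produced, here is a toy pairwise lookup. It is only a sketch: the single-entry `INTERACTIONS` table and the function `find_interactions` are hypothetical stand-ins for the DrugBank, RxNorm, and openFDA lookups the real panel is described as using.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

# Hypothetical in-memory table; the only entry mirrors the alert shown above.
INTERACTIONS: Dict[FrozenSet[str], str] = {
    frozenset({"lisinopril", "potassium chloride"}): "hyperkalemia risk",
}


def find_interactions(medications: List[str]) -> List[str]:
    """Return an alert string for every known interacting pair."""
    # Normalize names so lookups ignore case and stray whitespace.
    normalized = sorted({m.strip().lower() for m in medications})
    alerts = []
    for a, b in combinations(normalized, 2):
        note = INTERACTIONS.get(frozenset({a, b}))
        if note:
            alerts.append(f"Drug-Drug: {a} + {b} -> {note}")
    return alerts
```

Using a `frozenset` as the key makes the lookup order-independent, so `lisinopril + potassium chloride` and the reverse pair hit the same entry.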

Performance Metrics

Based on smoke test results:

| Metric | Value |
|---|---|
| Full Analysis Time | ~100-120 seconds |
| Emergency Mode | <5 seconds |
| LLM Calls per Analysis | ~14 (3 rounds × 4 agents + synthesis) |
| Differential Diagnoses | 2-4 per case |
| PubMed Citations | 3-5 per case |
| Safety Flags | 1-3 per case |

Ready to get started? Head to the Quickstart to run your first analysis in under 5 minutes.

Use Cases

Use Emergency Mode for rapid ESI scoring and red flag identification. The system bypasses the debate engine and returns actionable guidance in under 5 seconds.
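
For orientation, the decision points of the published Emergency Severity Index (ESI) can be sketched as a plain function. This is a simplified illustration under stated assumptions, not ClinicalPilot's triage logic: the function name `esi_level` and its boolean inputs are hypothetical, and the clinical judgments they stand for would come from the model or a clinician.

```python
def esi_level(needs_lifesaving_intervention: bool,
              high_risk_or_altered_or_severe_distress: bool,
              expected_resources: int,
              danger_zone_vitals: bool = False) -> int:
    """Return an ESI level from 1 (most urgent) to 5 (least urgent)."""
    if expected_resources < 0:
        raise ValueError("expected_resources must be non-negative")
    # Decision point A: immediate life-saving intervention required.
    if needs_lifesaving_intervention:
        return 1
    # Decision point B: high-risk situation, altered mental status,
    # or severe pain/distress.
    if high_risk_or_altered_or_severe_distress:
        return 2
    # Decision point C: how many different resources are expected.
    if expected_resources == 0:
        return 5
    if expected_resources == 1:
        return 4
    # Decision point D: with abnormal vitals the ESI handbook says to
    # *consider* up-triage; this sketch always upgrades (a simplification).
    return 2 if danger_zone_vitals else 3
```

Because the function is a handful of comparisons, a sub-5-second budget is dominated by whatever step produces its inputs, which is presumably why Emergency Mode skips the debate.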

The multi-agent debate process excels at generating comprehensive differential diagnoses by combining clinical reasoning, literature review, and safety analysis.

The Medical Error Prevention Panel runs in parallel on every case, checking for:
  • Drug-drug interactions (DrugBank + RxNorm + openFDA)
  • Drug-disease contraindications
  • Renal/hepatic dosing adjustments
  • Pregnancy, pediatric, and elderly population flags

The debate process exposes the reasoning of each agent, making it valuable for:
  • Medical student case reviews
  • Resident training on differential diagnosis
  • Understanding evidence-based medicine workflows

Upload FHIR R4 bundles, PDFs, or CSV exports from any EHR system. The system handles:
  • FHIR resource parsing (Patient, Condition, Observation, MedicationRequest)
  • PDF text extraction
  • CSV structured data parsing
  • PHI anonymization before LLM processing
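
The FHIR parsing step can be pictured with a short sketch that groups a bundle's entries by resource type. The `Bundle`, `entry`, `resource`, and `resourceType` fields follow the FHIR R4 JSON format; the function name `group_bundle_resources` and the decision to skip other resource types are assumptions, not ClinicalPilot's actual parser.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List

# The four resource types listed above.
SUPPORTED = ("Patient", "Condition", "Observation", "MedicationRequest")


def group_bundle_resources(bundle_json: str) -> Dict[str, List[Dict[str, Any]]]:
    """Group supported resources in a FHIR Bundle by their resourceType."""
    bundle = json.loads(bundle_json)
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        rtype = resource.get("resourceType")
        # Unsupported resource types are ignored in this sketch.
        if rtype in SUPPORTED:
            grouped[rtype].append(resource)
    return dict(grouped)
```

In a pipeline like the one described, the grouped resources would still pass through PHI anonymization before any text reaches an LLM.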

Next Steps

Quickstart

Get a working SOAP note analysis in 5 minutes

Installation

Complete setup guide with API keys and dependencies

Architecture

Deep dive into the multi-agent debate system

API Reference

Explore all endpoints and WebSocket streaming
