
Overview

The /api/chat endpoint provides conversational AI assistance for clinical questions:
  • Evidence-based responses with guideline citations
  • Conversation history support for multi-turn dialogue
  • Automatic provider fallback (OpenAI → Groq)
  • Optimized for clinical decision support, not patient-facing chat
Chat uses a lightweight single-LLM approach without multi-agent debate. For complex case analysis, use the analyze endpoint.

Endpoint

POST http://localhost:8000/api/chat

Request Body

messages
array
required
Array of message objects representing the conversation history

Message Object Schema

role
string
required
Message role: user or assistant
content
string
required
Message content (the actual text)
The system prompt is automatically prepended to all conversations. You only need to include user and assistant messages.
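The request body can be assembled and validated client-side before posting; a minimal Python sketch (the `build_chat_request` helper is illustrative, not part of the API):

```python
# Illustrative helper for assembling the /api/chat request body.
# Only "user" and "assistant" messages are included; the server
# prepends the system prompt itself.

VALID_ROLES = {"user", "assistant"}

def build_chat_request(history):
    """Validate a list of (role, content) pairs and return the JSON body."""
    messages = []
    for role, content in history:
        if role not in VALID_ROLES:
            raise ValueError(f"invalid role: {role!r}")
        messages.append({"role": role, "content": content})
    if not messages:
        raise ValueError("No messages provided")
    return {"messages": messages}

body = build_chat_request([("user", "What is the HEART score?")])
```

Rejecting empty histories and unknown roles locally avoids a round-trip that would end in a 400.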

Response

reply
string
AI assistant’s response text
model
string
LLM model used (e.g., “gpt-4o”, “llama-3.3-70b-versatile”)
provider
string
Provider used: openai or groq
latency_ms
integer
Response time in milliseconds
tokens
integer
Total tokens consumed (prompt + completion)
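The response fields above map cleanly onto a small typed wrapper; a sketch (field names follow the schema, the `ChatReply` class itself is an assumption):

```python
from dataclasses import dataclass

@dataclass
class ChatReply:
    """Typed view of an /api/chat response body."""
    reply: str
    model: str
    provider: str      # "openai" or "groq"
    latency_ms: int
    tokens: int        # prompt + completion

    @classmethod
    def from_json(cls, data: dict) -> "ChatReply":
        fields = ("reply", "model", "provider", "latency_ms", "tokens")
        return cls(**{f: data[f] for f in fields})

sample = {"reply": "The HEART Score is...", "model": "gpt-4o",
          "provider": "openai", "latency_ms": 1820, "tokens": 524}
r = ChatReply.from_json(sample)
```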

System Prompt

The chat endpoint automatically uses the following system prompt:
You are ClinicalPilot AI, a clinical decision-support assistant for healthcare professionals.

Your role:
- Answer clinical questions with evidence-based, concise responses.
- Help with differential diagnoses, drug interactions, guideline lookups, lab interpretation, and clinical reasoning.
- Always cite relevant guidelines (e.g., ACC/AHA, WHO, UpToDate) when applicable.
- If a question involves patient safety, flag it clearly.
- Use structured formatting (bullet points, numbered lists) for clarity.
- If you are unsure, say so — never fabricate clinical information.

You are NOT a replacement for clinical judgment. Always remind users that your answers are for educational/decision-support purposes only.

Example: Single-Turn Conversation

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the HEART score and when should I use it?"
      }
    ]
  }'
Response:
{
  "reply": "The **HEART Score** is a clinical decision tool for risk-stratifying patients presenting with chest pain in the emergency department.\n\n**Components (0-2 points each):**\n- **H**istory: Highly suspicious (2), moderately suspicious (1), slightly suspicious (0)\n- **E**CG: Significant ST-depression (2), non-specific repolarization (1), normal (0)\n- **A**ge: ≥65 years (2), 45-64 years (1), <45 years (0)\n- **R**isk factors: ≥3 CV risk factors (2), 1-2 risk factors (1), none (0)\n- **T**roponin: ≥3x normal (2), 1-3x normal (1), normal (0)\n\n**Score Interpretation:**\n- **0-3**: Low risk (~1-2% MACE at 6 weeks) — consider discharge with outpatient follow-up\n- **4-6**: Moderate risk (~12-15% MACE) — admit for observation, serial troponins, stress test\n- **7-10**: High risk (~50-65% MACE) — admit, cardiology consult, likely catheterization\n\n**When to Use:**\n- ED patients with chest pain suspicious for ACS\n- After initial ECG and troponin\n- To guide disposition (discharge vs. admit vs. catheterization)\n\n**Reference:** Six et al., *Netherlands Heart Journal* 2008 (PMID: 18665203); validated in multiple cohorts.\n\n⚠️ **Safety Note:** The HEART score should NOT be used in STEMI (ST-elevation on ECG) — these patients require immediate catheterization regardless of score.",
  "model": "gpt-4o",
  "provider": "openai",
  "latency_ms": 1820,
  "tokens": 524
}

Example: Multi-Turn Conversation

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the HEART score and when should I use it?"
      },
      {
        "role": "assistant",
        "content": "The HEART Score is a clinical decision tool for risk-stratifying patients presenting with chest pain in the emergency department..."
      },
      {
        "role": "user",
        "content": "How does it compare to the TIMI score?"
      }
    ]
  }'
Response:
{
  "reply": "Great question — the **HEART** and **TIMI** scores serve similar purposes but have key differences:\n\n**HEART Score:**\n- **Population:** ED patients with *undifferentiated chest pain*\n- **Timing:** Use early (after initial ECG + first troponin)\n- **Goal:** Decide who can be safely discharged vs. admitted\n- **Advantages:** Simpler (5 variables), includes clinical gestalt (\"History\"), validated for ED triage\n\n**TIMI Score (for NSTE-ACS):**\n- **Population:** Patients with *confirmed NSTEMI/unstable angina*\n- **Timing:** Use after diagnosis established\n- **Goal:** Predict 14-day risk of death/MI/urgent revascularization\n- **Advantages:** More granular (7 variables), guides treatment intensity (e.g., early invasive strategy for TIMI ≥3)\n\n**Clinical Use:**\n- Use **HEART** for *frontline triage* in the ED (\"Can I discharge this patient?\")\n- Use **TIMI** for *risk stratification after ACS diagnosis* (\"How aggressive should my treatment be?\")\n\n**Example:** A patient with chest pain and normal troponin → use HEART to decide disposition. If troponin comes back elevated (NSTEMI) → use TIMI to guide antiplatelet therapy, timing of catheterization, etc.\n\n**References:**\n- HEART: Six et al. 2008 (PMID: 18665203)\n- TIMI: Antman et al. 2000 (PMID: 10938172)",
  "model": "gpt-4o",
  "provider": "openai",
  "latency_ms": 2150,
  "tokens": 782
}
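Multi-turn dialogue just means resending the accumulated history on every call, as the curl example shows. A minimal conversation wrapper in Python (hypothetical helper; the transport is injectable so the sketch runs without a live server):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/chat"

class Conversation:
    """Accumulates user/assistant turns and replays them each request."""
    def __init__(self):
        self.messages = []

    def ask(self, question, transport=None):
        self.messages.append({"role": "user", "content": question})
        send = transport or self._http_post   # injectable for offline use
        reply = send({"messages": self.messages})["reply"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply

    def _http_post(self, body):
        req = urllib.request.Request(
            API_URL, data=json.dumps(body).encode(),
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

# Stub transport so the example executes standalone.
echo = lambda body: {"reply": f"seen {len(body['messages'])} messages"}
conv = Conversation()
conv.ask("What is the HEART score and when should I use it?", transport=echo)
conv.ask("How does it compare to the TIMI score?", transport=echo)
```

Note that the second request carries three messages (user, assistant, user), exactly as in the curl example above.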

Provider Fallback

The chat endpoint automatically tries OpenAI first, then falls back to Groq if:
  1. OpenAI API key is not configured, OR
  2. OpenAI request fails (rate limit, timeout, etc.)
You can check which provider is active using the /api/config-status endpoint.
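The fallback order described above can be sketched as follows (the provider callables are stand-ins for the real OpenAI/Groq clients, and the helper name is illustrative):

```python
def chat_with_fallback(body, providers):
    """Try each configured provider in order; return the first success.

    `providers` is an ordered list of (name, callable) pairs — OpenAI
    first, then Groq. A callable may be None (API key not configured)
    or raise (rate limit, timeout, etc.); either way we fall through.
    """
    errors = []
    for name, call in providers:
        if call is None:
            errors.append(f"{name}: not configured")
            continue
        try:
            result = call(body)
            result["provider"] = name
            return result
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# OpenAI key missing, Groq succeeds -> response tagged provider="groq"
result = chat_with_fallback(
    {"messages": []},
    [("openai", None), ("groq", lambda b: {"reply": "ok"})],
)
```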

Error Responses

400 Bad Request

{
  "detail": "No messages provided"
}

503 Service Unavailable

{
  "detail": "No LLM API key configured. Please set your OpenAI API key in Settings, or add GROQ_API_KEY to .env"
}

500 Internal Server Error

{
  "detail": "Chat failed: [error details]"
}
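Client-side, the three error shapes above are distinguishable by status code; a hedged sketch of one way to route them (the function and its return strings are illustrative):

```python
def handle_chat_error(status: int, payload: dict) -> str:
    """Map an /api/chat error response to a suggested client action."""
    detail = payload.get("detail", "unknown error")
    if status == 400:
        return f"fix the request: {detail}"       # e.g. empty messages array
    if status == 503:
        return f"configure an API key: {detail}"  # no OpenAI/Groq key set
    if status == 500:
        return f"retry or report: {detail}"       # server-side chat failure
    return f"unexpected status {status}: {detail}"
```

A 503 is a configuration problem, not a transient fault, so retrying without setting a key will not help.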

Use Cases

Drug Interactions

Quick lookups for medication safety questions

Differential Diagnosis

Explore differential diagnoses for a presentation

Guideline Lookups

Retrieve evidence-based treatment recommendations

Lab Interpretation

Interpret lab results in clinical context

Notes

Chat responses are for educational and decision-support purposes only. Always apply clinical judgment and verify critical information.
Future versions will integrate RAG (Retrieval-Augmented Generation) with a LanceDB vector store of indexed medical literature for more precise, evidence-backed answers.

Model Configuration

Default models (configurable in .env):
  • OpenAI: gpt-4o (or gpt-4o-mini for faster/cheaper responses)
  • Groq: llama-3.3-70b-versatile
You can override these in your environment variables:
OPENAI_MODEL=gpt-4o-mini
GROQ_MODEL=llama-3.3-70b-versatile
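Server-side, such overrides would typically be resolved with environment defaults; a sketch assuming the variable names and defaults listed above (the helper itself is illustrative):

```python
import os

def model_config():
    """Resolve model names from the environment, with documented defaults."""
    return {
        "openai": os.getenv("OPENAI_MODEL", "gpt-4o"),
        "groq": os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile"),
    }

cfg = model_config()
```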
