
Overview

The /api/chat endpoint provides conversational AI assistance for clinical questions:
  • Evidence-based responses with guideline citations
  • Conversation history support for multi-turn dialogue
  • Automatic provider fallback (OpenAI → Groq)
  • Optimized for clinical decision support, not patient-facing chat
Chat uses a lightweight single-LLM approach without multi-agent debate. For complex case analysis, use the analyze endpoint.

Endpoint

POST http://localhost:8000/api/chat

Request Body

messages
array
required
Array of message objects representing the conversation history

Message Object Schema

role
string
required
Message role: user or assistant
content
string
required
Message content (the actual text)
The system prompt is automatically prepended to all conversations. You only need to include user and assistant messages.
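The request body can be assembled and validated client-side before posting; a minimal Python sketch (the `build_chat_request` helper is illustrative, not part of the API):

```python
# Illustrative helper for assembling the /api/chat request body.
# Only "user" and "assistant" messages are included; the server
# prepends the system prompt itself.

VALID_ROLES = {"user", "assistant"}

def build_chat_request(history):
    """Validate a list of (role, content) pairs and return the JSON body."""
    messages = []
    for role, content in history:
        if role not in VALID_ROLES:
            raise ValueError(f"invalid role: {role!r}")
        messages.append({"role": role, "content": content})
    if not messages:
        raise ValueError("No messages provided")
    return {"messages": messages}

body = build_chat_request([("user", "What is the HEART score?")])
```

Rejecting empty histories and unknown roles locally avoids a round-trip that would end in a 400.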

Response

reply
string
AI assistant’s response text
model
string
LLM model used (e.g., “gpt-4o”, “llama-3.3-70b-versatile”)
provider
string
Provider used: openai or groq
latency_ms
integer
Response time in milliseconds
tokens
integer
Total tokens consumed (prompt + completion)
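The response fields above map cleanly onto a small typed wrapper; a sketch (field names follow the schema, the `ChatReply` class itself is an assumption):

```python
from dataclasses import dataclass

@dataclass
class ChatReply:
    """Typed view of an /api/chat response body."""
    reply: str
    model: str
    provider: str      # "openai" or "groq"
    latency_ms: int
    tokens: int        # prompt + completion

    @classmethod
    def from_json(cls, data: dict) -> "ChatReply":
        fields = ("reply", "model", "provider", "latency_ms", "tokens")
        return cls(**{f: data[f] for f in fields})

sample = {"reply": "The HEART Score is...", "model": "gpt-4o",
          "provider": "openai", "latency_ms": 1820, "tokens": 524}
r = ChatReply.from_json(sample)
```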

System Prompt

The chat endpoint automatically uses the following system prompt:
You are ClinicalPilot AI, a clinical decision-support assistant for healthcare professionals.

Your role:
- Answer clinical questions with evidence-based, concise responses.
- Help with differential diagnoses, drug interactions, guideline lookups, lab interpretation, and clinical reasoning.
- Always cite relevant guidelines (e.g., ACC/AHA, WHO, UpToDate) when applicable.
- If a question involves patient safety, flag it clearly.
- Use structured formatting (bullet points, numbered lists) for clarity.
- If you are unsure, say so — never fabricate clinical information.

You are NOT a replacement for clinical judgment. Always remind users that your answers are for educational/decision-support purposes only.

Example: Single-Turn Conversation

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the HEART score and when should I use it?"
      }
    ]
  }'
Response:
{
  "reply": "The **HEART Score** is a clinical decision tool for risk-stratifying patients presenting with chest pain in the emergency department.\n\n**Components (0-2 points each):**\n- **H**istory: Highly suspicious (2), moderately suspicious (1), slightly suspicious (0)\n- **E**CG: Significant ST-depression (2), non-specific repolarization (1), normal (0)\n- **A**ge: ≥65 years (2), 45-64 years (1), <45 years (0)\n- **R**isk factors: ≥3 CV risk factors (2), 1-2 risk factors (1), none (0)\n- **T**roponin: ≥3x normal (2), 1-3x normal (1), normal (0)\n\n**Score Interpretation:**\n- **0-3**: Low risk (~1-2% MACE at 6 weeks) — consider discharge with outpatient follow-up\n- **4-6**: Moderate risk (~12-15% MACE) — admit for observation, serial troponins, stress test\n- **7-10**: High risk (~50-65% MACE) — admit, cardiology consult, likely catheterization\n\n**When to Use:**\n- ED patients with chest pain suspicious for ACS\n- After initial ECG and troponin\n- To guide disposition (discharge vs. admit vs. catheterization)\n\n**Reference:** Six et al., *Netherlands Heart Journal* 2008 (PMID: 18665203); validated in multiple cohorts.\n\n⚠️ **Safety Note:** The HEART score should NOT be used in STEMI (ST-elevation on ECG) — these patients require immediate catheterization regardless of score.",
  "model": "gpt-4o",
  "provider": "openai",
  "latency_ms": 1820,
  "tokens": 524
}

Example: Multi-Turn Conversation

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the HEART score and when should I use it?"
      },
      {
        "role": "assistant",
        "content": "The HEART Score is a clinical decision tool for risk-stratifying patients presenting with chest pain in the emergency department..."
      },
      {
        "role": "user",
        "content": "How does it compare to the TIMI score?"
      }
    ]
  }'
Response:
{
  "reply": "Great question — the **HEART** and **TIMI** scores serve similar purposes but have key differences:\n\n**HEART Score:**\n- **Population:** ED patients with *undifferentiated chest pain*\n- **Timing:** Use early (after initial ECG + first troponin)\n- **Goal:** Decide who can be safely discharged vs. admitted\n- **Advantages:** Simpler (5 variables), includes clinical gestalt (\"History\"), validated for ED triage\n\n**TIMI Score (for NSTE-ACS):**\n- **Population:** Patients with *confirmed NSTEMI/unstable angina*\n- **Timing:** Use after diagnosis established\n- **Goal:** Predict 14-day risk of death/MI/urgent revascularization\n- **Advantages:** More granular (7 variables), guides treatment intensity (e.g., early invasive strategy for TIMI ≥3)\n\n**Clinical Use:**\n- Use **HEART** for *frontline triage* in the ED (\"Can I discharge this patient?\")\n- Use **TIMI** for *risk stratification after ACS diagnosis* (\"How aggressive should my treatment be?\")\n\n**Example:** A patient with chest pain and normal troponin → use HEART to decide disposition. If troponin comes back elevated (NSTEMI) → use TIMI to guide antiplatelet therapy, timing of catheterization, etc.\n\n**References:**\n- HEART: Six et al. 2008 (PMID: 18665203)\n- TIMI: Antman et al. 2000 (PMID: 10938172)",
  "model": "gpt-4o",
  "provider": "openai",
  "latency_ms": 2150,
  "tokens": 782
}
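Multi-turn dialogue just means resending the accumulated history on every call, as the curl example shows. A minimal conversation wrapper in Python (hypothetical helper; the transport is injectable so the sketch runs without a live server):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/chat"

class Conversation:
    """Accumulates user/assistant turns and replays them each request."""
    def __init__(self):
        self.messages = []

    def ask(self, question, transport=None):
        self.messages.append({"role": "user", "content": question})
        send = transport or self._http_post   # injectable for offline use
        reply = send({"messages": self.messages})["reply"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply

    def _http_post(self, body):
        req = urllib.request.Request(
            API_URL, data=json.dumps(body).encode(),
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

# Stub transport so the example executes standalone.
echo = lambda body: {"reply": f"seen {len(body['messages'])} messages"}
conv = Conversation()
conv.ask("What is the HEART score and when should I use it?", transport=echo)
conv.ask("How does it compare to the TIMI score?", transport=echo)
```

Note that the second request carries three messages (user, assistant, user), exactly as in the curl example above.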

Provider Fallback

The chat endpoint automatically tries OpenAI first, then falls back to Groq if:
  1. OpenAI API key is not configured, OR
  2. OpenAI request fails (rate limit, timeout, etc.)
You can check which provider is active using the /api/config-status endpoint.
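The fallback order described above can be sketched as follows (the provider callables are stand-ins for the real OpenAI/Groq clients, and the helper name is illustrative):

```python
def chat_with_fallback(body, providers):
    """Try each configured provider in order; return the first success.

    `providers` is an ordered list of (name, callable) pairs — OpenAI
    first, then Groq. A callable may be None (API key not configured)
    or raise (rate limit, timeout, etc.); either way we fall through.
    """
    errors = []
    for name, call in providers:
        if call is None:
            errors.append(f"{name}: not configured")
            continue
        try:
            result = call(body)
            result["provider"] = name
            return result
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# OpenAI key missing, Groq succeeds -> response tagged provider="groq"
result = chat_with_fallback(
    {"messages": []},
    [("openai", None), ("groq", lambda b: {"reply": "ok"})],
)
```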

Error Responses

400 Bad Request

{
  "detail": "No messages provided"
}

503 Service Unavailable

{
  "detail": "No LLM API key configured. Please set your OpenAI API key in Settings, or add GROQ_API_KEY to .env"
}

500 Internal Server Error

{
  "detail": "Chat failed: [error details]"
}
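Client-side, the three error shapes above are distinguishable by status code; a hedged sketch of one way to route them (the function and its return strings are illustrative):

```python
def handle_chat_error(status: int, payload: dict) -> str:
    """Map an /api/chat error response to a suggested client action."""
    detail = payload.get("detail", "unknown error")
    if status == 400:
        return f"fix the request: {detail}"       # e.g. empty messages array
    if status == 503:
        return f"configure an API key: {detail}"  # no OpenAI/Groq key set
    if status == 500:
        return f"retry or report: {detail}"       # server-side chat failure
    return f"unexpected status {status}: {detail}"
```

A 503 is a configuration problem, not a transient fault, so retrying without setting a key will not help.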

Use Cases

Drug Interactions

Quick lookups for medication safety questions

Differential Diagnosis

Explore differential diagnoses for a presentation

Guideline Lookups

Retrieve evidence-based treatment recommendations

Lab Interpretation

Interpret lab results in clinical context

Notes

Chat responses are for educational and decision-support purposes only. Always apply clinical judgment and verify critical information.
Future versions will integrate RAG (Retrieval-Augmented Generation) with a LanceDB vector store of indexed medical literature for more precise, evidence-backed answers.

Model Configuration

Default models (configurable in .env):
  • OpenAI: gpt-4o (or gpt-4o-mini for faster/cheaper responses)
  • Groq: llama-3.3-70b-versatile
You can override these in your environment variables:
OPENAI_MODEL=gpt-4o-mini
GROQ_MODEL=llama-3.3-70b-versatile
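Server-side, such overrides would typically be resolved with environment defaults; a sketch assuming the variable names and defaults listed above (the helper itself is illustrative):

```python
import os

def model_config():
    """Resolve model names from the environment, with documented defaults."""
    return {
        "openai": os.getenv("OPENAI_MODEL", "gpt-4o"),
        "groq": os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile"),
    }

cfg = model_config()
```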
