Groq-powered conversational clinical Q&A with sub-second responses
The AI Chat feature provides fast conversational clinical Q&A powered by Groq’s Llama 3.3 70B, delivering sub-second responses without the overhead of the full agent pipeline. If an OpenAI key is configured, the endpoint tries OpenAI first and uses Groq as the fallback.
AI Chat uses the full conversation history for context-aware follow-ups.
Python SDK
import requests

conversation = [
    {"role": "user", "content": "What are the causes of elevated troponin?"},
]

# First message
response = requests.post(
    "https://api.clinicalpilot.ai/api/chat",
    json={"messages": conversation},
)
assistant_reply = response.json()["reply"]
conversation.append({"role": "assistant", "content": assistant_reply})

# Follow-up question
conversation.append({
    "role": "user",
    "content": "Which of those causes would present with ST elevation?",
})
response = requests.post(
    "https://api.clinicalpilot.ai/api/chat",
    json={"messages": conversation},
)
print(response.json()["reply"])

JavaScript SDK
const conversation = [
  { role: "user", content: "What are the causes of elevated troponin?" }
];

// First message
let response = await fetch("https://api.clinicalpilot.ai/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages: conversation })
});
let data = await response.json();
conversation.push({ role: "assistant", content: data.reply });

// Follow-up question
conversation.push({
  role: "user",
  content: "Which of those causes would present with ST elevation?"
});
response = await fetch("https://api.clinicalpilot.ai/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages: conversation })
});
data = await response.json();
console.log(data.reply);
The conversation array grows with each exchange. The full history is sent on every request, so the model maintains context.
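Because the client owns the history, the append-and-resend pattern can be wrapped in a small helper. The following is a minimal sketch, not part of any published SDK: `ChatSession` and `post_chat` are hypothetical names, the transport uses the standard library's `urllib` instead of `requests`, and it assumes the endpoint returns a JSON body with a `reply` field as in the examples above.

```python
import json
import urllib.request
from typing import Callable, Dict, List

Message = Dict[str, str]
API_URL = "https://api.clinicalpilot.ai/api/chat"  # endpoint used in the examples above


def post_chat(messages: List[Message]) -> dict:
    """Send the full history to the chat endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps({"messages": messages}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


class ChatSession:
    """Hypothetical helper: keeps the history client-side and resends it each turn."""

    def __init__(self, send: Callable[[List[Message]], dict] = post_chat) -> None:
        self.messages: List[Message] = []
        self._send = send

    def ask(self, question: str) -> str:
        self.messages.append({"role": "user", "content": question})
        try:
            reply = self._send(self.messages)["reply"]
        except Exception:
            # Drop the unanswered question so a retry starts from a consistent history.
            self.messages.pop()
            raise
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

With this helper, a follow-up is just a second `session.ask(...)` call on the same `ChatSession`. The `send` parameter is injectable so the history handling can be exercised without network access.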
The AI Chat endpoint uses a dedicated clinical system prompt optimized for decision support:
backend/main.py:368
CHAT_SYSTEM_PROMPT = """You are ClinicalPilot AI, a clinical decision-support assistant for healthcare professionals.

Your role:
- Answer clinical questions with evidence-based, concise responses.
- Help with differential diagnoses, drug interactions, guideline lookups, lab interpretation, and clinical reasoning.
- Always cite relevant guidelines (e.g., ACC/AHA, WHO, UpToDate) when applicable.
- If a question involves patient safety, flag it clearly.
- Use structured formatting (bullet points, numbered lists) for clarity.
- If you are unsure, say so — never fabricate clinical information.

You are NOT a replacement for clinical judgment. Always remind users that your answers are for educational/decision-support purposes only.

IMPORTANT: In future you will have access to a LanceDB vector store with indexed medical literature for RAG-enhanced answers. For now, rely on your training knowledge."""
@app.post("/api/chat")
async def chat(payload: dict):
    """AI chat — tries OpenAI first, falls back to Groq."""
    messages = payload.get("messages", [])
    if not messages:
        raise HTTPException(400, "No messages provided")

    openai_key = get_effective("openai_api_key")
    groq_key = get_effective("groq_api_key")

    if not openai_key and not groq_key:
        raise HTTPException(
            503,
            "No LLM API key configured. Please set your OpenAI API key in Settings, or add GROQ_API_KEY to .env"
        )

    full_messages = [{"role": "system", "content": CHAT_SYSTEM_PROMPT}] + [
        {"role": m["role"], "content": m["content"]} for m in messages
    ]

    # Try OpenAI first
    if openai_key:
        try:
            from openai import OpenAI as SyncOpenAI

            client = SyncOpenAI(api_key=openai_key)
            t0 = time.time()
            completion = client.chat.completions.create(
                model=settings.openai_model,
                messages=full_messages,
                temperature=0.3,
                max_tokens=2048,
            )
            latency_ms = int((time.time() - t0) * 1000)
            reply = completion.choices[0].message.content
            return {
                "reply": reply,
                "model": settings.openai_model,
                "provider": "openai",
                "latency_ms": latency_ms,
                "tokens": completion.usage.total_tokens,
            }
        except Exception as e:
            logger.warning(f"OpenAI chat failed ({e}), falling back to Groq")
            if not groq_key:
                raise HTTPException(500, f"OpenAI chat failed and no Groq fallback: {str(e)}")

    # Fallback to Groq
    try:
        from groq import Groq

        client = Groq(api_key=groq_key)
        t0 = time.time()
        completion = client.chat.completions.create(
            model=settings.groq_model,
            messages=full_messages,
            temperature=0.3,
            max_tokens=2048,
        )
        latency_ms = int((time.time() - t0) * 1000)
        reply = completion.choices[0].message.content
        return {
            "reply": reply,
            "model": settings.groq_model,
            "provider": "groq",
            "latency_ms": latency_ms,
            "tokens": completion.usage.total_tokens,
        }
    except Exception as e:
        logger.exception("Chat failed on all providers")
        raise HTTPException(500, f"Chat failed: {str(e)}")
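Besides `reply`, the handler returns `model`, `provider`, `latency_ms`, and `tokens`, which a client can use to see which provider actually answered. The sketch below shows one way to read those fields; `ChatResult` and `parse_chat_response` are hypothetical names introduced for illustration, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatResult:
    """Typed view of the JSON body returned by /api/chat (field names from the handler)."""
    reply: str
    model: str
    provider: str
    latency_ms: int
    tokens: int


def parse_chat_response(body: dict) -> ChatResult:
    """Build a ChatResult from the JSON body; raises KeyError if a field is missing."""
    return ChatResult(
        reply=body["reply"],
        model=body["model"],
        provider=body["provider"],
        latency_ms=int(body["latency_ms"]),
        tokens=int(body["tokens"]),
    )


# Illustrative body only; real values depend on the configured provider and model.
sample = parse_chat_response({
    "reply": "Example reply text",
    "model": "llama-3.3-70b-versatile",
    "provider": "groq",
    "latency_ms": 420,
    "tokens": 512,
})
if sample.provider == "groq":
    print(f"Answered by the Groq fallback in {sample.latency_ms} ms")
```

Checking `provider` is useful when both keys are configured, since a `"groq"` value then indicates that the OpenAI call failed and the fallback path was taken.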
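Why OpenAI First? When an OpenAI key is set, the endpoint tries it first because GPT-4o typically has higher clinical accuracy than Llama 3.3 70B. Groq serves as the speed fallback when the OpenAI call fails (for example, when OpenAI is unavailable or rate-limited) and is the only provider when no OpenAI key is configured.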
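The ordering logic can be reduced to a provider-agnostic sketch: try each configured provider in priority order and return the first successful reply. This is an illustration of the pattern, not the code in `backend/main.py`; `chat_with_fallback` and the `Provider` alias are hypothetical, and each provider is modelled as a plain callable so the example runs without API keys.

```python
from typing import Callable, List, Optional, Tuple

# A provider is a (name, call) pair; call takes the message list and returns the reply text.
Provider = Tuple[str, Callable[[list], str]]


def chat_with_fallback(providers: List[Provider], messages: list) -> Tuple[str, str]:
    """Try providers in order and return (provider_name, reply) from the first success."""
    if not providers:
        raise ValueError("No LLM provider configured")
    last_error: Optional[Exception] = None
    for name, call in providers:
        try:
            return name, call(messages)
        except Exception as exc:  # any failure moves on to the next provider
            last_error = exc
    raise RuntimeError(f"Chat failed on all providers: {last_error}")


def unavailable_openai(messages: list) -> str:
    """Stand-in for a rate-limited or unreachable primary provider."""
    raise ConnectionError("rate limited")


def stub_groq(messages: list) -> str:
    """Stand-in for a working fallback provider."""
    return "stub reply"


provider_name, reply = chat_with_fallback(
    [("openai", unavailable_openai), ("groq", stub_groq)],
    [{"role": "user", "content": "What are the causes of elevated troponin?"}],
)
print(provider_name, reply)
```

Keeping the priority in a list makes the order explicit and easy to change, at the cost of hiding provider-specific error handling that the real handler may need.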
# Groq (used as the fallback, or as the only provider when no OpenAI key is set)
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.3-70b-versatile  # Default model

# OpenAI (optional, tried first if set)
OPENAI_API_KEY=sk-proj-...
OPENAI_MODEL=gpt-4o
Example: “What’s the Wells score for DVT?”
Response:

**Wells Score for DVT**

| Criterion | Points |
|-----------|--------|
| Active cancer | +1 |
| Paralysis/recent plaster cast | +1 |
| Bedridden >3 days or major surgery <12 weeks | +1 |
| Tenderness along deep vein | +1 |
| Entire leg swollen | +1 |
| Calf swelling >3cm vs. other leg | +1 |
| Pitting edema | +1 |
| Collateral superficial veins | +1 |
| Alternative diagnosis more likely | -2 |

**Interpretation**:
- ≥2: DVT likely (D-dimer if negative → US)
- <2: DVT unlikely (D-dimer if negative → stop workup)
Lab Interpretation
Example: “Patient has K+ 5.8. What should I do?”
Response:

**Hyperkalemia (K+ 5.8 mEq/L)**

**Severity**: Moderate (5.5-6.5)

**Immediate Actions**:
1. **ECG**: Look for peaked T waves, widened QRS, loss of P waves
2. If ECG changes present:
   - Calcium gluconate 10% 10mL IV over 2-3 min (membrane stabilization)
3. Shift K+ intracellularly:
   - Insulin 10 units IV + D50 25g IV
   - Albuterol 10-20mg nebulized
4. Remove K+ from body:
   - Furosemide 40mg IV (if euvolemic)
   - Patiromer or sodium polystyrene sulfonate (slower)

**Recheck K+ in 2h**

**⚠️ Hold ACE-I, ARB, spironolactone until K+ normalized**
Differential Diagnosis Brainstorming
Example: “65yo with new onset atrial fibrillation. What’s on the differential?”
Response: