Supported Formats
FHIR R4 JSON
HL7 FHIR R4 Bundles for standardized EHR integration
EHR Files
Upload PDF or CSV documents from electronic health records
Free Text
Natural language clinical notes and case descriptions
Voice Transcription
Speech-to-text clinical dictation (future: Whisper STT)
FHIR R4 JSON
Submit HL7 FHIR R4 Bundles via the /api/upload/fhir endpoint. The parser extracts:
- Patient → Demographics (age, gender)
- Condition → Diagnoses & medical history
- MedicationRequest / MedicationStatement → Current medications
- Observation → Vitals & lab results
- AllergyIntolerance → Drug allergies
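The resource mapping above can be sketched as a lookup keyed on resourceType. This is a minimal illustration, not the code in fhir_parser.py: the bucket names and the `group_bundle_resources` helper are hypothetical, and field-level extraction (for example, deriving age from birthDate) is left out.

```python
from collections import defaultdict

# Hypothetical bucket names; the real PatientContext field names may differ.
RESOURCE_BUCKETS = {
    "Patient": "demographics",
    "Condition": "conditions",
    "MedicationRequest": "medications",
    "MedicationStatement": "medications",
    "Observation": "observations",
    "AllergyIntolerance": "allergies",
}


def group_bundle_resources(bundle: dict) -> dict:
    """Group the resources in a FHIR R4 Bundle by the bucket they feed."""
    grouped = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        bucket = RESOURCE_BUCKETS.get(resource.get("resourceType"))
        if bucket is not None:  # skip resource types that are not mapped
            grouped[bucket].append(resource)
    return dict(grouped)


bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Patient", "gender": "male", "birthDate": "1957-03-14"}},
        {"resource": {"resourceType": "Condition", "code": {"text": "Hypertension"}}},
        {"resource": {"resourceType": "Encounter", "status": "finished"}},
    ],
}
grouped = group_bundle_resources(bundle)
```

In this sketch the Encounter entry is ignored because it has no mapping, which mirrors the "clinically relevant resources only" behavior described for the parser.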
FHIR Parser Implementation
The FHIR parser (backend/input_layer/fhir_parser.py) uses code-based extraction:
FHIR Compliance: Supports FHIR R4 only. The parser is lightweight and extracts only clinically relevant resources. Custom extensions are preserved in raw_text.
EHR Files (PDF, CSV)
Upload PDF or CSV documents from your EHR system via /api/upload/ehr.
Parser Extracts Text
- PDF: Uses PyPDF2 (fallback: Unstructured.io)
- CSV: Column-based extraction (auto-detects headers)
backend/input_layer/ehr_parser.py:88
Entity Extraction
Regex-based extraction identifies:
- Medications (drug name + dosage patterns)
- Conditions (ICD codes, common abbreviations like HTN, DM2)
- Vitals (BP, HR, temp, SpO2)
- Labs (troponin, WBC, creatinine, etc.)
backend/input_layer/text_parser.py:60
CSV Format Requirements
The CSV parser auto-detects common column names:
| Column Name | Maps To | Example Value |
|---|---|---|
| age | Demographics | 68 |
| gender or sex | Demographics | male |
| diagnosis or condition | Conditions | Type 2 Diabetes Mellitus |
| medication or drug | Medications | Metformin |
| dose or dosage | Medication dose | 1000mg |
| lab or test | Lab results | HbA1c |
| value or result | Lab value | 7.2 |
| unit | Lab unit | % |
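Header auto-detection of this kind can be approximated with an alias table applied to each CSV header. The sketch below uses only the standard library; the canonical key names and the `normalize_rows` helper are assumptions for illustration and are not taken from ehr_parser.py.

```python
import csv
import io

# Aliases come from the table above; the canonical names on the right are illustrative.
COLUMN_ALIASES = {
    "age": "age",
    "gender": "gender", "sex": "gender",
    "diagnosis": "condition", "condition": "condition",
    "medication": "medication", "drug": "medication",
    "dose": "dose", "dosage": "dose",
    "lab": "lab", "test": "lab",
    "value": "value", "result": "value",
    "unit": "unit",
}


def normalize_rows(csv_text: str) -> list:
    """Rename recognized columns to canonical names and drop unrecognized ones."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for raw in reader:
        row = {}
        for header, value in raw.items():
            # Headers are matched case-insensitively after trimming whitespace.
            key = COLUMN_ALIASES.get((header or "").strip().lower())
            if key and isinstance(value, str) and value.strip():
                row[key] = value.strip()
        rows.append(row)
    return rows


sample = "Sex,Age,Drug,Dosage,Notes\nmale,68,Metformin,1000mg,stable\n"
rows = normalize_rows(sample)
```

Here the unrecognized Notes column is dropped, while Sex, Drug, and Dosage are renamed to their canonical keys.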
Free Text Input
Submit natural language clinical notes directly. The text parser uses regular expressions and heuristics to extract structured data.
Extraction Rules
The text parser recognizes:
Demographics
- Age: 68 yo, 68-year-old, 68 y.o.
- Gender: Keywords like male, female, man, woman, he, she
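As a rough illustration of the demographics rules, the sketch below matches the three age forms and the listed gender keywords. The regular expressions and the `extract_demographics` helper are assumptions written for this example and are not the patterns in text_parser.py.

```python
import re

# Matches "68 yo", "68-year-old", and "68 y.o." (also tolerates "68 year old").
AGE_RE = re.compile(r"\b(\d{1,3})[\s-]*(?:year[\s-]*old|yo\b|y\.o\.)", re.IGNORECASE)

# Word boundaries keep "male" from matching inside "female".
GENDER_RE = re.compile(r"\b(male|female|man|woman|he|she)\b", re.IGNORECASE)
GENDER_KEYWORDS = {
    "male": "male", "man": "male", "he": "male",
    "female": "female", "woman": "female", "she": "female",
}


def extract_demographics(text: str) -> dict:
    """Return the first age and gender mention found, or None for each if absent."""
    age = AGE_RE.search(text)
    gender = GENDER_RE.search(text)
    return {
        "age": int(age.group(1)) if age else None,
        "gender": GENDER_KEYWORDS[gender.group(1).lower()] if gender else None,
    }


example = extract_demographics("68-year-old woman presenting with chest pain")
```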
Medications
- Pattern: drug_name dose frequency
- Examples: Lisinopril 20mg daily, Metformin 1000mg BID, Aspirin 81mg PRN
- Section headers: “Medications:”, “Current Meds:”
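The drug_name dose frequency pattern lends itself to a single regular expression with named groups. The following is a sketch under assumptions: the unit list and the TID and QID frequencies are not stated in the rules above, and `MED_RE` is not the parser's actual pattern.

```python
import re

MED_RE = re.compile(
    r"\b(?P<name>[A-Za-z][A-Za-z-]+)\s+"                    # drug name
    r"(?P<dose>\d+(?:\.\d+)?\s*(?:mg|mcg|g|mL|units?))\s+"  # dose with unit (assumed unit list)
    r"(?P<frequency>daily|BID|TID|QID|PRN)\b",              # frequency (TID/QID are assumed)
    re.IGNORECASE,
)


def extract_medications(text: str) -> list:
    """Return one dict per 'name dose frequency' match, in order of appearance."""
    return [m.groupdict() for m in MED_RE.finditer(text)]


meds = extract_medications(
    "Current Meds: Lisinopril 20mg daily, Metformin 1000mg BID, Aspirin 81mg PRN"
)
```

Requiring all three parts keeps section headers such as “Current Meds:” from being read as drug names, at the cost of missing entries that omit a frequency.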
Conditions
- Common abbreviations: HTN → Hypertension, DM2 → Type 2 Diabetes, CAD → Coronary Artery Disease
- History patterns: “history of”, “h/o”, “diagnosed with”
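One way to combine abbreviation expansion with the history phrases is shown below. It is a minimal sketch: the abbreviation table contains only the three examples listed, and `extract_conditions` is a hypothetical helper rather than the parser's own function.

```python
import re

ABBREVIATIONS = {
    "HTN": "Hypertension",
    "DM2": "Type 2 Diabetes",
    "CAD": "Coronary Artery Disease",
}
ABBREV_RE = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b")
# Capture the phrase after a history cue, up to the next comma, period, or semicolon.
HISTORY_RE = re.compile(r"(?:history of|h/o|diagnosed with)\s+([^,.;]+)", re.IGNORECASE)


def extract_conditions(text: str) -> list:
    """Collect conditions from history phrases first, then standalone abbreviations."""
    found = []
    for match in HISTORY_RE.finditer(text):
        phrase = match.group(1).strip()
        # Expand the phrase when it is itself a known abbreviation.
        found.append(ABBREVIATIONS.get(phrase.upper(), phrase))
    for match in ABBREV_RE.finditer(text):
        expanded = ABBREVIATIONS[match.group(1)]
        if expanded not in found:  # avoid duplicates already found via a history cue
            found.append(expanded)
    return found


conditions = extract_conditions(
    "Pt with h/o CAD, HTN and DM2. Diagnosed with atrial fibrillation."
)
```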
Vitals
- BP: BP 145/92, BP: 120/80 mmHg
- HR: HR 88, heart rate 72 bpm
- Temp: temp 98.6°F, temperature 37.2°C
- SpO2: SpO2 95%, O2 sat 98%
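The vitals formats above map naturally onto one pattern per vital sign. The sketch below is illustrative only: the dictionary keys and regular expressions are assumptions, values are returned as raw strings, and only the first mention of each vital is kept.

```python
import re

VITAL_PATTERNS = {
    # "BP 145/92" or "BP: 120/80 mmHg" -> (systolic, diastolic)
    "bp": re.compile(r"\bBP:?\s*(\d{2,3})\s*/\s*(\d{2,3})", re.IGNORECASE),
    # "HR 88" or "heart rate 72 bpm" -> (rate,)
    "hr": re.compile(r"\b(?:HR|heart rate):?\s*(\d{2,3})", re.IGNORECASE),
    # "temp 98.6°F" or "temperature 37.2°C" -> (value, scale)
    "temp": re.compile(
        r"\btemp(?:erature)?:?\s*(\d{2,3}(?:\.\d+)?)\s*°?\s*([FC])\b", re.IGNORECASE
    ),
    # "SpO2 95%" or "O2 sat 98%" -> (percent,)
    "spo2": re.compile(r"\b(?:SpO2|O2 sat):?\s*(\d{2,3})\s*%", re.IGNORECASE),
}


def extract_vitals(text: str) -> dict:
    """Return the captured groups for the first occurrence of each vital sign."""
    vitals = {}
    for name, pattern in VITAL_PATTERNS.items():
        match = pattern.search(text)
        if match:
            vitals[name] = match.groups()
    return vitals


vitals = extract_vitals("BP: 145/92 mmHg, heart rate 88 bpm, temp 98.6°F, O2 sat 95%")
```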
Labs
- Common labs: troponin, WBC, Hgb, creatinine, BUN, eGFR, glucose, HbA1c, Na, K, D-dimer, BNP, lactate
- Pattern: lab_name: value unit (e.g., glucose: 180 mg/dL)
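The lab_name: value unit pattern can be sketched with an alternation over the lab names listed above. The unit sub-pattern here is an assumption: it accepts only "%" or units containing a slash (such as mg/dL), so that a trailing word is not mistaken for a unit. `LAB_RE` and `extract_labs` are hypothetical names.

```python
import re

LAB_NAMES = [
    "troponin", "WBC", "Hgb", "creatinine", "BUN", "eGFR", "glucose",
    "HbA1c", "Na", "K", "D-dimer", "BNP", "lactate",
]

LAB_RE = re.compile(
    r"\b(?P<name>" + "|".join(re.escape(n) for n in LAB_NAMES) + r")"
    r"\s*:\s*(?P<value>\d+(?:\.\d+)?)"                          # colon, then numeric value
    r"(?:\s*(?P<unit>%|[A-Za-zµ]+(?:/[A-Za-z0-9]+)+))?",        # optional unit (assumed forms)
    re.IGNORECASE,
)


def extract_labs(text: str) -> list:
    """Return name, numeric value, and unit (or None) for each lab mention."""
    return [
        {"name": m.group("name"), "value": float(m.group("value")), "unit": m.group("unit")}
        for m in LAB_RE.finditer(text)
    ]


labs = extract_labs("Labs: glucose: 180 mg/dL, creatinine: 1.4 mg/dL, HbA1c: 7.2%, K: 4.1")
```

Requiring the colon keeps short names such as Na and K from matching ordinary words, which matters once matching is case-insensitive.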
Allergies
- Keywords: “allergic to”, “allergy:”, “allergies:”
- NKDA detection: “NKDA”, “no known drug allergies”
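A minimal sketch of the allergy rules follows, assuming NKDA statements take precedence over keyword matches. The return shape and the `extract_allergies` helper are illustrative assumptions, not the parser's actual output.

```python
import re

NKDA_RE = re.compile(r"\bNKDA\b|\bno known drug allergies\b", re.IGNORECASE)
# Capture the text after an allergy keyword, up to the next period, semicolon, or newline.
ALLERGY_RE = re.compile(r"\b(?:allergic to|allerg(?:y|ies):)\s*([^.;\n]+)", re.IGNORECASE)


def extract_allergies(text: str) -> dict:
    """Return an NKDA flag plus the list of allergens named after a keyword."""
    if NKDA_RE.search(text):
        # An explicit NKDA statement wins, so "Allergies: NKDA" is not read as an allergen.
        return {"nkda": True, "allergies": []}
    allergies = []
    for match in ALLERGY_RE.finditer(text):
        # Split lists such as "penicillin, sulfa and latex" into separate items.
        for item in re.split(r",|\band\b", match.group(1)):
            item = item.strip()
            if item:
                allergies.append(item)
    return {"nkda": False, "allergies": allergies}


result = extract_allergies("Allergies: penicillin, sulfa. Also allergic to latex.")
```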
Voice Input
Voice transcription uses the same text parser as free text input.
Future Feature: Whisper STT integration is planned. Currently, you must transcribe audio separately and submit it as text.
Workflow
Unified PatientContext Schema
All parsers output this Pydantic model (backend/models/patient.py):
Best Practices
Use FHIR for Integration
If your EHR supports FHIR R4, use it for the most accurate data mapping.
Include Context in Free Text
Add patient history, vitals, and current symptoms for better analysis quality.
Check PHI Before Upload
Presidio auto-scrubs PHI, but review sensitive documents before submission.
Combine Formats
You can submit a FHIR bundle plus a free-text addendum by merging patient_context with text.
Next Steps
Emergency Mode
Fast-path triage for time-critical cases
AI Chat
Ask follow-up questions about cases