
SOAP Note Generation

The Dictation & SOAP feature enables veterinarians to create structured medical records by speaking naturally during patient examinations. The system uses OpenAI Whisper for speech-to-text and GPT-4 for intelligent SOAP note generation.

How It Works

1. Voice Recording: The veterinarian records audio while examining the patient, using the mobile app or web microphone.
2. Live Transcription: The browser SpeechRecognition API provides a real-time preview (free, client-side).
3. Whisper Transcription: Audio is sent to the OpenAI Whisper API for high-accuracy (95%+) transcription.
4. GPT-4 Processing: The transcription is analyzed and structured into the selected SOAP template format.
5. Review & Edit: The veterinarian reviews the AI-generated notes, makes edits, and finalizes the record.

Recording Audio

iOS/Mobile Implementation

The app uses the native device microphone with optimized settings:
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: true,      // Remove echo feedback
    noiseSuppression: true,       // Filter background clinic noise
    autoGainControl: true,        // Normalize volume levels
  },
});

const recorder = new MediaRecorder(stream, {
  mimeType: 'audio/webm;codecs=opus',  // High quality, low file size
});
Recordings support up to 30 minutes of continuous audio (180 MB file size limit).
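The limits above can be checked client-side before upload. A minimal sketch; the constants mirror the documented limits, and `isWithinRecordingLimits` is a hypothetical helper, not part of the actual app:

```javascript
// Documented recording limits: 30 minutes / 180 MB. Adjust if your
// deployment differs.
const MAX_DURATION_MS = 30 * 60 * 1000;    // 30 minutes
const MAX_SIZE_BYTES = 180 * 1024 * 1024;  // 180 MB

// Returns true when a recording is within both limits.
function isWithinRecordingLimits(durationMs, sizeBytes) {
  return durationMs <= MAX_DURATION_MS && sizeBytes <= MAX_SIZE_BYTES;
}
```

Running this check before starting the base64 upload avoids a round trip that the server would reject anyway.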

Live Transcription Preview

While recording, users see real-time transcription using browser SpeechRecognition:
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;  // Vendor prefix (Chrome/Safari)
const recognition = new SpeechRecognition();
recognition.continuous = true;        // Don't stop after pauses
recognition.interimResults = true;    // Show partial results
recognition.lang = 'en-US';

recognition.onresult = (event) => {
  let finalText = '';
  let interimText = '';
  
  for (let i = 0; i < event.results.length; i++) {
    if (event.results[i].isFinal) {
      finalText += event.results[i][0].transcript + ' ';
    } else {
      interimText += event.results[i][0].transcript;
    }
  }
  
  setLiveTranscript(finalText + interimText);
};
Live transcription is free and provides instant feedback. If accurate enough, the system skips the Whisper API call to save costs.
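The text above doesn't define "accurate enough." One possible heuristic is thresholding the average per-result confidence that SpeechRecognition reports (`event.results[i][0].confidence`). A sketch with illustrative thresholds; `canSkipWhisper` is hypothetical, not the actual check:

```javascript
// Decide whether the live browser transcript is good enough to skip the
// paid Whisper call. Thresholds are illustrative, not from the real app.
function canSkipWhisper(avgConfidence, wordCount) {
  // Require both high confidence and enough material to judge by.
  return avgConfidence >= 0.9 && wordCount >= 10;
}
```

Note that not all browsers populate `confidence` reliably, so a production check would likely combine this with a user-facing confirmation.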

Whisper Transcription

API Request Flow

Audio sent to backend server for Whisper processing:
// Client: Convert audio blob to base64
const base64 = await new Promise((resolve, reject) => {
  const reader = new FileReader();
  reader.onload = () => resolve(reader.result.split(',')[1]);
  reader.onerror = () => reject(reader.error);
  reader.readAsDataURL(audioBlob);
});

// Send to backend
const response = await fetch(`${API_BASE}/api/ai/transcribe`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    audio: base64,
    mimeType: 'audio/webm',
  }),
});

const { transcription } = await response.json();
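One consequence of the base64 step worth planning for: encoding grows the payload by roughly a third, which matters when sizing server body limits (for example `express.json({ limit })`, assuming an Express backend). A small helper to estimate the encoded size:

```javascript
// Base64 encodes every 3 input bytes as 4 output characters (with padding),
// so payloads grow by ~33%. Useful for sizing request body limits.
function base64EncodedLength(rawBytes) {
  return Math.ceil(rawBytes / 3) * 4;
}
```

For example, a 20 MB recording arrives at the server as a ~27 MB JSON body, so the body-size limit must be set against the encoded size, not the raw audio size.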

Server Processing

Backend saves audio to temporary file and calls Whisper:
app.post('/api/ai/transcribe', async (req, res) => {
  const { audio, mimeType } = req.body;
  
  // Decode base64 and write to temp file
  const buffer = Buffer.from(audio, 'base64');
  const ext = mimeType.includes('mp4') ? 'mp4' : 'webm';
  const tmpPath = path.join(os.tmpdir(), `dictation-${Date.now()}.${ext}`);
  fs.writeFileSync(tmpPath, buffer);
  
  try {
    const transcription = await openai.audio.transcriptions.create({
      file: fs.createReadStream(tmpPath),
      model: 'whisper-1',
      language: 'en',
      response_format: 'text',  // Plain text output
    });
    
    res.json({ transcription });
  } catch (err) {
    res.status(500).json({ error: 'Transcription failed' });
  } finally {
    fs.unlinkSync(tmpPath);  // Clean up temp file
  }
});
The Whisper API has a 25 MB file size limit. Recordings over 25 MB are automatically split into segments.
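Note that splitting compressed audio on raw byte boundaries produces undecodable segments; splitting has to happen on time boundaries (for example, recording in chunks with a MediaRecorder `timeslice`). The sketch below only computes how many segments a recording needs; `whisperSegmentCount` is illustrative, not the actual implementation:

```javascript
// Whisper's documented upload limit.
const WHISPER_MAX_BYTES = 25 * 1024 * 1024;  // 25 MB

// How many roughly equal time-based segments a recording needs so that
// each one stays under the limit. Always at least 1.
function whisperSegmentCount(sizeBytes) {
  return Math.max(1, Math.ceil(sizeBytes / WHISPER_MAX_BYTES));
}
```

The per-segment transcripts would then be concatenated in order before the GPT-4 step.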

Accuracy Optimization

Best Practices for Recording:
  • Record in quiet exam room (< 60dB ambient noise)
  • Hold phone 6-12 inches from mouth
  • Speak clearly at normal conversational pace
  • Use wired headset mic for noisy environments
Audio Settings:
  • Sample rate: 48kHz (automatically downsampled to 16kHz)
  • Codec: Opus (best compression for voice)
  • Bitrate: 24kbps (sufficient for speech)
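The audio settings above map onto MediaRecorder options; `audioBitsPerSecond` is how the 24 kbps target would be requested (browsers treat it as a hint and may pick a nearby rate). A sketch of the assumed configuration:

```javascript
// Assumed recorder configuration matching the settings listed above.
const recorderOptions = {
  mimeType: 'audio/webm;codecs=opus',  // Opus: best compression for voice
  audioBitsPerSecond: 24000,           // 24 kbps hint; sufficient for speech
};

// Usage: new MediaRecorder(stream, recorderOptions);
```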

GPT-4 SOAP Generation

Template-Based Prompts

The system supports multiple SOAP templates (standard, dental, surgery, radiology, etc.):
// Dynamic prompt construction based on selected template
const sectionGuide = templateSections
  .map(sec => {
    const fields = sec.fields.length > 0
      ? ` (include: ${sec.fields.join(', ')})`
      : '';
    return `- "${sec.id}" = ${sec.name}${fields}`;
  })
  .join('\n');

const sectionKeys = templateSections.map(sec => sec.id);  // e.g. ['subjective', 'objective', ...]

const systemPrompt = `You are an expert veterinary medical scribe AI.
Generate structured clinical notes from the following dictation.

Template: "${templateName}"
Sections to fill:
${sectionGuide}

Patient: ${patientName} (${species}, ${breed})
Detail level: ${detailLevel} ("concise" or "detailed")

Return JSON with keys: ${sectionKeys.join(', ')}
Only valid JSON, no markdown.`;

GPT-4 API Call

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: transcription },
  ],
  temperature: 0.3,  // Low = more consistent/factual
  max_tokens: 2000,
});
Temperature Explanation:
  • 0.0-0.3: Deterministic, factual (medical notes)
  • 0.7-1.0: Creative, varied (marketing copy)
We use 0.3 for SOAP notes to ensure consistency while allowing natural phrasing.
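Since the prompt asks for raw JSON, the completion still has to be parsed, and models occasionally wrap output in markdown fences despite instructions. A defensive parsing sketch; `parseSoapJson` is illustrative, and an alternative is passing `response_format: { type: 'json_object' }` in the API call:

```javascript
// Strip optional markdown code fences, then parse the SOAP JSON.
// Throws on invalid JSON so callers can fall back (e.g. retry or manual entry).
function parseSoapJson(raw) {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, '')  // leading ```json fence, if present
    .replace(/```\s*$/, '')            // trailing fence
    .trim();
  return JSON.parse(cleaned);
}

// Usage: const soap = parseSoapJson(completion.choices[0].message.content);
```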

Example Templates

Standard SOAP template (sections: Subjective, Objective, Assessment, Plan):
{
  id: 'soap-standard',
  title: 'Standard SOAP',
  sections: [
    {
      id: 'subjective',
      name: 'Subjective',
      fields: ['Chief Complaint', 'History', 'Owner Observations']
    },
    {
      id: 'objective',
      name: 'Objective',
      fields: ['Vitals', 'Physical Exam', 'Diagnostics']
    },
    {
      id: 'assessment',
      name: 'Assessment',
      fields: ['Diagnosis', 'Differential Diagnoses']
    },
    {
      id: 'plan',
      name: 'Plan',
      fields: ['Treatment', 'Medications', 'Follow-up']
    }
  ],
  detailLevel: 'concise'
}

Editing & Finalization

AI-generated notes are saved in Draft status:
const { error } = await supabase.from('medical_records').insert({
  id: `rec-${Date.now()}`,
  pet_id: selectedPatient,
  status: 'draft',  // Requires review
  soap_subjective: soapContent.subjective,
  soap_objective: soapContent.objective,
  soap_assessment: soapContent.assessment,
  soap_plan: soapContent.plan,
  notes: allNotes + '\n\n[AI-generated, requires verification]',
  created_at: new Date().toISOString(),
});

Workflow States

1. Draft: Initial AI generation. Editable by the vet. Not visible to other staff.
2. Pending Review: The vet has edited and submitted. Awaiting senior vet approval (optional).
3. Finalized: Approved by a veterinarian. Immutable. Included in patient history.

Only licensed veterinarians can move records to Finalized status. Technicians are limited to Draft/Pending Review.
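The role rules above can be centralized in one guard. A sketch assuming role strings `'vet'`/`'tech'` and snake_case status names matching the `status: 'draft'` column shown earlier; the transition map itself is an assumption, not the actual implementation:

```javascript
// Allowed workflow transitions (assumed; adapt to the real state machine).
const TRANSITIONS = {
  draft: ['pending_review', 'finalized'],
  pending_review: ['draft', 'finalized'],
  finalized: [],  // finalized records are immutable
};

// Returns true when `role` may move a record from `from` to `to`.
function canTransition(role, from, to) {
  if (!(TRANSITIONS[from] || []).includes(to)) return false;
  if (to === 'finalized' && role !== 'vet') return false;  // vets only
  return true;
}
```

Enforcing this on the server (not just in the UI) keeps technicians from finalizing records through direct API calls.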

Performance Benchmarks

| Operation                    | Target   | Actual   | Percentile |
|------------------------------|----------|----------|------------|
| Recording start              | < 500 ms | 230 ms   | p95        |
| Live transcription latency   | < 3 s    | 1.2 s    | p50        |
| Whisper API call             | < 30 s   | 8-12 s   | p90        |
| GPT-4 SOAP generation        | < 30 s   | 10-15 s  | p90        |
| Total time (voice → SOAP)    | < 60 s   | 18-27 s  | p90        |

Error Handling

Fallback Strategy

1. Primary: Whisper API. Send audio to OpenAI Whisper for transcription.
2. Fallback 1: Browser SpeechRecognition. If Whisper fails or the server is unavailable, use the live browser transcript.
3. Fallback 2: Manual Text Entry. If both AI options fail, switch the user to the “Type” tab for manual input.
try {
  // Try Whisper API
  const response = await fetch(`${API_BASE}/api/ai/transcribe`, ...);
  const { transcription } = await response.json();
  setTranscription(transcription);
} catch (err) {
  // Fallback to browser SpeechRecognition
  if (liveTranscriptRef.current.trim()) {
    setTranscription(liveTranscriptRef.current);
    setSuccess('Used live transcription (server unavailable)');
    return;
  }
  
  // Last resort: manual entry
  setError('Transcription unavailable. Please use Type tab.');
  setInputMethod('type');
}

Best Practices

For Accurate SOAP Notes:
  1. Start with patient context: “This is Max, a 5-year-old beagle…”
  2. Speak in SOAP order: Subjective → Objective → Assessment → Plan
  3. Include vitals explicitly: “Temperature 101.5, heart rate 110”
  4. Use full medical terms: Say “otitis externa” not “ear infection”
  5. Spell unusual names: “Rhodesian ridgeback, R-H-O-D-E-S-I-A-N”
  6. End with clear plan: “Plan: blood work, recheck in two weeks”
Common Mistakes:
  • ❌ Speaking too fast or too quietly
  • ❌ Background noise (other staff conversations, barking)
  • ❌ Skipping vital signs (GPT can’t infer them)
  • ❌ Using abbreviations without context (“TPR” vs “temperature, pulse, respiration”)
  • ❌ Not reviewing AI output before saving

Next Steps

Clinical Insights

AI-powered diagnosis suggestions from SOAP notes

Voice Assistant

Configure Luna AI for autonomous phone calls

Best Practices

Tips for maximizing AI accuracy

OpenAI Integration

API setup and configuration
