Overview

The CallPacket is the core data structure that represents a fully analyzed emergency call. It merges outputs from both the audio track and NLP track into a single, structured object.

Schema Definition

# Source: app/schemas/call_packet.py:1-17
from pydantic import BaseModel
from typing import Optional, List

class NLPResult(BaseModel):
    transcript: str
    intent: Optional[str] = None
    summary: Optional[str] = None

class AudioResult(BaseModel):
    distress_score: float = 0.0
    hazards: List[str] = []

class CallPacket(BaseModel):
    call_id: str
    nlp: NLPResult
    audio: AudioResult
All CallPacket fields are validated using Pydantic models, ensuring type safety throughout the system.

NLPResult

Contains all natural language understanding outputs from the transcript.

Fields

transcript (string, required)
The full, finalized speech-to-text transcript of the emergency call. Generated by Deepgram’s streaming STT API; includes punctuation, smart formatting, and numeral recognition.
intent (string | null)
The caller’s primary intent or need. Examples:
  • "request_ambulance"
  • "report_fire"
  • "report_violence"
  • "request_police"
summary (string | null)
A 1-2 sentence dispatcher-friendly summary of the emergency situation. Generated using GPT-4o-mini with context from the category and tags:
# Source: app/agents/summary.py:56-62
prompt = (
    "You are an emergency dispatcher assistant. "
    "Summarize the caller's situation in 1–2 clear, factual sentences. "
    "Avoid speculation. Include critical details. "
    f"Category: {category}. Tags: {', '.join(tags)}.\n\n"
    f"Transcript:\n{transcript}"
)

Example

{
  "transcript": "There's been a shooting at 123 Main Street. Someone shot my friend. He's bleeding badly and unconscious.",
  "intent": "request_ambulance_police",
  "summary": "Gunshot victim at 123 Main Street. Male patient unconscious with heavy bleeding."
}

AudioResult

Contains acoustic and emotional analysis from the raw audio stream.

Fields

distress_score (float, default: 0.0)
Final distress level computed from the audio track (0.0 - 1.0). Calculated using an Exponential Moving Average (EMA) of audio loudness combined with detection of sudden intensity spikes:
# Source: app/api/ws/handler.py:486-489
ema = alpha * rms + (1 - alpha) * ema
diff = max(0.0, rms - ema)
score = max(signals["distress"] * 0.9, min(1.0, diff * 8.0))
Interpretation:
  • 0.0 - 0.15: Calm
  • 0.15 - 0.3: Tense
  • 0.3 - 0.7: Distressed
  • 0.7 - 1.0: Highly distressed
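Putting the formula and the thresholds together, here is a minimal sketch of one per-chunk update. The smoothing factor of 0.2 and the helper names `update_distress` and `interpret` are illustrative, not from the codebase; only the update rule itself mirrors the snippet above.

```python
def update_distress(rms: float, ema: float, prev_distress: float,
                    alpha: float = 0.2) -> tuple[float, float]:
    """Return (new_ema, new_distress) for one audio chunk."""
    ema = alpha * rms + (1 - alpha) * ema   # smooth the loudness baseline
    diff = max(0.0, rms - ema)              # sudden spike above the baseline
    # Decay the previous score slightly, but jump up on a strong spike.
    score = max(prev_distress * 0.9, min(1.0, diff * 8.0))
    return ema, score

def interpret(score: float) -> str:
    """Map a distress score onto the interpretation bands above."""
    if score < 0.15:
        return "Calm"
    if score < 0.3:
        return "Tense"
    if score < 0.7:
        return "Distressed"
    return "Highly distressed"
```

Because the previous score only decays by 10% per chunk, a single loud spike keeps the distress level elevated for several chunks afterward.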
hazards (List[string], default: [])
List of detected acoustic hazards or environmental sounds. Examples:
  • "sirens"
  • "gunshots"
  • "screaming_background"
  • "traffic_noise"
  • "breaking_glass"

Example

{
  "distress_score": 0.85,
  "hazards": ["sirens", "screaming_background"]
}

Complete CallPacket Example

{
  "call_id": "call_2026-03-03_abc123",
  "nlp": {
    "transcript": "Help! There's a fire in my apartment building. The smoke is getting really thick and I can't get to the stairs. I'm on the third floor.",
    "intent": "request_fire_rescue",
    "summary": "Apartment fire with trapped occupant on third floor. Heavy smoke blocking exit route."
  },
  "audio": {
    "distress_score": 0.92,
    "hazards": ["smoke_alarm"]
  }
}

Extended Analysis Fields

While not in the base CallPacket schema, the system tracks additional analysis in the live signals:

Emotion Classification

# Source: app/agents/emotion.py:14-20
EmotionLabel = Literal[
    "CALM",
    "RELIEVED",
    "TENSE",
    "DISTRESSED",
    "HIGHLY_DISTRESSED",
]
The emotion object includes:
{
  "label": "HIGHLY_DISTRESSED",
  "intensity": 0.85,
  "sentiment": "negative",
  "distress_input": 0.78,
  "has_transcript": true,
  "source": "heuristic"
}
label (EmotionLabel)
Categorical emotion classification.
intensity (float)
Emotional intensity score (0.0 - 1.0).
sentiment (string)
Overall sentiment: "positive", "neutral", or "negative".
source (string)
Analysis method: "heuristic", "deepgram", or "openai".

Service Classification

# Source: app/agents/service_classify.py:8-22
def classify_service_and_tags(
    transcript: Optional[str],
    distress: float,
) -> Dict[str, Any]:
    """Heuristic classifier for:
      - service category: EMS / FIRE / POLICE / OTHER
      - semantic tags: TRAUMA, ACTIVE_SHOOTER, OVERDOSE, etc.
      - category_confidence: 0.0 - 1.0
    """
Output structure:
{
  "category": "EMS",
  "confidence": 0.89,
  "tags": ["ACTIVE_SHOOTER", "TRAUMA", "VIOLENCE"]
}
category (ServiceCategory)
Primary emergency service needed: "EMS", "FIRE", "POLICE", or "OTHER".
confidence (float)
Classification confidence (0.0 - 1.0). Higher when:
  • Clear separation between categories
  • Multiple supporting semantic tags
  • Distress level aligns with the detected emergency
tags (List[string])
Semantic tags describing the emergency. Over 50 tags are detected, including:

Medical/EMS:
  • NOT_BREATHING, CARDIAC_ARREST, OVERDOSE, UNCONSCIOUS, SEIZURE
  • MAJOR_BLEEDING, STROKE, ALLERGIC_REACTION, SUICIDAL
Violence/Police:
  • ACTIVE_SHOOTER, STABBING, ASSAULT, DOMESTIC_VIOLENCE, WEAPON_INVOLVED
Fire/Hazard:
  • FIRE, SMOKE, EXPLOSION, GAS_LEAK, HAZMAT
Rescue:
  • TRAPPED, COLLAPSE, FLOOD, VEHICLE_ACCIDENT
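At its core, this kind of tagging is keyword matching over the transcript. The sketch below is a toy version: the `TAG_KEYWORDS` table and `tag_transcript` name are illustrative only, while the real classifier in app/agents/service_classify.py covers 50+ tags with negation and ASR-error handling.

```python
# Illustrative keyword table; the real tag set is far larger.
TAG_KEYWORDS = {
    "FIRE": ["fire", "flames", "burning"],
    "SMOKE": ["smoke"],
    "TRAPPED": ["trapped", "can't get out", "stuck"],
    "MAJOR_BLEEDING": ["bleeding badly", "blood everywhere"],
}

def tag_transcript(transcript: str) -> list[str]:
    """Return every tag whose keywords appear in the transcript."""
    text = transcript.lower()
    return [tag for tag, words in TAG_KEYWORDS.items()
            if any(w in text for w in words)]
```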

Semantic Tag Examples

The service classifier detects life-threatening keywords with negation awareness:
# Source: app/agents/service_classify.py:130-139
if has_any(shooting_phrases) or (
    has_word_any(gunshot_keywords) and not is_negated("shot")
):
    # Context: if "shoot" appears with basketball/sports, reduce confidence
    if not has_any(["basketball", "hoops", "game", "sport", "court"]):
        add_tag("ACTIVE_SHOOTER")
        add_tag("TRAUMA")
        add_tag("VIOLENCE")
        bump("EMS", 0.9, urgent=True)
        bump("POLICE", 0.9, urgent=True)
The tagger includes ASR (Automatic Speech Recognition) error tolerance. For example, “keep myself” is treated as “kill myself” since ASR commonly mishears this phrase.
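The `is_negated` check in the snippet above could look something like the following sketch: a keyword counts as negated when a negation word appears within a few words before it. The window size and negation list are assumptions for illustration, not the actual implementation.

```python
import re

NEGATIONS = {"no", "not", "nobody", "never", "wasn't", "hasn't", "didn't"}

def is_negated(keyword: str, transcript: str, window: int = 3) -> bool:
    """True if any occurrence of `keyword` is preceded by a negation word."""
    words = re.findall(r"[\w']+", transcript.lower())
    for i, w in enumerate(words):
        if w == keyword:
            # Look back up to `window` words for a negation.
            if NEGATIONS & set(words[max(0, i - window):i]):
                return True
    return False
```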

Live Signals During Call

While the call is in progress, the system tracks real-time signals:
# Source: app/api/ws/handler.py:412-423
LIVE_SIGNALS[call_id] = {
    "chunks": 0,                  # Total audio chunks processed
    "voiced_chunks": 0,           # Chunks with detected speech
    "voiced_seconds": 0.0,        # Total speech duration
    "ema": 0.0,                   # Exponential moving average (smoothed loudness)
    "distress": 0.0,              # Current distress score
    "max_distress": 0.0,          # Peak distress observed
    "transcript": "",             # Finalized transcript
    "transcript_live": "",        # Real-time partial transcript
    "wav_path": None,             # Path to saved audio file
    "emotion": None,              # Emotion classification result
}
These signals are updated every 160 ms as new audio chunks arrive.
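One per-chunk update of that dict could be sketched as follows. The 160 ms chunk duration comes from the text; the function name `on_audio_chunk` and the smoothing constant are illustrative assumptions, with the distress update mirroring the formula shown earlier.

```python
CHUNK_SECONDS = 0.160  # chunk duration from the text above

def on_audio_chunk(signals: dict, rms: float, is_voiced: bool) -> None:
    """Update the live-signal dict for one incoming audio chunk."""
    signals["chunks"] += 1
    if is_voiced:
        signals["voiced_chunks"] += 1
        signals["voiced_seconds"] += CHUNK_SECONDS
    alpha = 0.2  # illustrative smoothing factor
    signals["ema"] = alpha * rms + (1 - alpha) * signals["ema"]
    diff = max(0.0, rms - signals["ema"])
    signals["distress"] = max(signals["distress"] * 0.9, min(1.0, diff * 8.0))
    signals["max_distress"] = max(signals["max_distress"], signals["distress"])
```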

Data Flow

Validation

All CallPacket instances are validated using Pydantic:
from app.schemas.call_packet import CallPacket

# This will raise ValidationError if data is malformed
packet = CallPacket(
    call_id="call_123",
    nlp={"transcript": "Help!"},
    audio={"distress_score": 0.8}
)

Usage in Ranking

The CallPacket feeds directly into the priority ranking system:
from app.ranking.ranking import build_ranking, RankingInputs

# Extract ranking inputs from packet
ranking = build_ranking(RankingInputs(
    risk_level="CRITICAL",           # Derived from distress_score
    risk_score=packet.audio.distress_score,
    category=classification.category,
    tags=classification.tags,
    created_at=call_created_timestamp
))

Streaming Pipeline

See how CallPackets are built from dual-track analysis

Priority Ranking

Learn how CallPackets are prioritized in the dispatch queue