Overview
Every news item passes through a three-stage classification pipeline that provides instant results while progressively refining threat assessments using ML and LLM:
- Keyword classifier (instant, source: 'keyword'): ~120 threat keywords across 5 severity tiers
- Browser-side ML (async, source: 'ml'): Transformers.js NER + sentiment analysis
- LLM classifier (batched async, source: 'llm'): Groq Llama 3.1 8B or Ollama local
The UI is never blocked waiting for AI. Users see keyword results instantly, with ML/LLM refinements arriving within seconds and persisting for all subsequent visitors.
Stage 1: Keyword Classifier
Pattern-matches headlines against ~120 threat keywords organized by severity tier and event category.
Severity Tiers
CRITICAL (confidence 0.9)
Existential threats and major escalation:
Military/Conflict:
- nuclear strike, nuclear attack, nuclear war
- invasion, declaration of war, declares war
- all-out war, full-scale war
- martial law, coup, coup attempt
- genocide, ethnic cleansing
- massive strikes, military strikes, retaliatory strikes
- attack iran, attacks iran, strikes iran
- war with iran, war on iran
- iran retaliates, iran strikes, iran attacks
- chemical attack, biological attack, dirty bomb
- pandemic declared, health emergency
- nato article 5
- nuclear meltdown, evacuation order
- “Russia invades Baltic states” → critical: conflict
- “Iran launches retaliatory strikes” → critical: military
- “NATO invokes Article 5” → critical: military
HIGH (confidence 0.8)
Active conflict and severe threats:
Conflict:
- war, armed conflict
- airstrike, drone strike, bombing, shelling
- casualties, killed in
- strike on, attack on, launches attack
- missile, missile launch, missiles fired
- troops deployed, military escalation
- ground offensive, military operation
- ballistic missile, cruise missile
- hostage, terrorist, terror attack, assassination
- cyber attack, ransomware, data breach
- sanctions, embargo
- earthquake, tsunami, hurricane, typhoon
critical: military (escalation logic)
Source: src/services/threat-classifier.ts:329-337
MEDIUM (confidence 0.7)
Political instability and infrastructure disruption:
- protest, riot, unrest, demonstration
- military exercise, naval exercise
- arms deal, weapons sale
- diplomatic crisis, ambassador recalled, expel diplomats
- trade war, tariff, recession, inflation
- market crash
- flood, wildfire, volcano, eruption
- outbreak, epidemic
- oil spill, pipeline explosion
- blackout, power outage, internet outage
- derailment
LOW (confidence 0.6)
Diplomatic activity and low-intensity events:
- election, vote, referendum
- summit, treaty, agreement, negotiation
- talks, peacekeeping, humanitarian aid
- ceasefire, peace treaty
- climate change, emissions, pollution
- vaccine, vaccination, disease, virus
- interest rate, gdp, unemployment, regulation
INFO (confidence 0.3)
General news with no specific threat classification.
Exclusions: Headlines containing lifestyle/entertainment keywords are auto-classified as INFO to prevent false positives:
- protein, couples, relationship, dating
- diet, fitness, recipe, cooking
- shopping, fashion, celebrity, movie
- tv show, sports, game, concert
- strikes deal, strikes agreement (not military strikes)
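The tier lookup described above can be sketched roughly as follows. This is a simplified illustration, not the code in threat-classifier.ts: the names `TIER_KEYWORDS`, `EXCLUSIONS`, and `classifyByKeyword` are hypothetical, only a handful of keywords from each tier are included, plain substring matching stands in for the real matcher, and checking tiers from most to least severe is an assumption.

```typescript
type ThreatLevel = "critical" | "high" | "medium" | "low" | "info";

interface KeywordResult {
  level: ThreatLevel;
  confidence: number;
  source: "keyword";
}

// Confidence per tier, as listed in the severity tiers above.
const TIER_CONFIDENCE: Record<ThreatLevel, number> = {
  critical: 0.9,
  high: 0.8,
  medium: 0.7,
  low: 0.6,
  info: 0.3,
};

// Illustrative subset of the keyword lists (hypothetical structure).
const TIER_KEYWORDS: Array<[ThreatLevel, string[]]> = [
  ["critical", ["nuclear strike", "declaration of war", "military strikes"]],
  ["high", ["airstrike", "ransomware", "earthquake"]],
  ["medium", ["protest", "blackout", "wildfire"]],
  ["low", ["election", "ceasefire", "summit"]],
];

// Illustrative subset of the lifestyle/entertainment exclusions.
const EXCLUSIONS = ["recipe", "celebrity", "strikes deal", "strikes agreement"];

function classifyByKeyword(headline: string): KeywordResult {
  const text = headline.toLowerCase();
  const result = (level: ThreatLevel): KeywordResult => ({
    level,
    confidence: TIER_CONFIDENCE[level],
    source: "keyword",
  });

  // Exclusions are checked first, so "strikes deal" never reaches the threat tiers.
  if (EXCLUSIONS.some((word) => text.includes(word))) return result("info");

  // Assumption: tiers are checked most severe first and the first hit decides.
  for (const [level, keywords] of TIER_KEYWORDS) {
    if (keywords.some((keyword) => text.includes(keyword))) return result(level);
  }
  return result("info");
}
```

Under these assumptions, a headline such as "Union strikes deal with automaker" falls into INFO because the exclusion check runs before any tier is consulted.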
Event Categories
- conflict: Wars, battles, armed clashes
- protest: Civil unrest, demonstrations
- military: Troop movements, exercises
- terrorism: Attacks, hostage situations
- cyber: Hacking, data breaches
- disaster: Natural disasters, accidents
- diplomatic: Treaties, summits, negotiations
- economic: Sanctions, market events
- health: Pandemics, outbreaks
- environmental: Climate, pollution, spills
- infrastructure: Outages, pipeline explosions
- crime: Assassinations, organized crime
- tech: Tech-specific events (variant)
- general: Uncategorized news
Keyword Matching Logic
Short keywords (≤5 chars) use \b word boundaries to prevent false positives:
- war matches “war in Ukraine” but not “award ceremony”
- riot matches “riot police” but not “patriot”
- hack matches “data hack” but not “hackathon”
Word-boundary keywords: war, coup, ban, vote, riot, hack, talks, ipo, gdp, virus, disease, flood, strikes
Iran-specific keywords use a trailing boundary only (allowing prefix matches):
- attack iran uses (?![\w-]) instead of \b..\b
- Prevents hyphen breaks: “US-Iran tensions” still matches
Compiled regexes are cached in a Map to avoid recompiling on every headline (10-15x performance improvement).
src/services/threat-classifier.ts:286-315
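A minimal sketch of the boundary rules and the regex cache might look like the following. The function names (`keywordRegex`, `matchesKeyword`, `escapeRegExp`) are hypothetical, and two simplifications are assumptions rather than the project's actual rules: Iran-specific keywords are detected by checking whether the keyword contains "iran", and the word-boundary rule is applied purely by length.

```typescript
// Cache of compiled patterns, keyed by keyword, so each regex is built once.
const regexCache = new Map<string, RegExp>();

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function keywordRegex(keyword: string): RegExp {
  let regex = regexCache.get(keyword);
  if (regex === undefined) {
    const escaped = escapeRegExp(keyword);
    let pattern: string;
    if (keyword.includes("iran")) {
      // Assumption: trailing boundary only, rejecting a following word char or hyphen.
      pattern = `${escaped}(?![\\w-])`;
    } else if (keyword.length <= 5) {
      // Short keywords get word boundaries on both sides.
      pattern = `\\b${escaped}\\b`;
    } else {
      // Longer keywords are matched as plain substrings.
      pattern = escaped;
    }
    // No "g" flag, so test() carries no lastIndex state between calls.
    regex = new RegExp(pattern, "i");
    regexCache.set(keyword, regex);
  }
  return regex;
}

function matchesKeyword(headline: string, keyword: string): boolean {
  return keywordRegex(keyword).test(headline);
}
```

With these rules, `matchesKeyword("War in Ukraine", "war")` is true while `matchesKeyword("Award ceremony", "war")` is false, and repeated calls for the same keyword reuse the cached `RegExp` object.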
Variant-Specific Keywords
The Tech Monitor variant includes additional keywords for tech industry threats:
High:
- major outage, global outage, service down
- zero-day, critical vulnerability, supply chain attack
- mass layoff
- outage, breach, hack, vulnerability
- layoff, layoffs, antitrust, monopoly
- ban, shutdown
- ipo, funding, acquisition, merger
- launch, release, update, partnership
- startup, ai model, open source
src/services/threat-classifier.ts:241-276
Stage 2: Browser-Side ML
Transformers.js runs Named Entity Recognition (NER), sentiment analysis, and topic classification entirely in the browser:
- Xenova/bert-base-NER: entity extraction
- Xenova/distilbert-base-uncased-finetuned-sst-2-english: sentiment
- Topic classification model (custom fine-tuned)
User control: “Browser Local Model” toggle in AI Flow settings. When disabled:
- ML worker is never initialized
- No ONNX model downloads
- No WebGL memory allocation
- Keyword classifier remains active
ML confidence is typically lower than LLM but higher than keyword-only classification.
src/services/ml-worker.ts
Stage 3: LLM Classifier
Headlines are collected into a batch queue and fired as parallel classifyEvent RPCs.
Batching Configuration
- Max headlines per batch.
- Wait time before flushing a partial batch (if fewer than 20 items).
- Base delay between API requests to prevent rate limiting.
- Random jitter (±200ms) added to stagger timing.
- Minimum gap between requests enforced.
- Failed jobs are retried up to 2 times before dropping.
- Queue is capped at 100 items. Excess classifications are dropped with a console warning.
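The queue cap, batch size, and jittered delay can be sketched as below. All identifiers are hypothetical. The batch size (20), jitter (200 ms), and queue cap (100) come from the text above; the base delay and minimum gap are placeholder values, since the configured numbers are not shown here, and the warning text is illustrative. The partial-batch flush timer and retries are left out to keep the sketch short.

```typescript
const BATCH_SIZE = 20; // from the flush rule above
const JITTER_MS = 200; // from the jitter setting above
const MAX_QUEUE = 100; // from the queue cap above
const BASE_DELAY_MS = 500; // placeholder value, not from the source
const MIN_GAP_MS = 300; // placeholder value, not from the source

interface Job {
  headline: string;
  attempts: number;
}

const queue: Job[] = [];

// Adds a headline unless the queue is full; returns whether it was accepted.
function enqueue(headline: string): boolean {
  if (queue.length >= MAX_QUEUE) {
    // Hypothetical message wording.
    console.warn("[Classify] queue full, dropping headline");
    return false;
  }
  queue.push({ headline, attempts: 0 });
  return true;
}

// Removes and returns up to BATCH_SIZE jobs from the front of the queue.
function takeBatch(): Job[] {
  return queue.splice(0, BATCH_SIZE);
}

// Base delay plus uniform jitter in [-JITTER_MS, +JITTER_MS], never below the minimum gap.
function nextDelayMs(random: () => number = Math.random): number {
  const jitter = (random() * 2 - 1) * JITTER_MS;
  return Math.max(MIN_GAP_MS, BASE_DELAY_MS + jitter);
}
```

Passing the random source as a parameter keeps the delay calculation deterministic in tests while defaulting to `Math.random` in normal use.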
Error Handling
429 Rate Limit
- Batch queue pauses for 60 seconds
- Failed job increments attempt counter and is requeued (if attempts < MAX_RETRIES)
- Remaining jobs in batch are requeued WITHOUT burning attempts
- Console warning: [Classify] 429 — pausing AI classification for 60s
500+ Server Error
- Batch queue pauses for 30 seconds
- Same retry logic as 429
- Prevents wasting API quota on transient failures
- Console warning: [Classify] 500 — pausing AI classification for 30s
Network Error
- Individual job fails (no queue pause)
- Job is retried up to MAX_RETRIES
- After max retries, returns null (keyword classification remains)
src/services/threat-classifier.ts:412-495
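The pause and requeue rules above could be expressed roughly as follows. The function names are hypothetical, `MAX_RETRIES` is set to 2 to match the retry limit stated in the batching configuration, and checking the attempt count before incrementing it is one reading of "retried up to 2 times", not a confirmed detail of the implementation.

```typescript
interface Job {
  headline: string;
  attempts: number;
}

const MAX_RETRIES = 2;

// How long the batch queue pauses for a given HTTP status (0 means no pause).
function pauseMsForStatus(status: number): number {
  if (status === 429) return 60000; // rate limited
  if (status >= 500) return 30000; // server error
  return 0;
}

// Builds the list of jobs to put back on the queue after one job in a batch failed.
function requeueAfterFailure(failed: Job, remaining: Job[]): Job[] {
  const requeued: Job[] = [];
  // The failed job spends one attempt and is dropped once retries are exhausted.
  if (failed.attempts < MAX_RETRIES) {
    requeued.push({ ...failed, attempts: failed.attempts + 1 });
  }
  // The rest of the batch goes back unchanged, without burning attempts.
  for (const job of remaining) {
    requeued.push(job);
  }
  return requeued;
}
```

For example, `pauseMsForStatus(429)` yields a 60-second pause, and a job that has already used both retries is omitted from the list returned by `requeueAfterFailure`.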
LLM Provider Configuration
Redis Caching
LLM results are cached with a 24h TTL to prevent redundant API calls.
Classification Override Logic
When multiple sources provide results, the highest confidence wins. Each result carries a source tag (keyword, ml, llm) so downstream consumers can weight confidence accordingly.
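The "highest confidence wins" rule can be illustrated with a short sketch. The `Classification` shape and the `pickClassification` name are hypothetical, and keeping the earlier result on a tie is an assumption.

```typescript
type Source = "keyword" | "ml" | "llm";

interface Classification {
  level: string;
  confidence: number;
  source: Source;
}

// Returns the result with the highest confidence, or null if there are none.
// On a tie, the earlier result in the list is kept (assumption).
function pickClassification(results: Classification[]): Classification | null {
  let best: Classification | null = null;
  for (const candidate of results) {
    if (best === null || candidate.confidence > best.confidence) {
      best = candidate;
    }
  }
  return best;
}
```

Because the winning result keeps its `source` field, a consumer can still tell whether the final level came from the keyword, ML, or LLM stage.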
Aggregate Threat for Clusters
News clusters (multiple sources reporting the same story) aggregate threat levels.
src/services/threat-classifier.ts:521-570
Threat Color Mapping
Threat levels are color-coded with CSS variables for theme support:
- critical: Red (--threat-critical)
- high: Orange (--threat-high)
- medium: Yellow (--threat-medium)
- low: Green (--threat-low)
- info: Blue (--threat-info)
Use getThreatColor() instead of the static THREAT_COLORS object to support light/dark theme switching.
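To show why a function suits theme switching better than a static object, here is a minimal sketch that resolves the CSS variable at call time. The signature is hypothetical (the real getThreatColor() may differ); the `readVar` parameter is an assumption introduced so the lookup can run outside a browser.

```typescript
type ThreatLevel = "critical" | "high" | "medium" | "low" | "info";

// CSS variable names from the mapping above.
const THREAT_VARS: Record<ThreatLevel, string> = {
  critical: "--threat-critical",
  high: "--threat-high",
  medium: "--threat-medium",
  low: "--threat-low",
  info: "--threat-info",
};

type VarReader = (name: string) => string;

// Resolves the color on every call, so a theme change is picked up immediately.
// In a browser, readVar could be:
//   (name) => getComputedStyle(document.documentElement).getPropertyValue(name)
function getThreatColor(level: ThreatLevel, readVar: VarReader): string {
  return readVar(THREAT_VARS[level]).trim();
}
```

A static object would capture color values once, whereas this lookup returns whatever the active theme currently assigns to the variable.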
Example Classifications
Key Files
- src/services/threat-classifier.ts: Main classification engine
- src/services/ml-worker.ts: Browser-side Transformers.js ML
- api/intelligence/classify-event.ts: LLM classification handler
- src/components/ThreatBadge.tsx: UI threat level indicators