
Overview

World Monitor integrates AI-powered analysis throughout the platform using a 4-tier provider fallback chain that prioritizes local compute and degrades gracefully through cloud APIs to a fully in-browser model.
Privacy-First Design: Local LLM support (Ollama/LM Studio) means intelligence analysis can run entirely on your hardware with zero data leaving your machine.

AI Summarization Chain

The World Brief and country briefs use a cascading provider system:
┌──────────────────────────────────────────────────────┐
│                Summarization Request                 │
│ (headlines deduplicated by Jaccard similarity > 0.6) │
└─────────────────────────┬────────────────────────────┘
                          │
                          ▼
            ┌────────────────────────────┐
            │ Tier 1: Ollama / LM Studio │
            │ Local endpoint, no cloud   │
            │ Auto-discovered model      │
            └─────────────┬──────────────┘
                          │ timeout/error
                          ▼
            ┌────────────────────────────┐
            │ Tier 2: Groq               │
            │ Llama 3.1 8B, temp 0.3     │
            │ Fast cloud inference       │
            └─────────────┬──────────────┘
                          │ timeout/error
                          ▼
            ┌────────────────────────────┐
            │ Tier 3: OpenRouter         │
            │ Multi-model fallback       │
            └─────────────┬──────────────┘
                          │ timeout/error
                          ▼
            ┌────────────────────────────┐
            │ Tier 4: Browser T5         │
            │ Transformers.js (ONNX)     │
            │ No network required        │
            └────────────────────────────┘
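The cascade can be pictured as an ordered loop over providers, each raced against a per-tier timeout. This is an illustrative sketch, not the project's actual implementation; the `Provider` shape and function names are assumptions:

```typescript
// Illustrative sketch of the 4-tier fallback loop (names are assumed).
type Provider = {
  name: string;
  summarize: (headlines: string[]) => Promise<string>;
};

async function summarizeWithFallback(
  providers: Provider[],
  headlines: string[],
  timeoutMs = 5000,
): Promise<{ provider: string; summary: string }> {
  for (const p of providers) {
    try {
      // Race each provider against the per-tier timeout.
      const summary = await Promise.race([
        p.summarize(headlines),
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error(`${p.name} timed out`)), timeoutMs),
        ),
      ]);
      return { provider: p.name, summary };
    } catch {
      // Timeout or error: fall through to the next tier.
    }
  }
  throw new Error('All providers failed');
}
```

Because each tier only runs after the previous one has failed or timed out, adding a new provider is just a matter of inserting it into the ordered list.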

Fallback Behavior

Ollama / LM Studio
  • Communicates via OpenAI-compatible /v1/chat/completions
  • Auto-discovers available models from local instance
  • Filters out embedding-only models
  • Default model: llama3.1:8b
Configuration (Desktop app):
Settings → LLMs tab → Ollama endpoint
Default: http://localhost:11434
Local inference is private by default - no API keys, no telemetry, no data leaves your machine.
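A minimal sketch of what a call to the local endpoint looks like, assuming the standard OpenAI-compatible `/v1/chat/completions` contract mentioned above (the helper names here are illustrative, not the app's actual code):

```typescript
// Sketch: calling a local Ollama / LM Studio instance through its
// OpenAI-compatible chat endpoint. Helper names are illustrative.
const OLLAMA_ENDPOINT = 'http://localhost:11434';

function buildChatRequest(model: string, prompt: string) {
  return {
    url: `${OLLAMA_ENDPOINT}/v1/chat/completions`,
    body: {
      model,
      messages: [{ role: 'user' as const, content: prompt }],
      temperature: 0.3,
    },
  };
}

async function localSummarize(model: string, prompt: string): Promise<string> {
  const { url, body } = buildChatRequest(model, prompt);
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Local LLM returned ${res.status}`);
  const data = await res.json();
  // OpenAI-compatible response shape: choices[0].message.content
  return data.choices[0].message.content;
}
```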

Headline Deduplication

Before sending to any LLM, headlines are deduplicated:
// Word-overlap similarity (Jaccard index)
// Near-duplicates (>60% overlap) are merged
// Reduces prompt by 20-40%
// Prevents LLM from wasting tokens on repeated stories
Example:
  • Input: “Russian forces advance in Bakhmut” (Source A)
  • Input: “Russian troops push forward in Bakhmut region” (Source B)
  • Output: Single deduplicated headline
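The word-overlap check can be sketched as follows. This is a minimal illustration of Jaccard-based deduplication with the > 0.6 threshold from above; the actual pipeline's tokenization and merge rules may differ:

```typescript
// Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const wordsA = new Set(a.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const wordsB = new Set(b.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  if (wordsA.size === 0 && wordsB.size === 0) return 1;
  let intersection = 0;
  for (const w of wordsA) if (wordsB.has(w)) intersection++;
  const union = wordsA.size + wordsB.size - intersection;
  return intersection / union;
}

// Keep only headlines that are not near-duplicates (> 0.6 overlap)
// of a headline already kept.
function dedupeHeadlines(headlines: string[], threshold = 0.6): string[] {
  const kept: string[] = [];
  for (const h of headlines) {
    if (!kept.some((k) => jaccard(k, h) > threshold)) kept.push(h);
  }
  return kept;
}
```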

Redis Caching

All API-tier summaries are cached server-side:
// Cache key structure
const cacheKey = `summary:v3:${mode}:${variant}:${lang}:${hash}`;
// TTL: 24 hours
Benefits:
  • Same headlines viewed by 1,000 users → 1 LLM call
  • Instant results for cached queries
  • Reduced API costs
  • Better performance
The first user to view a news configuration triggers the LLM call. All subsequent viewers get instant cached results.
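The key construction can be sketched like this. The hash function shown (FNV-1a) is an illustrative stand-in; any stable hash of the deduplicated headlines would produce the same caching behavior:

```typescript
// FNV-1a 32-bit hash: illustrative stand-in for the real content hash.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Matches the key structure: summary:v3:<mode>:<variant>:<lang>:<hash>
function summaryCacheKey(
  mode: string,
  variant: string,
  lang: string,
  headlines: string[],
): string {
  const hash = fnv1a(headlines.join('\n'));
  return `summary:v3:${mode}:${variant}:${lang}:${hash}`;
}
```

Because the hash is derived from the deduplicated headline set, any two users viewing the same configuration produce the same key and share one cached summary.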

Variant-Aware Prompting

System prompts adapt to the active dashboard variant:
// From server/worldmonitor/news/v1/handler.ts
switch (variant) {
  case 'full':
    // Emphasize conflict escalation, diplomatic shifts
    break;
  case 'tech':
    // Focus on funding rounds, AI breakthroughs, product launches
    break;
  case 'finance':
    // Highlight market movements, central bank signals, trading
    break;
}

Language-Aware Output

When the UI language is non-English:
// System prompt includes:
"Generate the summary in ${lang} language."

// Supported languages:
// French, Spanish, German, Italian, Polish, Portuguese, Dutch,
// Swedish, Russian, Arabic, Chinese, Japanese, Turkish, Thai, Vietnamese
LLM translation enables cross-language intelligence gathering - read sources in one language, get summaries in another.

Local Model Discovery

The desktop app automatically discovers available Ollama/LM Studio models:
// Settings panel queries local endpoint:
// 1. Try Ollama native: /api/tags
// 2. Fall back to OpenAI-compatible: /v1/models
// 3. Filter out embedding models
// 4. Populate dropdown
Manual Fallback:
  • If discovery fails, text input appears
  • Enter model name directly
  • Example: llama3.1:8b, mistral:7b, codellama:13b
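The discovery flow sketched in code, under two assumptions: the response shapes follow Ollama's native `/api/tags` and the OpenAI-compatible `/v1/models` conventions, and embedding models are filtered by a name-based heuristic (the real filter may inspect model metadata instead):

```typescript
// Name-based heuristic for embedding-only models (an assumption).
function isEmbeddingModel(name: string): boolean {
  return /embed/i.test(name);
}

async function discoverModels(endpoint: string): Promise<string[]> {
  let names: string[] = [];
  try {
    // 1. Try Ollama native: { models: [{ name: "llama3.1:8b" }, ...] }
    const res = await fetch(`${endpoint}/api/tags`);
    const data = await res.json();
    names = data.models.map((m: { name: string }) => m.name);
  } catch {
    // 2. Fall back to OpenAI-compatible: { data: [{ id: "..." }, ...] }
    const res = await fetch(`${endpoint}/v1/models`);
    const data = await res.json();
    names = data.data.map((m: { id: string }) => m.id);
  }
  // 3. Filter out embedding models before populating the dropdown.
  return names.filter((n) => !isEmbeddingModel(n));
}
```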

Threat Classification Pipeline

Every news item passes through a 3-stage hybrid classifier (keyword → browser ML → LLM):
Stage 1: Instant Pattern Matching
  • ~120 threat keywords organized by severity:
    • Critical
    • High
    • Medium
    • Low
    • Info
  • 14 event categories:
    • conflict, protest, disaster, diplomatic, economic,
    • terrorism, cyber, health, environmental, military,
    • crime, infrastructure, tech, general
// Word-boundary regex prevents false positives
// "war" won't match "award"
// "ai" won't match "train"
Output:
{
  severity: 'high',
  category: 'conflict',
  confidence: 0.85,
  source: 'keyword'
}
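A minimal sketch of the word-boundary matching described above. The keyword lists here are tiny illustrative samples, not the real ~120-term set, and the function name is assumed:

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low' | 'info';

// Tiny illustrative sample of the severity-organized keyword lists.
const THREAT_KEYWORDS: Record<Severity, string[]> = {
  critical: ['nuclear', 'invasion'],
  high: ['war', 'airstrike'],
  medium: ['protest', 'sanctions'],
  low: ['election'],
  info: ['summit'],
};

function classifyBySeverity(
  headline: string,
): { severity: Severity; source: 'keyword' } | null {
  for (const severity of ['critical', 'high', 'medium', 'low', 'info'] as Severity[]) {
    for (const kw of THREAT_KEYWORDS[severity]) {
      // \b word boundaries: "war" must not match "award".
      if (new RegExp(`\\b${kw}\\b`, 'i').test(headline)) {
        return { severity, source: 'keyword' };
      }
    }
  }
  return null; // no keyword hit; later ML/LLM stages may still classify it
}
```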

UI Never Blocks

Classification uses progressive enhancement:
  1. News items render immediately with keyword classification
  2. ML results arrive within seconds, update UI
  3. LLM results arrive, override if more confident
  4. Each item shows source tag: keyword, ml, or llm
Users never see a blank screen waiting for AI. Keyword results are instant, AI refinements layer on progressively.
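The "override if more confident" rule can be sketched as a small merge function; the field names mirror the classification output shown earlier, but the function itself is an illustrative assumption:

```typescript
type Classification = {
  severity: string;
  category: string;
  confidence: number;
  source: 'keyword' | 'ml' | 'llm';
};

// Later-arriving ML/LLM results replace the instant keyword result
// only when they are more confident.
function mergeClassification(
  current: Classification,
  incoming: Classification,
): Classification {
  return incoming.confidence > current.confidence ? incoming : current;
}
```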

Country Brief AI Analysis

Clicking any country opens a full intelligence dossier with AI-generated analysis:
// From country brief logic
const analysis = await summarizeArticle({
  headlines: countryNews,
  geoContext: countryName,
  mode: 'country_brief',
  variant: SITE_VARIANT,
  lang: currentLanguage
});
AI Analysis Includes:
  • Situation summary (2-3 paragraphs)
  • Key developments
  • Risk assessment
  • Inline citation anchors [1][8] that scroll to sources

Focal Point Detection

Correlates entities across multiple data streams:
// From src/services/focal-point-detector.ts
// Identifies convergence zones:
// - News mentions
// - Military activity
// - Protests
// - Outages
// - Markets
When 3+ signals converge in the same geographic area → Focal Point Alert
// 2-hour rolling window vs 7-day baseline
// Flags surging terms across RSS feeds
// CVE/APT entity extraction
// Auto-summarization of trending topics
Spike Classification:
  • 2x baseline: Minor spike
  • 5x baseline: Major spike
  • 10x baseline: Viral spike
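The spike tiers above map directly onto a ratio of the current 2-hour count to the 7-day baseline. A sketch (the zero-baseline handling is an assumption):

```typescript
type SpikeLevel = 'none' | 'minor' | 'major' | 'viral';

// current: mentions in the 2-hour rolling window
// baseline: expected mentions from the 7-day baseline
function classifySpike(current: number, baseline: number): SpikeLevel {
  if (baseline <= 0) return current > 0 ? 'viral' : 'none'; // no history (assumption)
  const ratio = current / baseline;
  if (ratio >= 10) return 'viral';
  if (ratio >= 5) return 'major';
  if (ratio >= 2) return 'minor';
  return 'none';
}
```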

Performance Optimizations

Timeout Cascade

Each tier has a 5-second timeout:
// If Ollama takes >5s, automatically try Groq
// If Groq takes >5s, automatically try OpenRouter
// If OpenRouter takes >5s, fall back to Browser T5
Total worst-case: 20 seconds before Browser T5 renders.
Typical: 0-2 seconds (cached or fast LLM).

Circuit Breaker

// From src/services/summarization.ts
const summaryBreaker = createCircuitBreaker({
  name: 'News Summarization',
  cacheTtlMs: 0
});
Prevents cascading failures:
  • Tracks error rates per provider
  • Opens circuit after repeated failures
  • Skips to next tier immediately
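The core of that behavior fits in a few lines. This is a simplified sketch, not the project's actual `createCircuitBreaker` (whose real signature, shown above, includes options like `cacheTtlMs`); reset/half-open logic is omitted:

```typescript
// Simplified circuit breaker: after `threshold` consecutive failures,
// the circuit opens and calls are rejected without hitting the provider.
function createSimpleBreaker({ name, threshold = 3 }: { name: string; threshold?: number }) {
  let failures = 0;
  return {
    async fire<T>(fn: () => Promise<T>): Promise<T> {
      if (failures >= threshold) {
        throw new Error(`${name}: circuit open, skipping tier`);
      }
      try {
        const result = await fn();
        failures = 0; // a success closes the failure window
        return result;
      } catch (err) {
        failures++;
        throw err;
      }
    },
  };
}
```

When the circuit is open, the error surfaces immediately, so the fallback chain moves to the next tier without waiting out another timeout.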

Desktop App Settings

The Settings window (Cmd+,) has a dedicated LLMs tab:
Settings → LLMs
├── Ollama Endpoint (e.g., http://localhost:11434)
├── Model Selection (auto-discovered dropdown)
├── Groq API Key
└── OpenRouter API Key
Cross-Window Secret Sync:
  • Saving in Settings writes to OS keychain
  • Broadcasts localStorage change event
  • Main window hot-reloads secrets
  • No app restart required

API Key Storage

OS Keychain Integration
  • macOS: Keychain Access
  • Windows: Credential Manager
  • Linux: Secret Service API
All secrets are stored in a single JSON blob:
Keychain entry: secrets-vault
Reduces authorization prompts:
  • Old: 20+ prompts (one per key)
  • New: 1 prompt per launch

Browser-Side ML Worker

The ML worker runs in a separate Web Worker:
// From src/workers/ml.worker.ts
import { pipeline } from '@xenova/transformers';

// Tasks:
// - NER (Named Entity Recognition)
// - Sentiment analysis
// - Summarization (T5)
Memory Management:
  • Toggle in AI Flow settings
  • Disabled at launch: worker never initializes
  • Enabled mid-session: worker initializes immediately
  • Disabled mid-session: worker terminates
Disabling the browser model saves ~200MB of WebGL memory and eliminates ONNX model downloads.

Best Practices

For Maximum Privacy
  1. Install Ollama on your machine
  2. Pull a model: ollama pull llama3.1:8b
  3. Configure endpoint in Settings → LLMs
  4. Disable Groq and OpenRouter toggles
  5. All analysis runs locally
For Best Performance
  1. Use Groq (fastest cloud API)
  2. Keep browser ML enabled (instant NER)
  3. Keep Ollama as a backup for when you're offline
  4. OpenRouter for model variety

Troubleshooting

Ollama not connecting?
  • Verify Ollama is running: ollama serve
  • Check endpoint: http://localhost:11434
  • Test models available: ollama list
  • Check CORS (desktop app handles automatically)
Summaries always using Browser T5?
  • Verify API keys are configured
  • Check provider toggles enabled
  • Look for errors in browser console
  • Confirm internet connectivity (for cloud APIs)
Slow summarization?
  • First request triggers LLM (slow)
  • Subsequent requests instant (cached)
  • Consider local Ollama for consistent speed
  • Browser T5 is slowest but always works

Related Pages

  • Live News - AI classifies and summarizes news
  • Desktop App - Local LLM integration
  • Data Layers - AI enhances geographic correlation
