Overview
World Monitor integrates AI-powered analysis throughout the platform using a 4-tier provider fallback chain that prioritizes local compute and gracefully degrades through cloud APIs.
Privacy-First Design: Local LLM support (Ollama/LM Studio) means intelligence analysis can run entirely on your hardware with zero data leaving your machine.
AI Summarization Chain
The World Brief and country briefs use a cascading provider system:
┌─────────────────────────────────────────────────────────┐
│ Summarization Request                                   │
│ (headlines deduplicated by Jaccard similarity > 0.6)    │
└───────────────┬─────────────────────────────────────────┘
                ▼
┌─────────────────────────────────┐
│ Tier 1: Ollama / LM Studio      │
│ Local endpoint, no cloud        │
│ Auto-discovered model           │
└───────────────┬─────────────────┘
                │ timeout/error
                ▼
┌─────────────────────────────────┐
│ Tier 2: Groq                    │
│ Llama 3.1 8B, temp 0.3          │
│ Fast cloud inference            │
└───────────────┬─────────────────┘
                │ timeout/error
                ▼
┌─────────────────────────────────┐
│ Tier 3: OpenRouter              │
│ Multi-model fallback            │
└───────────────┬─────────────────┘
                │ timeout/error
                ▼
┌─────────────────────────────────┐
│ Tier 4: Browser T5              │
│ Transformers.js (ONNX)          │
│ No network required             │
└─────────────────────────────────┘
Fallback Behavior
Ollama / LM Studio
- Communicates via the OpenAI-compatible /v1/chat/completions endpoint
- Auto-discovers available models from the local instance
- Filters out embedding-only models
- Default model: llama3.1:8b
Configuration (Desktop app): Settings → LLMs tab → Ollama endpoint
Default: http://localhost:11434
Local inference is private by default: no API keys, no telemetry, no data leaves your machine.
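The local tier can be sketched as a plain HTTP call to the OpenAI-compatible route. The endpoint and default model come from this page; the helper names and request shape below are illustrative, not the app's actual code:

```typescript
// Hypothetical sketch: query a local Ollama/LM Studio instance via the
// OpenAI-compatible chat completions route described above.
const OLLAMA_ENDPOINT = 'http://localhost:11434';

// Build the request body separately so it can be inspected and tested.
function buildChatRequest(model: string, userPrompt: string) {
  return {
    model,
    messages: [{ role: 'user' as const, content: userPrompt }],
    stream: false,
  };
}

async function summarizeLocally(prompt: string, model = 'llama3.1:8b'): Promise<string> {
  const res = await fetch(`${OLLAMA_ENDPOINT}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatRequest(model, prompt)),
  });
  if (!res.ok) throw new Error(`Local LLM error: ${res.status}`);
  const data = await res.json();
  // OpenAI-compatible response shape: choices[0].message.content
  return data.choices[0].message.content;
}
```

Because the route is OpenAI-compatible, the same request works against LM Studio's local server as well.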
Groq Cloud API
- Model: Llama 3.1 8B
- Temperature: 0.3 (factual output)
- Timeout: 5 seconds
- Fast cloud inference
Requires an API key:
- Web: Feature toggle in settings
- Desktop: Settings → API Keys tab
OpenRouter Multi-Model Gateway
- Fallback for Groq failures
- Access to multiple model providers
- Timeout: 5 seconds
Requires an OpenRouter API key.
Browser T5 (Transformers.js)
- Runs entirely in browser via WebAssembly
- Model: T5-small (60M parameters)
- No network required after initial download
- Automatic fallback when all APIs fail
// From src/services/summarization.ts
const combinedText = headlines.slice(0, 5).map(h => h.slice(0, 80)).join('. ');
const prompt = `Summarize the most important headlines in 2 concise sentences: ${combinedText}`;
const [summary] = await mlWorker.summarize([prompt]);
Browser T5 ensures the dashboard always produces some analysis, even without any API keys configured.
Headline Deduplication
Before sending to any LLM, headlines are deduplicated:
// Word-overlap similarity (Jaccard index)
// Near-duplicates (>60% overlap) are merged
// Reduces prompt by 20-40%
// Prevents LLM from wasting tokens on repeated stories
Example:
- Input: “Russian forces advance in Bakhmut” (Source A)
- Input: “Russian troops push forward in Bakhmut region” (Source B)
- Output: Single deduplicated headline
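The word-overlap deduplication above can be sketched as follows. This is a minimal illustration of the Jaccard approach, not the app's actual implementation; function names are hypothetical:

```typescript
// Split a headline into a set of lowercase words.
function tokenize(headline: string): Set<string> {
  return new Set(headline.toLowerCase().split(/\W+/).filter(Boolean));
}

// Jaccard index: |intersection| / |union| of the two word sets.
function jaccard(a: string, b: string): number {
  const sa = tokenize(a);
  const sb = tokenize(b);
  let intersection = 0;
  for (const word of sa) if (sb.has(word)) intersection++;
  const union = sa.size + sb.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Keep a headline only if it is not >60% similar to one already kept.
function dedupeHeadlines(headlines: string[], threshold = 0.6): string[] {
  const kept: string[] = [];
  for (const h of headlines) {
    if (!kept.some(k => jaccard(k, h) > threshold)) kept.push(h);
  }
  return kept;
}
```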
Redis Caching
All API-tier summaries are cached server-side:
// Cache key structure
const cacheKey = `summary:v3:${mode}:${variant}:${lang}:${hash}`;
// TTL: 24 hours
Benefits:
- Same headlines viewed by 1,000 users → 1 LLM call
- Instant results for cached queries
- Reduced API costs
- Better performance
The first user to view a news configuration triggers the LLM call. All subsequent viewers get instant cached results.
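The cache-aside pattern behind this can be sketched as below. An in-memory Map stands in for Redis so the example is self-contained; the key layout follows the structure above, while the hash length and helper names are assumptions:

```typescript
import { createHash } from 'node:crypto';

// Map stands in for Redis in this sketch.
const cache = new Map<string, { value: string; expiresAt: number }>();
const TTL_MS = 24 * 60 * 60 * 1000; // 24-hour TTL, as above

// summary:v3:${mode}:${variant}:${lang}:${hash}
function summaryCacheKey(mode: string, variant: string, lang: string, headlines: string[]): string {
  const hash = createHash('sha256').update(headlines.join('\n')).digest('hex').slice(0, 16);
  return `summary:v3:${mode}:${variant}:${lang}:${hash}`;
}

// First caller pays the LLM cost; later callers get the cached value.
async function cachedSummarize(key: string, summarize: () => Promise<string>): Promise<string> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value;
  const value = await summarize();
  cache.set(key, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}
```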
Variant-Aware Prompting
System prompts adapt to the active dashboard variant:
// From server/worldmonitor/news/v1/handler.ts
switch (variant) {
case 'full':
// Emphasize conflict escalation, diplomatic shifts
break;
case 'tech':
// Focus on funding rounds, AI breakthroughs, product launches
break;
case 'finance':
// Highlight market movements, central bank signals, trading
break;
}
Language-Aware Output
When the UI language is non-English:
// System prompt includes:
"Generate the summary in ${lang} language."
// Supported languages:
// French, Spanish, German, Italian, Polish, Portuguese, Dutch,
// Swedish, Russian, Arabic, Chinese, Japanese, Turkish, Thai, Vietnamese
LLM translation enables cross-language intelligence gathering - read sources in one language, get summaries in another.
Local Model Discovery
The desktop app automatically discovers available Ollama/LM Studio models:
// Settings panel queries local endpoint:
// 1. Try Ollama native: /api/tags
// 2. Fall back to OpenAI-compatible: /v1/models
// 3. Filter out embedding models
// 4. Populate dropdown
Manual Fallback:
- If discovery fails, text input appears
- Enter model name directly
- Examples: llama3.1:8b, mistral:7b, codellama:13b
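The four-step discovery flow above can be sketched like this. The routes (`/api/tags`, `/v1/models`) and response shapes are the standard Ollama and OpenAI-compatible ones; the embedding-model filter heuristic is an assumption for illustration:

```typescript
// Drop models whose names look embedding-only (illustrative heuristic).
function filterChatModels(modelNames: string[]): string[] {
  return modelNames.filter(name => !/embed/i.test(name));
}

async function discoverModels(endpoint = 'http://localhost:11434'): Promise<string[]> {
  try {
    // 1. Try Ollama native: { models: [{ name: ... }] }
    const res = await fetch(`${endpoint}/api/tags`);
    if (res.ok) {
      const data = await res.json();
      return filterChatModels(data.models.map((m: { name: string }) => m.name));
    }
  } catch {
    // fall through to the OpenAI-compatible route
  }
  // 2. Fall back to OpenAI-compatible: { data: [{ id: ... }] }
  const res = await fetch(`${endpoint}/v1/models`);
  const data = await res.json();
  return filterChatModels(data.data.map((m: { id: string }) => m.id));
}
```

The returned names would then populate the dropdown; if both requests fail, the UI falls back to the manual text input described above.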
Threat Classification Pipeline
Every news item passes through a 3-stage hybrid classifier:
Stage 1: Keyword
Stage 2: Browser ML
Stage 3: LLM Classifier
Instant Pattern Matching
- ~120 threat keywords organized by severity:
- Critical
- High
- Medium
- Low
- Info
- 14 event categories:
- conflict, protest, disaster, diplomatic, economic,
- terrorism, cyber, health, environmental, military,
- crime, infrastructure, tech, general
// Word-boundary regex prevents false positives
// "war" won't match "award"
// "ai" won't match "train"
Output:
{
  severity: 'high',
  category: 'conflict',
  confidence: 0.85,
  source: 'keyword'
}
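A miniature of the keyword stage, showing the word-boundary matching described above. The real table has ~120 terms; the entries and confidence value here are made up for demonstration:

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low' | 'info';

interface KeywordRule {
  severity: Severity;
  category: string;
}

// Tiny illustrative subset of the keyword table.
const KEYWORDS: Record<string, KeywordRule> = {
  airstrike: { severity: 'critical', category: 'conflict' },
  war: { severity: 'high', category: 'conflict' },
  protest: { severity: 'medium', category: 'protest' },
};

function classifyByKeyword(headline: string) {
  for (const [word, rule] of Object.entries(KEYWORDS)) {
    // Word-boundary regex: "war" must not match "award".
    if (new RegExp(`\\b${word}\\b`, 'i').test(headline)) {
      return { ...rule, confidence: 0.85, source: 'keyword' as const };
    }
  }
  return null; // no keyword hit; later stages may still classify
}
```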
Transformers.js NER + Sentiment
Runs asynchronously in a Web Worker:
- Named Entity Recognition
- Sentiment analysis
- Topic classification
No server dependency: all analysis runs in-browser.
// From src/workers/ml.worker.ts
// Provides a second opinion without an API call
Controllable via “Browser Local Model” toggle:
- Disabled: No ONNX download, no WebGL memory allocation
- Enabled: Worker initializes immediately
High-Confidence Override
- Headlines batched into queue
- Parallel RPC calls to configured LLM
- Provider: Groq Llama 3.1 8B (temp 0) or Ollama
- Results cached in Redis (24h TTL)
// LLM result overrides keyword only if confidence higher
if (llmResult.confidence > keywordResult.confidence) {
return llmResult;
}
Automatic Pause:
- On 500-series errors, queue pauses
- Exponential backoff prevents wasting API quota
- Resumes when service recovers
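The pause interval can be computed with standard exponential backoff. The base delay and cap below are assumed values for illustration:

```typescript
// Exponential backoff: delay doubles with each consecutive 5xx failure,
// capped so the queue still retries periodically once the service recovers.
function backoffDelayMs(consecutiveFailures: number, baseMs = 1000, capMs = 60_000): number {
  return Math.min(baseMs * 2 ** (consecutiveFailures - 1), capMs);
}
```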
UI Never Blocks
Classification uses progressive enhancement:
- News items render immediately with keyword classification
- ML results arrive within seconds, update UI
- LLM results arrive, override if more confident
- Each item shows a source tag: keyword, ml, or llm
Users never see a blank screen waiting for AI. Keyword results are instant, AI refinements layer on progressively.
Country Brief AI Analysis
Clicking any country opens a full intelligence dossier with AI-generated analysis:
// From country brief logic
const analysis = await summarizeArticle({
headlines: countryNews,
geoContext: countryName,
mode: 'country_brief',
variant: SITE_VARIANT,
lang: currentLanguage
});
AI Analysis Includes:
- Situation summary (2-3 paragraphs)
- Key developments
- Risk assessment
- Inline citation anchors [1]–[8] that scroll to sources
Focal Point Detection
Correlates entities across multiple data streams:
// From src/services/focal-point-detector.ts
// Identifies convergence zones:
// - News mentions
// - Military activity
// - Protests
// - Outages
// - Markets
When 3+ signals converge in same geographic area → Focal Point Alert
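A simplified version of that convergence rule: a focal point fires when three or more distinct signal types land in the same area. The signal types follow the list above; the data structures are illustrative:

```typescript
type SignalType = 'news' | 'military' | 'protest' | 'outage' | 'market';

interface Signal {
  area: string;     // geographic bucket, e.g. a country or grid cell
  type: SignalType;
}

// Return the areas where >= minTypes distinct signal types converge.
function detectFocalPoints(signals: Signal[], minTypes = 3): string[] {
  const byArea = new Map<string, Set<SignalType>>();
  for (const s of signals) {
    if (!byArea.has(s.area)) byArea.set(s.area, new Set());
    byArea.get(s.area)!.add(s.type);
  }
  return [...byArea.entries()]
    .filter(([, types]) => types.size >= minTypes)
    .map(([area]) => area);
}
```

Counting distinct types (rather than raw signal counts) means ten news articles alone never trigger an alert, but news plus military activity plus an outage does.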
Trending Keyword Spike Detection
// 2-hour rolling window vs 7-day baseline
// Flags surging terms across RSS feeds
// CVE/APT entity extraction
// Auto-summarization of trending topics
Spike Classification:
- 2x baseline: Minor spike
- 5x baseline: Major spike
- 10x baseline: Viral spike
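The thresholds above, as a small pure function (names are illustrative):

```typescript
// Classify a keyword's 2-hour count against its 7-day baseline.
function classifySpike(current: number, baseline: number): 'viral' | 'major' | 'minor' | null {
  if (baseline <= 0) return null; // no baseline, no spike signal
  const ratio = current / baseline;
  if (ratio >= 10) return 'viral';
  if (ratio >= 5) return 'major';
  if (ratio >= 2) return 'minor';
  return null;
}
```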
Timeout Cascade
Each tier has a 5-second timeout:
// If Ollama takes >5s, automatically try Groq
// If Groq takes >5s, automatically try OpenRouter
// If OpenRouter takes >5s, fall back to Browser T5
Total worst-case: 20 seconds before Browser T5 renders
Typical: 0-2 seconds (cached or fast LLM)
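The cascade can be sketched as a loop over providers with a per-tier timeout. The 5-second budget matches the page above; the provider functions and helper names are placeholders:

```typescript
type Provider = (prompt: string) => Promise<string>;

// Race a provider call against a timeout.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error('timeout')), ms)),
  ]);
}

// Try each tier in order; on timeout or error, fall through to the next.
async function summarizeWithFallback(
  prompt: string,
  tiers: Provider[],
  timeoutMs = 5000,
): Promise<string> {
  for (const tier of tiers) {
    try {
      return await withTimeout(tier(prompt), timeoutMs);
    } catch {
      // timeout or provider error: continue to the next tier
    }
  }
  throw new Error('All tiers failed');
}
```

In the real chain the final tier is Browser T5, which runs locally and does not time out against the network.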
Circuit Breaker
// From src/services/summarization.ts
const summaryBreaker = createCircuitBreaker({
name: 'News Summarization',
cacheTtlMs: 0
});
Prevents cascading failures:
- Tracks error rates per provider
- Opens circuit after repeated failures
- Skips to next tier immediately
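A minimal breaker matching that description: after a run of failures the circuit opens and the tier is skipped without waiting for another timeout. The threshold and class shape are assumptions, not the app's `createCircuitBreaker` internals:

```typescript
// Minimal circuit-breaker sketch: opens after `threshold` consecutive
// failures, closes again on the next success.
class CircuitBreaker {
  private failures = 0;

  constructor(private readonly threshold = 3) {}

  get isOpen(): boolean {
    return this.failures >= this.threshold;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures++;
  }
}
```

A caller would check `isOpen` before attempting a tier and jump straight to the next one when the circuit is open.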
Desktop App Settings
Settings window (Cmd+,) has dedicated LLMs tab:
Settings → LLMs
├── Ollama Endpoint (e.g., http://localhost:11434)
├── Model Selection (auto-discovered dropdown)
├── Groq API Key
└── OpenRouter API Key
Cross-Window Secret Sync:
- Saving in Settings writes to OS keychain
- Broadcasts localStorage change event
- Main window hot-reloads secrets
- No app restart required
API Key Storage
OS Keychain Integration
- macOS: Keychain Access
- Windows: Credential Manager
- Linux: Secret Service API
All secrets are stored in a single JSON blob under one keychain entry: secrets-vault
Reduces authorization prompts:
- Old: 20+ prompts (one per key)
- New: 1 prompt per launch
Feature Toggles
- Stored in localStorage
- API keys entered per-session
- No persistent storage (privacy)
Toggle providers:
- AI/Ollama
- AI/Groq
- AI/OpenRouter
Browser-Side ML Worker
The ML worker runs in a separate Web Worker:
// From src/workers/ml.worker.ts
import { pipeline } from '@xenova/transformers';
// Tasks:
// - NER (Named Entity Recognition)
// - Sentiment analysis
// - Summarization (T5)
Memory Management:
- Toggle in AI Flow settings
- When disabled: Worker never initializes
- When enabled mid-session: Initializes immediately
- When disabled: Terminates worker
Disabling the browser model saves ~200MB of WebGL memory and eliminates ONNX model downloads.
Best Practices
For Maximum Privacy
- Install Ollama on your machine
- Pull a model: ollama pull llama3.1:8b
- Configure endpoint in Settings → LLMs
- Disable Groq and OpenRouter toggles
- All analysis runs locally
For Best Performance
- Use Groq (fastest cloud API)
- Keep browser ML enabled (instant NER)
- Keep Ollama as a backup for offline use
- OpenRouter for model variety
Troubleshooting
Ollama not connecting?
- Verify Ollama is running: ollama serve
- Check the endpoint: http://localhost:11434
- List available models: ollama list
- Check CORS (desktop app handles automatically)
Summaries always using Browser T5?
- Verify API keys are configured
- Check provider toggles enabled
- Look for errors in browser console
- Confirm internet connectivity (for cloud APIs)
Slow summarization?
- First request triggers LLM (slow)
- Subsequent requests instant (cached)
- Consider local Ollama for consistent speed
- Browser T5 is slowest but always works
Related
- Live News - AI classifies and summarizes news
- Desktop App - Local LLM integration
- Data Layers - AI enhances geographic correlation