After parsing, each news item is passed through the categorization engine. The engine compares the item text against a keyword list sourced from Google Sheets and assigns one or more category labels corresponding to the relevant government areas.

Keyword source

Keywords are loaded from a Google Sheets document into the palabrasClaveOriginal array. Each entry has two fields:
| Field | Type | Description |
| --- | --- | --- |
| palabra | string | The keyword or phrase to match against item text |
| padre | string | Semicolon-separated list of government area names that own this keyword |
A single keyword can belong to multiple areas by separating the area names with ; in the padre field:
palabra: "hospital"  →  padre: "Salud;Infraestructura"
The keyword list is managed centrally in Google Sheets. Changes made there are reflected in the app on next load without requiring a code deployment.
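As a rough illustration of the entry shape and of how a padre value breaks down into area names, consider the sketch below. The interface name PalabraClave and the helper parsePadre are hypothetical and used only for this example; the source specifies only the two fields and the ; separator.

```typescript
// Hypothetical type name; the source only specifies the two fields.
interface PalabraClave {
  palabra: string; // keyword or phrase to match against item text
  padre: string;   // semicolon-separated government area names
}

// Hypothetical helper: split a padre value into clean area names,
// trimming whitespace and dropping empty segments (e.g. a trailing ';').
function parsePadre(padre: string): string[] {
  return padre
    .split(';')
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

const sampleEntry: PalabraClave = {
  palabra: 'hospital',
  padre: 'Salud; Infraestructura;',
};

const sampleAreas = parsePadre(sampleEntry.padre);
console.log(sampleAreas); // [ 'Salud', 'Infraestructura' ]
```

The trimming and filtering mirror what determineTopic() does with each padre value, so stray spaces or a trailing separator in the sheet should not create empty area names.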

determineTopic() method

The core categorization logic lives in determineTopic(). It builds a map from each topic (government area) to its keywords, then checks whether the lowercased item text contains any of them:
private determineTopic(text: string): string[] {
  const lowerCaseText = text.toLowerCase();
  const topics: { [key: string]: string[] } = {};

  this.palabrasClaveOriginal.forEach(({ palabra, padre }) => {
    const padres = padre.split(';').map((p) => p.trim()).filter((p) => p);
    padres.forEach((p) => {
      if (!topics[p]) topics[p] = [];
      topics[p].push(palabra.toLowerCase());
    });
  });

  const foundTopics: string[] = [];
  for (const [topic, keywords] of Object.entries(topics)) {
    for (const keyword of keywords) {
      if (lowerCaseText.includes(keyword)) {
        foundTopics.push(topic);
        break; // one match per topic is sufficient
      }
    }
  }

  return foundTopics.length > 0 ? foundTopics : ['General'];
}

How the matching works

  1. The palabrasClaveOriginal array is iterated to build a topics map where each key is a government area name and each value is the list of keywords associated with that area.
  2. The method lowercases both the item text and each keyword before comparison, making all matching case-insensitive.
  3. For each topic, matching stops as soon as a single keyword is found (the inner break). This means a topic is either present or absent — the number of matching keywords within a topic does not affect the result.
  4. All topics that have at least one keyword match are collected into foundTopics.
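The steps above can be traced with a standalone variant of the method. This is a sketch under stated assumptions: determineTopicSketch and KeywordEntry are hypothetical names, the keyword list is passed as an argument instead of being read from this.palabrasClaveOriginal, and the sample entries are illustrative, not the real sheet contents.

```typescript
// Hypothetical type name; the source only names the two fields.
interface KeywordEntry {
  palabra: string;
  padre: string;
}

// Standalone variant of determineTopic() for experimentation.
function determineTopicSketch(text: string, entries: KeywordEntry[]): string[] {
  const lowerCaseText = text.toLowerCase();

  // Step 1: build the map of area name -> lowercased keywords.
  const topics: Record<string, string[]> = {};
  for (const { palabra, padre } of entries) {
    const padres = padre.split(';').map((p) => p.trim()).filter((p) => p);
    for (const p of padres) {
      if (!topics[p]) topics[p] = [];
      topics[p].push(palabra.toLowerCase());
    }
  }

  // Steps 3 and 4: keep a topic as soon as any one of its keywords matches.
  const foundTopics = Object.keys(topics).filter((topic) =>
    topics[topic].some((keyword) => lowerCaseText.includes(keyword)),
  );

  return foundTopics.length > 0 ? foundTopics : ['General'];
}

// Illustrative data only.
const sampleEntries: KeywordEntry[] = [
  { palabra: 'Hospital', padre: 'Salud' },
  { palabra: 'vacuna', padre: 'Salud' },
  { palabra: 'licitación', padre: 'Infraestructura' },
  { palabra: 'escuela', padre: 'Educación' },
];

const twoAreas = determineTopicSketch(
  'LICITACIÓN del hospital: llegan las vacunas',
  sampleEntries,
);
const noMatch = determineTopicSketch('Resultados del torneo de fútbol', sampleEntries);

console.log(twoAreas); // [ 'Salud', 'Infraestructura' ]
console.log(noMatch);  // [ 'General' ]
```

In the first call two Salud keywords match (hospital and vacuna), yet Salud appears only once in the result, which is the present-or-absent behavior described in step 3. Because the comparison is a plain substring check, vacuna also matches inside vacunas.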

Multi-category support

determineTopic() returns an array, not a single string. A news item can belong to multiple government areas simultaneously if its text contains keywords from several topics. For example, a story about a hospital construction contract might match both the Salud area (via hospital) and the Infraestructura area (via licitación). It will appear in both category panels and be included in dispatches to both sets of officials.

Case and accent handling

Both the item text and keywords are lowercased before comparison using .toLowerCase(). However, diacritics (accent marks) are not normalized — the matching uses exact substring search on the lowercased strings. This means córdoba and cordoba are treated as distinct strings. Keywords in Google Sheets should use the exact accented form that appears in press clipping messages.
If a keyword is not producing the matches you expect, check whether the keyword in Google Sheets uses the same accented form as the incoming text. For example, if press clippings always write Córdoba (with accent), the keyword must also carry the accent. Capitalization does not matter, since both sides are lowercased, so Córdoba and córdoba work equally well.
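The difference can be reproduced in a few lines. The first part mirrors the lowercase-only comparison described above. The stripDiacritics helper in the second part is hypothetical and is not part of determineTopic(); it only sketches what accent-insensitive matching could look like if it were ever wanted.

```typescript
// Behavior as described above: lowercase only, accents preserved.
const clippingText = 'Anuncian obras en Córdoba'.toLowerCase();
const matchesAccented = clippingText.includes('córdoba');   // true
const matchesUnaccented = clippingText.includes('cordoba'); // false

// Hypothetical helper, NOT part of determineTopic(): decompose each
// character (NFD) and drop the combining marks U+0300 to U+036F.
function stripDiacritics(s: string): string {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

const matchesAfterStripping = stripDiacritics(clippingText).includes(
  stripDiacritics('cordoba'),
); // true

console.log(matchesAccented, matchesUnaccented, matchesAfterStripping);
```

One trade-off of this kind of normalization is that it also folds ñ into n, so a keyword such as año would become ano. That loss of precision is a reason to keep exact accented keywords in the sheet instead.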

"General" fallback

If no keyword produces a match, determineTopic() returns ['General']. Items in the General category are visible in the General panel and serve as a catch-all for stories that do not map to a specific government area.

"Destacadas" flag

Any item can be manually flagged as Destacadas (featured) during the curation step. The flag is stored as a boolean on the news item object. Featured items are surfaced prominently in category panels and are included in the featured digest sent to senior officials.
The Destacadas flag is independent of category assignment. A featured item can also belong to one or more specific government areas.
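A minimal sketch of that independence follows. The NewsItemSketch interface and its field names (titulo, categorias, destacada) are assumptions made for this example; the source states only that the flag is a boolean on the news item object.

```typescript
// Hypothetical item shape; field names are assumptions for illustration.
interface NewsItemSketch {
  titulo: string;
  categorias: string[]; // result of determineTopic()
  destacada: boolean;   // set manually during curation
}

const curatedItems: NewsItemSketch[] = [
  { titulo: 'Nuevo hospital regional', categorias: ['Salud', 'Infraestructura'], destacada: true },
  { titulo: 'Campaña de vacunación', categorias: ['Salud'], destacada: false },
  { titulo: 'Torneo de fútbol', categorias: ['General'], destacada: true },
];

// The featured digest depends only on the flag...
const featuredDigest = curatedItems.filter((item) => item.destacada);

// ...while a category panel depends only on the category labels.
const saludPanel = curatedItems.filter((item) =>
  item.categorias.some((c) => c === 'Salud'),
);

console.log(featuredDigest.map((i) => i.titulo));
console.log(saludPanel.map((i) => i.titulo));
```

The first item shows up in both lists: it is featured and also belongs to Salud. The third is featured while sitting in the General fallback category.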

Grouping and ungrouping items

Two utility methods handle display grouping:
| Method | Purpose |
| --- | --- |
| agruparNoticias() | Groups the flat array of news items by category, returning a map of category name → items array. Used to populate the category panels. |
| desagruparNoticias() | Flattens the grouped map back into a single array. Used when saving changes after curation so that the full dataset is preserved. |
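The source does not show how these two methods are implemented, so the following is only a sketch of one way the round trip could work. The names agruparSketch, desagruparSketch, and NoticiaSketch, along with the id and categorias fields, are assumptions. Because an item can sit in several categories, this version removes duplicates by id when flattening; the real methods may handle that differently.

```typescript
// Hypothetical item shape; id and categorias are assumed fields.
interface NoticiaSketch {
  id: number;
  titulo: string;
  categorias: string[];
}

// Group a flat list by category. An item with several categories
// is placed in every matching group.
function agruparSketch(items: NoticiaSketch[]): Record<string, NoticiaSketch[]> {
  const grupos: Record<string, NoticiaSketch[]> = {};
  for (const item of items) {
    for (const categoria of item.categorias) {
      if (!grupos[categoria]) grupos[categoria] = [];
      grupos[categoria].push(item);
    }
  }
  return grupos;
}

// Flatten the groups back into one list, keeping each item once.
function desagruparSketch(grupos: Record<string, NoticiaSketch[]>): NoticiaSketch[] {
  const vistos = new Set<number>();
  const plano: NoticiaSketch[] = [];
  for (const categoria of Object.keys(grupos)) {
    for (const item of grupos[categoria]) {
      if (!vistos.has(item.id)) {
        vistos.add(item.id);
        plano.push(item);
      }
    }
  }
  return plano;
}

const noticiasDemo: NoticiaSketch[] = [
  { id: 1, titulo: 'Nuevo hospital regional', categorias: ['Salud', 'Infraestructura'] },
  { id: 2, titulo: 'Paro docente', categorias: ['Educación'] },
];

const gruposDemo = agruparSketch(noticiasDemo);
const planoDemo = desagruparSketch(gruposDemo);

console.log(Object.keys(gruposDemo));    // [ 'Salud', 'Infraestructura', 'Educación' ]
console.log(planoDemo.map((n) => n.id)); // [ 1, 2 ]
```

Without the duplicate check, item 1 would come back twice after flattening, once for each panel it appears in.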

Parsing

How raw WhatsApp messages become structured news item objects.

Curation

How operators review, edit, and refine categorized items.
