The media catalog maps short abbreviations (sigla) to full outlet or program names (nombre). The application uses this catalog to identify the source of each press clipping message and display human-readable names in the curation view and dispatched digests. Like keywords, the catalog is loaded from the Google Spreadsheet at import time — no code changes are required to add or update media outlets.

Data structure

After loading, the catalog is stored in the mediosYProgramas array:
mediosYProgramas: Array<{ sigla: string; nombre: string }>
| Field | Type | Description |
| --- | --- | --- |
| sigla | string | Short abbreviation for the outlet or program (e.g. LV4, NM, C5N) |
| nombre | string | Full name of the outlet or program (e.g. La Voz 4, Noticias Mediodía, Canal 5 Noticias) |
The catalog is populated from columns D and E of the spreadsheet. Rows where either column D or column E is empty are filtered out:
importar.component.ts
this.mediosYProgramas = jsonData.table.rows
  .filter((mp: any) => mp.c[3] && mp.c[3].v && mp.c[4] && mp.c[4].v)
  .map((mp: any) => ({ sigla: mp.c[3].v, nombre: mp.c[4].v }));
this.mediosYProgramas.splice(0, 1); // remove header row
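The loading step can be sketched as a standalone function for experimentation. This is a minimal sketch, not the component's code: the `GvizCell`, `GvizRow`, and `MediaEntry` types and the `buildCatalog` name are illustrative assumptions, and the row shape is inferred only from the `table.rows[i].c[j].v` accesses in the snippet above. The sample rows are made up.

```typescript
// Shapes inferred from the snippet above; the type names are illustrative.
interface GvizCell { v: string | number | null }
interface GvizRow { c: Array<GvizCell | null> }
interface MediaEntry { sigla: string; nombre: string }

// Hypothetical helper mirroring the filter / map / splice sequence.
function buildCatalog(rows: GvizRow[]): MediaEntry[] {
  const entries = rows
    // Keep rows where columns D (index 3) and E (index 4) both hold a value.
    .filter((row) => row.c[3]?.v && row.c[4]?.v)
    .map((row) => ({ sigla: String(row.c[3]!.v), nombre: String(row.c[4]!.v) }));
  // Drop the first surviving row, which is assumed to be the header.
  entries.splice(0, 1);
  return entries;
}

// Made-up sample data: a header row, two catalog rows, one keyword-only row.
const cell = (v: string): GvizCell => ({ v });
const sampleRows: GvizRow[] = [
  { c: [null, null, null, cell('Sigla'), cell('Nombre')] },
  { c: [cell('clima'), null, null, cell('LV4'), cell('La Voz 4')] },
  { c: [cell('salud'), null, null, null, null] },
  { c: [null, null, null, cell('NM'), cell('Noticias Mediodía')] },
];

console.log(buildCatalog(sampleRows));
```

Because `splice(0, 1)` runs after the filter, it removes the first row that has both columns filled in. If the header row had an empty D or E cell, the first real catalog entry would be dropped instead.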

How abbreviations are resolved

The extractMediaAndProgram() method applies three resolution strategies in order. The first strategy that produces a match wins.

Strategy 1 — Dot-separated format

Messages from structured sources follow the pattern MEDIO.PROGRAMA. text..., where both MEDIO and PROGRAMA are abbreviations. A regular expression captures the two leading dot-terminated segments, and the method looks up each one in the catalog.
const puntoRegex = /^(.+?)\.(.+?)\./;
const puntoMatch = body.match(puntoRegex);
Example:
LV4.NM. Córdoba registró hoy un nuevo récord de temperatura...
Resolution:
| Sigla | Role | Resolved name |
| --- | --- | --- |
| LV4 | outlet | La Voz 4 |
| NM | program | Noticias Mediodía |
If the sigla is found in mediosYProgramas, the full nombre is used. If it is not found, the raw sigla is kept as-is.
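The lookup-with-fallback behavior can be illustrated as follows. This is a sketch under stated assumptions: `resolveSigla` and `resolveDotSeparated` are hypothetical names, and the lookup is assumed to be an exact string comparison on `sigla`. Only the regex is taken from the snippet above.

```typescript
interface MediaEntry { sigla: string; nombre: string }

// Hypothetical helper: return the full name, or the raw sigla when not found.
function resolveSigla(sigla: string, catalog: MediaEntry[]): string {
  const entry = catalog.find((mp) => mp.sigla === sigla); // assumed exact match
  return entry ? entry.nombre : sigla;
}

// Hypothetical helper: apply the dot-separated regex and resolve both segments.
function resolveDotSeparated(
  body: string,
  catalog: MediaEntry[],
): { media: string; program: string } | null {
  const puntoMatch = body.match(/^(.+?)\.(.+?)\./);
  if (!puntoMatch) return null; // not in MEDIO.PROGRAMA. format
  const [, medio = '', programa = ''] = puntoMatch;
  return {
    media: resolveSigla(medio, catalog),
    program: resolveSigla(programa, catalog),
  };
}

const catalog: MediaEntry[] = [
  { sigla: 'LV4', nombre: 'La Voz 4' },
  { sigla: 'NM', nombre: 'Noticias Mediodía' },
];

console.log(resolveDotSeparated('LV4.NM. Córdoba registró hoy un nuevo récord', catalog));
// → { media: 'La Voz 4', program: 'Noticias Mediodía' }
console.log(resolveDotSeparated('XYZ.NM. Texto', catalog));
// → { media: 'XYZ', program: 'Noticias Mediodía' }
```

Taken on its own, this regex also matches a body such as https://lavoz.com.ar/nota/..., so the sketch does not reproduce whatever ordering or guarding extractMediaAndProgram() applies between strategies.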

Strategy 2 — URL-based

If the message body contains https://, the method extracts the domain and matches it against the catalog.
if (body.includes('https://')) {
  let url = body.split('//')[1];
  url = url.replace('www.', '');
  // matches url.split('.')[0] against catalog
  return { media: mediaObj ? mediaObj.nombre : url.split('.')[0], program: 'web' };
}
The www. prefix is stripped before matching. The program is always set to web for URL-based sources. Example: a message containing https://lavoz.com.ar/nota/... resolves to:
| Field | Value |
| --- | --- |
| media | La Voz (if lavoz is in the catalog) or lavoz (if not) |
| program | web |
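The excerpt above leaves the catalog lookup as a comment. One possible way to fill it in is sketched below. The `resolveUrl` name and the exact-match `find` on `sigla` are assumptions, not the component's confirmed implementation.

```typescript
interface MediaEntry { sigla: string; nombre: string }

// Hypothetical helper following the steps in the excerpt above.
function resolveUrl(
  body: string,
  catalog: MediaEntry[],
): { media: string; program: string } | null {
  if (!body.includes('https://')) return null;
  // Everything after the first "//", e.g. "www.infobae.com/..."
  let url = body.split('//')[1] ?? '';
  url = url.replace('www.', '');
  // First dot-separated segment of the host, e.g. "infobae"
  const domainPrefix = url.split('.')[0] ?? '';
  // Assumption: the prefix is compared to sigla by exact equality.
  const mediaObj = catalog.find((mp) => mp.sigla === domainPrefix);
  return { media: mediaObj ? mediaObj.nombre : domainPrefix, program: 'web' };
}

const catalog: MediaEntry[] = [{ sigla: 'lavoz', nombre: 'La Voz' }];

console.log(resolveUrl('https://lavoz.com.ar/nota/123', catalog));
// → { media: 'La Voz', program: 'web' }
console.log(resolveUrl('https://www.infobae.com/politica', catalog));
// → { media: 'infobae', program: 'web' }
```

The second call shows the unrecognized case: with no infobae row in the sample catalog, the domain prefix is returned as the media value.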

Strategy 3 — Acronym prefix

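If neither of the first two strategies matches, the method looks for 1–3 uppercase letters at the start of the message, followed by a single whitespace character and a second token of 1–3 uppercase letters that ends at a word boundary:
const mediaProgramRegex = /^([A-ZÁÉÍÓÚÑÜ]{1,3})\s([A-ZÁÉÍÓÚÑÜ]{1,3})\b/;
This handles informal messages that lead with a broadcast station callsign followed by a program abbreviation. Both groups accept letters only, so a sigla that contains a digit (for example LV4 or C5N) does not match this pattern.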
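To see which message starts the pattern accepts, the regex can be exercised directly. The `matchAcronymPrefix` wrapper and the sample siglas `TN` and `AB` are illustrative assumptions. The regex itself is copied from the line above.

```typescript
const mediaProgramRegex = /^([A-ZÁÉÍÓÚÑÜ]{1,3})\s([A-ZÁÉÍÓÚÑÜ]{1,3})\b/;

// Hypothetical wrapper: return the two captured tokens, or null if no match.
function matchAcronymPrefix(body: string): { media: string; program: string } | null {
  const m = body.match(mediaProgramRegex);
  if (!m) return null;
  const [, media = '', program = ''] = m;
  return { media, program };
}

console.log(matchAcronymPrefix('TN NM texto sin punto'));
// → { media: 'TN', program: 'NM' }
console.log(matchAcronymPrefix('LV4 NM texto sin punto'));
// → null (the digit in LV4 is not in the character class)
console.log(matchAcronymPrefix('AB CDEF texto'));
// → null (the second token is longer than three letters)
console.log(matchAcronymPrefix('tn nm texto'));
// → null (lowercase letters are not accepted)
```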

Fallback

If none of the three strategies produces a match, both media and program are set to *:
return { media: '*', program: '*' };
Items with * values are still processed and can be reviewed by operators in the curation view.

Adding a new media outlet or program

1. Open the master spreadsheet

   Open the Google Spreadsheet configured in shared.service.ts. See Google Sheets integration for the spreadsheet ID.

2. Locate or add a row for the outlet

   Find an existing row for the outlet if you are adding a new program for an existing outlet, or scroll to the first empty row to add a new entry.

3. Enter the abbreviation in column D

   Type the sigla exactly as it appears in press clipping messages. Matching is performed by string equality against the extracted prefix, so casing must match.

   Column D: LV4

4. Enter the full name in column E

   Type the full display name as it should appear in digests and the curation view.

   Column E: La Voz 4

5. Save and test

   Perform a test import with a log file that contains a message from the new outlet. Verify that the outlet name appears correctly in the parsed results.

The same row can contain both keyword data (columns A–B) and media catalog data (columns D–E). Rows that have a sigla and nombre but no keyword are valid — they contribute only to the catalog.

Behavior for unrecognized outlets

When the extracted sigla or domain does not match any entry in mediosYProgramas:
| Strategy | Unrecognized behavior |
| --- | --- |
| Dot-separated | media is set to the raw sigla; program is set to the raw program abbreviation |
| URL-based | media is set to the first segment of the domain (e.g. lavoz); program is set to web |
| Acronym prefix | media and program are set to the matched uppercase tokens as-is |
| Fallback | Both media and program are set to * |
Review items where media or program equals * after each import. If the same unrecognized sigla appears repeatedly, add it to the spreadsheet to improve resolution quality.
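The post-import review can be partly automated. The sketch below counts fallback items and tallies values that did not resolve to a catalog name. It assumes parsed items expose `media` and `program` strings as described on this page. The `ParsedItem` type and the `reviewReport` function are hypothetical and not part of the application.

```typescript
interface MediaEntry { sigla: string; nombre: string }
interface ParsedItem { media: string; program: string }

// Hypothetical report: how many items hit the fallback, and which raw values
// (siglas or domain prefixes) appear without a matching catalog name.
function reviewReport(items: ParsedItem[], catalog: MediaEntry[]) {
  const knownNames = new Set(catalog.map((mp) => mp.nombre));
  const unresolved = new Map<string, number>();
  let fallbackCount = 0;
  for (const item of items) {
    if (item.media === '*' || item.program === '*') {
      fallbackCount++;
      continue;
    }
    for (const value of [item.media, item.program]) {
      // "web" is the fixed program for URL-based sources, not a sigla.
      if (value !== 'web' && !knownNames.has(value)) {
        unresolved.set(value, (unresolved.get(value) ?? 0) + 1);
      }
    }
  }
  return { fallbackCount, unresolved };
}

const catalog: MediaEntry[] = [
  { sigla: 'LV4', nombre: 'La Voz 4' },
  { sigla: 'NM', nombre: 'Noticias Mediodía' },
];
const items: ParsedItem[] = [
  { media: 'La Voz 4', program: 'Noticias Mediodía' },
  { media: 'XYZ', program: 'Noticias Mediodía' },
  { media: 'XYZ', program: 'ABC' },
  { media: 'infobae', program: 'web' },
  { media: '*', program: '*' },
];

const report = reviewReport(items, catalog);
console.log(report.fallbackCount);                // → 1
console.log(Object.fromEntries(report.unresolved)); // → { XYZ: 2, ABC: 1, infobae: 1 }
```

Values with a high count are the first candidates for new rows in columns D and E.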

Examples

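| Raw message start | Strategy used | Resolved media | Resolved program |
| --- | --- | --- | --- |
| LV4.NM. Texto... | Dot-separated | La Voz 4 | Noticias Mediodía |
| C5N.INFO. Texto... | Dot-separated | Canal 5 Noticias | Informativo |
| https://lavoz.com.ar/nota/... | URL-based | La Voz | web |
| https://www.infobae.com/... | URL-based | Infobae | web |
| XX YY texto sin punto... | Acronym prefix | XX | YY |
| (mensaje sin estructura) | Fallback | * | * |

XX and YY are placeholder siglas that are not in the catalog, so they are kept as-is. A sigla containing a digit, such as LV4, does not match the Strategy 3 regex, which accepts letters only.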

Troubleshooting

A dot-separated or acronym-prefix message shows the raw sigla instead of the full name
The sigla in the message does not match the sigla in column D of the spreadsheet exactly. Check for:
  • Trailing spaces in the spreadsheet cell
  • Case differences (the catalog lookup is case-sensitive for dot-separated and acronym strategies)
  • The outlet row sitting above the header row, which makes it the first row with values in columns D and E and therefore the one discarded by .splice(0, 1)

A URL-based message shows the domain prefix instead of the outlet name
The domain prefix (e.g. lavoz) is not present as a sigla in column D. Add a row with lavoz in column D and the full outlet name in column E.

Both media and program are set to *
The message format does not match any of the three strategies. Log the raw message body from extractMediaAndProgram() to inspect the actual format, then determine which strategy applies and whether the catalog entry is missing.
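The spreadsheet-side causes above (stray whitespace and case differences) can be checked mechanically once the catalog is loaded. This is a minimal sketch. `lintCatalog` is a hypothetical helper, not an existing function in the application.

```typescript
interface MediaEntry { sigla: string; nombre: string }

// Hypothetical check: flag siglas with surrounding whitespace and siglas
// that differ from an earlier one only by case or whitespace.
function lintCatalog(catalog: MediaEntry[]): string[] {
  const warnings: string[] = [];
  const seen = new Map<string, string>(); // normalized sigla -> first raw sigla
  for (const { sigla } of catalog) {
    if (sigla !== sigla.trim()) {
      warnings.push(`"${sigla}" has leading or trailing whitespace`);
    }
    const key = sigla.trim().toLowerCase();
    const previous = seen.get(key);
    if (previous === undefined) {
      seen.set(key, sigla);
    } else if (previous !== sigla) {
      warnings.push(`"${sigla}" differs from "${previous}" only by case or whitespace`);
    }
  }
  return warnings;
}

const warnings = lintCatalog([
  { sigla: 'LV4 ', nombre: 'La Voz 4' },       // trailing space
  { sigla: 'lv4', nombre: 'La Voz 4' },        // case variant of the row above
  { sigla: 'NM', nombre: 'Noticias Mediodía' }, // clean
]);
warnings.forEach((w) => console.log(w));
```

Running this against mediosYProgramas after an import surfaces rows that exact string equality would silently fail to match.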
