AI-powered topic relevance scoring system that prioritizes study content based on exam board statistics and fuzzy matching algorithms
The Relevance Engine is the core intelligence system that analyzes exam topics from editais (official exam syllabi) and assigns priority scores based on historical data from specific exam boards (bancas). It combines incidence statistics with confidence matching to generate P1/P2/P3 priority classifications.
Edital Ranking - Sorts all topics by final relevance score
The engine uses a combination of exact matching, partial inclusion, and NLP-based fuzzy matching to handle variations in topic naming across different exam boards.
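The cascade can be sketched as follows. Note that `normalize`, `matchTopic`, the 0.8 partial-match score, and the injected similarity function are illustrative assumptions, not the engine's actual `findBestMatch` implementation (covered in NLP & Fuzzy Match):

```javascript
// Illustrative cascade: exact match → partial inclusion → fuzzy similarity.
function normalize(s) {
  // Lowercase and strip accents so "Princípios" matches "principios"
  return s.toLowerCase().normalize('NFD').replace(/[\u0300-\u036f]/g, '').trim();
}

function matchTopic(editalTopic, bancaTopic, similarity) {
  const a = normalize(editalTopic);
  const b = normalize(bancaTopic);
  if (a === b) return { score: 1.0, method: 'exact' };
  if (a.includes(b) || b.includes(a)) return { score: 0.8, method: 'partial' };
  return { score: similarity(a, b), method: 'fuzzy' };
}
```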
The calculateFinalRelevance function is the heart of the scoring system. It takes an exam topic and returns a priority classification with justification.
The incidence score represents how frequently this topic appears in exams from the specific board:
```javascript
// From weight (0-100 or 0-1)
if (ht.weight !== undefined) {
  incidenceScore = ht.weight;
  if (incidenceScore > 1) incidenceScore = incidenceScore / 100;
}
// From rank (1 = highest priority)
else if (ht.rank !== undefined) {
  // Rank #1 = maximum weight, decaying progressively
  incidenceScore = 1 / Math.pow(ht.rank, 0.7);
}
// From categorical level
else if (ht.level !== undefined) {
  incidenceScore = ht.level === 'ALTA' ? 1.0 : (ht.level === 'MEDIA' ? 0.6 : 0.3);
}
```
Why use power of 0.7 for rank decay?
The exponent 0.7 creates a power-law decay curve, gentler than a plain 1/rank drop-off, that prevents lower-ranked items from collapsing toward zero too quickly. This ensures that even rank #10 topics retain meaningful relevance:
Rank 1: score = 1.00
Rank 2: score = 0.62
Rank 5: score = 0.32
Rank 10: score = 0.20
This models the real-world distribution where top topics are critical but secondary topics still matter.
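As a quick sketch of the decay formula (`rankScore` is a hypothetical helper name; in the engine this logic lives inline in the incidence-score branch above):

```javascript
// Power-law rank decay: rank #1 gets full weight, lower ranks decay gently.
function rankScore(rank) {
  return 1 / Math.pow(rank, 0.7);
}

console.log(rankScore(1).toFixed(2));  // "1.00"
console.log(rankScore(2).toFixed(2));  // "0.62"
console.log(rankScore(10).toFixed(2)); // "0.20"
```

Compare with a plain harmonic decay (1 / rank), which would give rank #10 a score of only 0.10 - the 0.7 exponent keeps the tail twice as relevant.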
The match score reflects how confident the system is that the edital topic matches the banca topic. This is calculated by the findBestMatch function (see NLP & Fuzzy Match).
```javascript
// Final score combines both factors
let finalScore = incidenceScore * matchStruct.score;
```
Critical: If the string match is only 50% confident, the final score is cut in half - even if the topic has 100% incidence. This prevents false positives from inflating priorities.
P1 (Priority 1): Topics with ≥75% score - highest incidence and confident match. Study these first.
P2 (Priority 2): Topics with 40-74% score - moderate importance or uncertain matches. Study after P1.
P3 (Priority 3): Topics with less than 40% score - low incidence or poor match confidence. Study if time permits.
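A minimal sketch of these thresholds (`classifyPriority` is an illustrative name, not the engine's API):

```javascript
// P1/P2/P3 bands applied to the combined final score
function classifyPriority(finalScore) {
  if (finalScore >= 0.75) return 'P1';
  if (finalScore >= 0.40) return 'P2';
  return 'P3';
}

// A topic with 100% incidence but only 50% match confidence lands in P2:
console.log(classifyPriority(1.0 * 0.5)); // "P2"
```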
After global ranking, priorities are reassigned based on position:
Top 20% → P1 (highest priority topics across all disciplines)
Next 40% (20-60%) → P2
Bottom 40% (60-100%) → P3
Why recompute priorities after sorting?
Individual topic scores are relative to their banca data, but study priorities should be relative to the entire edital. A topic might score 0.80 individually, but if 25% of topics score higher, it becomes P2 in context. This ensures the P1 classification truly represents the most critical topics for this specific exam.
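The reassignment step can be sketched like this (`reassignPriorities` and the topic shape are assumptions for illustration, not the engine's real API):

```javascript
// Re-rank globally, then assign priority by position in the ranking:
// top 20% → P1, next 40% → P2, bottom 40% → P3.
function reassignPriorities(topics) {
  const sorted = [...topics].sort((a, b) => b.finalScore - a.finalScore);
  return sorted.map((t, i) => {
    const pct = i / sorted.length; // position as a fraction of the ranking
    const priority = pct < 0.2 ? 'P1' : pct < 0.6 ? 'P2' : 'P3';
    return { ...t, priority };
  });
}

const ranked = reassignPriorities([
  { name: 'A', finalScore: 0.9 },
  { name: 'B', finalScore: 0.8 },
  { name: 'C', finalScore: 0.5 },
  { name: 'D', finalScore: 0.3 },
  { name: 'E', finalScore: 0.1 },
]);
// With 5 topics: index 0 → P1; indices 1-2 → P2; indices 3-4 → P3.
// Topic B scores 0.80 yet ends up P2, exactly the situation described above.
```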
The system supports manual corrections through userMappings - allowing students to fix incorrect matches:
```javascript
const userMap = state.bancaRelevance?.userMappings?.[editalSubjectName];

// User marked as "no incidence"
if (userMap === 'NONE') {
  return {
    matchedItem: null,
    score: 0.05,
    confidence: 'HIGH',
    reason: 'Marcado como sem incidência pelo usuário'
  };
}

// User manually linked to a specific banca topic
const forcedMatch = hotTopics.find(h => h.id === userMap);
if (forcedMatch) {
  return {
    matchedItem: forcedMatch,
    score: 1.0,
    confidence: 'HIGH',
    reason: 'Mapeamento Fixado Manualmente'
  };
}
```
Manual mappings always take precedence over algorithmic matching. This is critical for exam-specific terminology variations that the fuzzy matcher might miss.
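Putting the precedence together, a hedged sketch (`resolveMatch`, the parameter shapes, and the injected `fuzzyMatch` fallback are assumptions; the real engine reads the mappings from `state` as shown above):

```javascript
// Manual mappings win; the fuzzy matcher only runs when no mapping exists.
function resolveMatch(subject, hotTopics, userMappings, fuzzyMatch) {
  const userMap = userMappings?.[subject];
  if (userMap === 'NONE') {
    return { matchedItem: null, score: 0.05, confidence: 'HIGH',
             reason: 'Marcado como sem incidência pelo usuário' };
  }
  const forced = hotTopics.find(h => h.id === userMap);
  if (forced) {
    return { matchedItem: forced, score: 1.0, confidence: 'HIGH',
             reason: 'Mapeamento Fixado Manualmente' };
  }
  return fuzzyMatch(subject, hotTopics); // algorithmic fallback
}
```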