Arabic prefix normalization
normalizeArabicPrefixesToAl
Replaces common Arabic prefixes (such as ‘Al-’, ‘Ar-’, ‘Ash-’) with ‘al-’ in the text. Handles prefix variations such as ‘Ash-’ and ‘Al-’, but only when the second word starts with ‘S’.
The input text containing Arabic prefixes
string - The modified text with standardized ‘al-’ prefixes.
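The behaviour described above can be approximated with two regular expressions. This is only a sketch: the function name `normalizeArabicPrefixesToAlSketch`, the prefix list, and the patterns are assumptions made for illustration, not the library's actual code.

```typescript
// Hypothetical prefix list; the real function may cover a different set.
const ASSIMILATED_PREFIXES = ["Al", "Ar", "As", "Ad", "Adh", "At", "Ath", "Az"];

function normalizeArabicPrefixesToAlSketch(text: string): string {
    // Matches any listed prefix followed by a hyphen, at a word boundary.
    const simple = new RegExp(`\\b(?:${ASSIMILATED_PREFIXES.join("|")})-`, "gi");
    return text
        .replace(simple, "al-")
        // 'Ash-' is rewritten only when the following word starts with 'S'.
        .replace(/\bash-(?=s)/gi, "al-");
}

console.log(normalizeArabicPrefixesToAlSketch("Al-Bukhari and Ash-Shafi'i"));
```

Under these assumptions, ‘Ash-Tree’ is left alone because the word after the hyphen does not start with ‘S’.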
removeArabicPrefixes
Strips common Arabic prefixes like ‘al-’, ‘bi-’, ‘fī’, ‘wa-’, etc. from the beginning of words.
The input text containing Arabic prefixes
string - The modified text with prefixes stripped.
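One way to picture prefix stripping is a single regular expression anchored at the start of each word. The name `removeArabicPrefixesSketch` and the exact prefix set are hypothetical, and treating ‘fī’ as a separate leading word followed by whitespace is an assumption.

```typescript
function removeArabicPrefixesSketch(text: string): string {
    // (^|\s) keeps the leading whitespace; the group strips one or more
    // stacked prefixes such as "wa-al-". \b is avoided because "ī" is not
    // a word character for JavaScript's \b.
    return text.replace(/(^|\s)(?:al-|bi-|wa-|fī\s+)+/gi, "$1");
}

console.log(removeArabicPrefixesSketch("al-Bukhari wa-Muslim"));
```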
Apostrophe handling
normalizeDoubleApostrophes
Collapses doubled Arabic apostrophes such as ʿʿ or ʾʾ into a single apostrophe.
The input text containing double apostrophes
string - The modified text with condensed apostrophes.
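A backreference is enough to illustrate the idea. The sketch below (hypothetical name `normalizeDoubleApostrophesSketch`) assumes that only ʿ (U+02BF) and ʾ (U+02BE) are affected and that runs longer than two are also collapsed, which the description does not state.

```typescript
function normalizeDoubleApostrophesSketch(text: string): string {
    // ([ʿʾ]) captures one apostrophe; \1+ matches repeats of the same one.
    return text.replace(/([ʿʾ])\1+/g, "$1");
}

console.log(normalizeDoubleApostrophesSketch("Qurʾʾan"));
```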
Salutation replacement
replaceSalutationsWithSymbol
Replaces common salutations with the ﷺ symbol. Handles 130+ variations including:
- Arabic script (with and without diacritics)
- Latin transliterations (various romanization schemes)
- Abbreviations (PBUH, SAWS, SAW, etc.)
- English phrases (“peace and blessings be upon him”)
- Parenthetical forms
The input text containing salutations
string - The modified text with salutations replaced by ﷺ.
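The categories listed above can be illustrated with a small table of patterns applied in sequence. The three patterns below are a hypothetical subset chosen for the example; they do not reproduce the 130+ variations the function handles, and `replaceSalutationsWithSymbolSketch` is an illustrative name.

```typescript
// Hypothetical subset of patterns; each tolerates optional parentheses.
const SALUTATION_PATTERNS: RegExp[] = [
    // English phrase
    /\(?\bpeace and blessings be upon him\b\)?/gi,
    // Abbreviations (case-sensitive so the verb "saw" is not matched)
    /\(?\b(?:PBUH|SAWS|SAW)\b\)?/g,
    // Arabic script without diacritics
    /\(?صلى الله عليه وسلم\)?/g,
];

function replaceSalutationsWithSymbolSketch(text: string): string {
    return SALUTATION_PATTERNS.reduce(
        (acc, pattern) => acc.replace(pattern, "ﷺ"),
        text,
    );
}

console.log(replaceSalutationsWithSymbolSketch("The Prophet (PBUH) said"));
```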
Text normalization
normalize
Normalizes the text by removing diacritics, apostrophes, and dashes.
The input text to normalize
string - The normalized text.
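A common way to strip diacritics in JavaScript is to decompose the string with `String.prototype.normalize("NFD")` and drop the combining marks. The sketch below takes that approach; the name `normalizeSketch`, the apostrophe set, and the choice to delete dashes rather than replace them with spaces are assumptions.

```typescript
function normalizeSketch(text: string): string {
    return text
        .normalize("NFD")                 // split letters from combining marks
        .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
        .replace(/[ʿʾ'’‘`]/g, "")         // drop apostrophe-like characters
        .replace(/-/g, "");               // drop hyphens
}

console.log(normalizeSketch("al-Ṭabarī"));
```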
normalizeTransliteratedEnglish
Simplifies English transliterations by removing diacritics, apostrophes, and common prefixes.
The input text to simplify
string - The simplified text.
Initial extraction
extractInitials
Extracts the initials from the input string, typically used for names or titles.
The input text to extract initials from
string - The extracted initials.
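As a rough illustration, initials can be taken as the first character of each whitespace-separated word. The description does not say whether the result is upper-cased or whether prefixes are skipped, so the behaviour of `extractInitialsSketch` below (a hypothetical name) is an assumption on both points.

```typescript
function extractInitialsSketch(text: string): string {
    return text
        .trim()
        .split(/\s+/)                       // split on runs of whitespace
        .filter((word) => word.length > 0)  // guards against empty input
        .map((word) => word.charAt(0).toUpperCase())
        .join("");
}

console.log(extractInitialsSketch("Sahih al Bukhari"));
```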