
Arabic prefix normalization

normalizeArabicPrefixesToAl

Replaces common Arabic prefixes (such as ‘Al-’, ‘Ar-’, ‘Ash-’) with ‘al-’ in the text. A variant such as ‘Ash-’ is replaced only when the word that follows starts with ‘S’ (as in ‘Ash-Shafiee’); otherwise it is left unchanged.
Parameter: text (string) - The input text containing Arabic prefixes
Returns: string - The modified text with standardized ‘al-’ prefixes.
normalizeArabicPrefixesToAl("Ash-Shafiee"); // "al-Shafiee"
normalizeArabicPrefixesToAl("Al-Bukhari"); // "al-Bukhari"
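The rule described above (replace an assimilated prefix only when the next word starts with the same sound) can be sketched with a regular expression and a lookahead. This is a minimal illustration, not the library's implementation: the function name `sketchNormalizePrefixesToAl` and the `ASSIMILATED_CONSONANTS` list are assumptions made for this example.

```typescript
// Hypothetical sketch; not the library's actual implementation.
// Consonants that may replace the "l" of "al-" in transliteration (assumed list).
const ASSIMILATED_CONSONANTS = ["sh", "th", "dh", "r", "s", "d", "n", "t", "z"];

function sketchNormalizePrefixesToAl(text: string): string {
  // Match "A" + one or two lowercase letters + "-" at a word boundary,
  // and peek at the following word without consuming it.
  return text.replace(
    /\b[Aa]([a-z]{1,2})-(?=([A-Za-z]+))/g,
    (match: string, consonant: string, nextWord: string) => {
      // A plain "Al-" / "al-" is always normalized.
      if (consonant === "l") return "al-";
      // An assimilated form is normalized only when the next word
      // starts with the same consonant(s), e.g. "Ash-" before "Sh...".
      const assimilated =
        ASSIMILATED_CONSONANTS.includes(consonant) &&
        nextWord.toLowerCase().startsWith(consonant);
      return assimilated ? "al-" : match;
    },
  );
}

console.log(sketchNormalizePrefixesToAl("Ash-Shafiee")); // "al-Shafiee"
console.log(sketchNormalizePrefixesToAl("Ash-Bukhari")); // "Ash-Bukhari" (left as-is)
```

The lookahead keeps the following word out of the match, so only the prefix itself is rewritten.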

removeArabicPrefixes

Strips common Arabic prefixes like ‘al-’, ‘bi-’, ‘fī’, ‘wa-’, etc. from the beginning of words.
Parameter: text (string) - The input text containing Arabic prefixes
Returns: string - The modified text with prefixes stripped.
removeArabicPrefixes("al-Bukhari"); // "Bukhari"
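One way to approximate this behavior is a single regular expression anchored at the start of each word. The sketch below is an assumption-laden illustration: the name `sketchRemovePrefixes` and the short prefix list are hypothetical, and it only handles hyphenated prefixes, which may be narrower than what the library does.

```typescript
// Hypothetical sketch; the prefix list is an assumed subset.
function sketchRemovePrefixes(text: string): string {
  // (^|[\s(]) keeps whatever precedes the word (start of text, whitespace, or "(").
  // The repeated group removes chained prefixes such as "wa-al-".
  return text.replace(/(^|[\s(])(?:(?:al|bi|wa|fi|fī)-)+/gi, "$1");
}

console.log(sketchRemovePrefixes("al-Bukhari"));       // "Bukhari"
console.log(sketchRemovePrefixes("Sahih al-Bukhari")); // "Sahih Bukhari"
console.log(sketchRemovePrefixes("Abd-al-Rahman"));    // "Abd-al-Rahman"
```

Anchoring on whitespace rather than a generic word boundary is a deliberate choice here: it leaves a medial ‘al-’ inside a compound name untouched.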

Apostrophe handling

normalizeDoubleApostrophes

Collapses doubled Arabic apostrophes such as ʿʿ or ʾʾ into a single character.
Parameter: text (string) - The input text containing double apostrophes
Returns: string - The modified text with condensed apostrophes.
normalizeDoubleApostrophes("ʿulamāʾʾ"); // "ʿulamāʾ"
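Collapsing a repeated character is a natural fit for a regex backreference. The following is a minimal sketch under the assumption that the two apostrophes are U+02BF (ʿ, ʿayn) and U+02BE (ʾ, hamza); the function name `sketchCollapseApostrophes` is hypothetical.

```typescript
// Hypothetical sketch; assumes ʿ is U+02BF and ʾ is U+02BE.
function sketchCollapseApostrophes(text: string): string {
  // ([...]) captures one apostrophe; \1+ matches one or more repeats
  // of that same character, so a mixed pair like "ʿʾ" is left untouched.
  return text.replace(/([\u02BF\u02BE])\1+/g, "$1");
}

console.log(sketchCollapseApostrophes("ʿulamāʾʾ")); // "ʿulamāʾ"
```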

Salutation replacement

replaceSalutationsWithSymbol

Replaces common salutations with the ﷺ symbol. Handles 130+ variations including:
  • Arabic script (with and without diacritics)
  • Latin transliterations (various romanization schemes)
  • Abbreviations (PBUH, SAWS, SAW, etc.)
  • English phrases (“peace and blessings be upon him”)
  • Parenthetical forms
Parameter: text (string) - The input text containing salutations
Returns: string - The modified text with salutations replaced by ﷺ.
replaceSalutationsWithSymbol("Muhammad (PBUH)"); // "Muhammad ﷺ"
replaceSalutationsWithSymbol("Prophet Muhammad peace be upon him"); // "Prophet Muhammad ﷺ"
replaceSalutationsWithSymbol("النبي صلى الله عليه وسلم"); // "النبي ﷺ"
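A replacement like this is commonly driven by a table of patterns applied in sequence. The sketch below covers only four illustrative patterns, far fewer than the 130+ variations mentioned above; the names `SALUTATION_PATTERNS` and `sketchReplaceSalutations` are assumptions for this example, not the library's internals.

```typescript
// Hypothetical sketch covering a handful of forms only.
const SYMBOL = "\uFDFA"; // ﷺ

const SALUTATION_PATTERNS: RegExp[] = [
  // Parenthetical abbreviations first, so the parentheses are consumed too.
  /\(\s*(?:PBUH|SAWS|SAW)\s*\)/gi,
  // Bare abbreviations: case-sensitive so the English verb "saw" is not replaced.
  /\b(?:PBUH|SAWS|SAW)\b/g,
  // English phrases, with or without "and blessings".
  /\bpeace (?:and blessings )?be upon him\b/gi,
  // Arabic script without diacritics.
  /صلى الله عليه وسلم/g,
];

function sketchReplaceSalutations(text: string): string {
  // Apply each pattern in order to the running result.
  return SALUTATION_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, SYMBOL),
    text,
  );
}

console.log(sketchReplaceSalutations("Muhammad (PBUH)")); // "Muhammad ﷺ"
```

Pattern order matters in this design: matching the parenthetical form before the bare abbreviation avoids leaving empty parentheses behind.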

Text normalization

normalize

Normalizes the text by removing diacritics, apostrophes, and dashes.
Parameter: input (string) - The input text to normalize
Returns: string - The normalized text.
normalize("Al-Jadwal"); // "AlJadwal"
normalize("āḍġḥīṣṭū"); // "adghistu"
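Diacritic removal of this kind is often done with Unicode canonical decomposition (NFD) followed by stripping combining marks. The sketch below assumes that approach; the library may do it differently, and the name `sketchNormalize` and the exact set of apostrophe and dash characters are assumptions.

```typescript
// Hypothetical sketch; assumes NFD decomposition is an acceptable approach.
function sketchNormalize(input: string): string {
  return input
    // Split precomposed letters such as "ā" into "a" + combining macron.
    .normalize("NFD")
    // Drop the combining marks (U+0300 to U+036F).
    .replace(/[\u0300-\u036f]/g, "")
    // Drop transliteration apostrophes, quote-style apostrophes, and dashes.
    .replace(/[\u02BE\u02BF'\u2018\u2019\u2010-\u2015-]/g, "");
}

console.log(sketchNormalize("Al-Jadwal")); // "AlJadwal"
console.log(sketchNormalize("āḍġḥīṣṭū"));  // "adghistu"
```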

normalizeTransliteratedEnglish

Simplifies English transliterations by removing diacritics, apostrophes, and common prefixes.
Parameter: text (string) - The input text to simplify
Returns: string - The simplified text.
normalizeTransliteratedEnglish("Al-Jadwal"); // "Jadwal"
normalizeTransliteratedEnglish("āḍġḥīṣṭū"); // "adghistu"

Initial extraction

extractInitials

Extracts the initials from the input string, typically used for names or titles.
Parameter: fullName (string) - The input text to extract initials from
Returns: string - The extracted initials.
extractInitials("Nayl al-Awtar"); // "NA"
extractInitials("Sahih al-Bukhari"); // "SB"
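Both examples above skip the ‘al-’ prefix when picking letters. A minimal sketch consistent with that behavior follows; the name `sketchExtractInitials` and the prefix list are assumptions, and the library may treat other inputs (punctuation, non-Latin scripts) differently.

```typescript
// Hypothetical sketch; consistent with the two documented examples only.
function sketchExtractInitials(fullName: string): string {
  return fullName
    .trim()
    .split(/\s+/)
    // Drop a leading hyphenated prefix so "al-Awtar" contributes "A".
    .map((word) => word.replace(/^(?:al|bi|wa)-/i, ""))
    // Ignore empty fragments (for example, from an empty input string).
    .filter((word) => word.length > 0)
    .map((word) => word.charAt(0).toUpperCase())
    .join("");
}

console.log(sketchExtractInitials("Nayl al-Awtar"));    // "NA"
console.log(sketchExtractInitials("Sahih al-Bukhari")); // "SB"
```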
