Line breaks and newlines

insertLineBreaksAfterPunctuation

Adds line breaks after punctuation marks such as periods, exclamation points, and question marks.
For the full preformatting pipeline in one pass (significantly faster and more memory-friendly on very large inputs), use preformatArabicText from the preformat module.
Parameter: text (string) - The input text containing punctuation
Returns: string - The modified text with line breaks added after punctuation.
insertLineBreaksAfterPunctuation("Text."); // "Text.\n"
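As a rough illustration of the behavior described above, the sketch below appends a newline after each sentence-ending mark. The name `insertBreaksSketch` and the exact set of marks are assumptions made for this example; the library's own implementation may differ, for instance in how it treats whitespace after the mark.

```typescript
// Hypothetical sketch, not the library's implementation.
// Assumed punctuation set: ".", "!", "?" and the Arabic question mark (U+061F).
function insertBreaksSketch(text: string): string {
  // "$1" keeps the matched mark; a newline is appended right after it.
  return text.replace(/([.!?\u061F])/g, "$1\n");
}

console.log(JSON.stringify(insertBreaksSketch("Text."))); // "Text.\n"
```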

cleanLiteralNewLines

Replaces literal newline (\n) and carriage return (\r) escape sequences, i.e. a backslash followed by the letter n or r stored as two characters, with actual line breaks.
Parameter: text (string) - The input text containing literal new lines
Returns: string - The modified text with actual line breaks.
cleanLiteralNewLines("A\\nB"); // "A\nB"
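The distinction between a literal escape sequence and a real line break is easy to miss, so here is a minimal sketch of the idea. The name `cleanLiteralNewLinesSketch` is hypothetical, and mapping every sequence (including `\r`) to a single `"\n"` is an assumption of this example rather than documented behavior.

```typescript
// Hypothetical sketch, not the library's implementation.
function cleanLiteralNewLinesSketch(text: string): string {
  // Each "\\" in the pattern matches one real backslash character.
  // The pair "\r\n" is listed first so it becomes one line break, not two.
  return text.replace(/\\r\\n|\\n|\\r/g, "\n");
}

// In source code, "A\\nB" is the four characters A, backslash, n, B.
console.log(cleanLiteralNewLinesSketch("A\\nB") === "A\nB"); // true
```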

cleanMultilines

Removes horizontal whitespace (spaces, tabs, non-breaking spaces) from the beginning and end of each line, while preserving line breaks (\n and \r). Handles various types of horizontal whitespace:
  • Regular spaces (U+0020)
  • Tabs (U+0009)
  • Non-breaking spaces (U+00A0)
  • Other Unicode horizontal whitespace characters
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with horizontal whitespace trimmed from each line.
cleanMultilines("  line1  \n  line2  "); // "line1\nline2"
cleanMultilines("\t\tindented\t\t"); // "indented"
cleanMultilines("text\n \n \n"); // "text\n\n\n"
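One way to get this per-line trimming is a single multiline regular expression. The sketch below is an assumption about how such a rule could be written, not the library's code; `trimLinesSketch` is a hypothetical name, and "horizontal whitespace" is approximated as any JavaScript whitespace character other than CR and LF.

```typescript
// Hypothetical sketch, not the library's implementation.
function trimLinesSketch(text: string): string {
  // [^\S\r\n] reads as "whitespace, but not CR or LF".
  // The "m" flag lets ^ and $ match at the start and end of every line.
  return text.replace(/^[^\S\r\n]+|[^\S\r\n]+$/gm, "");
}

console.log(JSON.stringify(trimLinesSketch("  line1  \n  line2  "))); // "line1\nline2"
```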

reduceMultilineBreaksToDouble

Reduces multiple consecutive line breaks (3 or more) to exactly 2 line breaks.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed line breaks.
reduceMultilineBreaksToDouble("Line 1\n\n\n\nLine 2"); // "Line 1\n\nLine 2"

reduceMultilineBreaksToSingle

Reduces multiple consecutive line breaks (2 or more) to exactly 1 line break.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed line breaks.
reduceMultilineBreaksToSingle("Line 1\n\nLine 2"); // "Line 1\nLine 2"
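The two reduction rules differ only in their threshold and replacement, which the following sketch makes explicit. `collapseBreaksSketch` and its `keep` parameter are hypothetical, and accepting `\r\n` as a line break is an assumption of this example.

```typescript
// Hypothetical sketch, not the library's implementation.
function collapseBreaksSketch(text: string, keep: 1 | 2): string {
  // keep = 2: runs of 3+ breaks become a blank line ("\n\n").
  // keep = 1: runs of 2+ breaks become a single "\n".
  const pattern = keep === 2 ? /(?:\r?\n){3,}/g : /(?:\r?\n){2,}/g;
  return text.replace(pattern, keep === 2 ? "\n\n" : "\n");
}

console.log(JSON.stringify(collapseBreaksSketch("Line 1\n\n\n\nLine 2", 2))); // "Line 1\n\nLine 2"
console.log(JSON.stringify(collapseBreaksSketch("Line 1\n\nLine 2", 1))); // "Line 1\nLine 2"
```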

Punctuation and spacing

addSpaceBeforeAndAfterPunctuation

Adds spaces before and after punctuation, except for certain cases like quoted text or ayah references.
Parameter: text (string) - The input text containing punctuation
Returns: string - The modified text with spaces added before and after punctuation.
addSpaceBeforeAndAfterPunctuation("Text,word"); // "Text, word"

cleanSpacesBeforePeriod

Cleans unnecessary spaces before punctuation marks such as periods, commas, and question marks.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with cleaned spaces before punctuation.
cleanSpacesBeforePeriod("This is a sentence , with extra space .");
// "This is a sentence, with extra space."
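A minimal sketch of this rule, assuming it only needs to drop spaces and tabs that sit directly before a punctuation mark. The name `cleanSpacesBeforeSketch` and the list of marks (Latin marks plus the Arabic comma, semicolon, and question mark) are assumptions for illustration.

```typescript
// Hypothetical sketch, not the library's implementation.
function cleanSpacesBeforeSketch(text: string): string {
  // Assumed marks: . , ? ! and Arabic comma (U+060C), semicolon (U+061B),
  // question mark (U+061F). Leading spaces/tabs are dropped, the mark is kept.
  return text.replace(/[ \t]+([.,?!\u060C\u061B\u061F])/g, "$1");
}

console.log(cleanSpacesBeforeSketch("This is a sentence , with extra space ."));
// This is a sentence, with extra space.
```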

removeRedundantPunctuation

Removes redundant punctuation after Arabic question marks or exclamation marks: a period (.) or Arabic comma (،) that immediately follows an Arabic question mark (؟) or an exclamation mark (!) is deleted, as it is considered redundant in proper Arabic punctuation.
Parameter: text (string) - The Arabic text to clean up
Returns: string - The text with redundant punctuation removed.
removeRedundantPunctuation("كيف حالك؟."); // "كيف حالك؟"
removeRedundantPunctuation("ممتاز!،"); // "ممتاز!"
removeRedundantPunctuation("هذا جيد."); // "هذا جيد." (unchanged)
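The rule can be pictured as one capture-and-drop replacement. In this sketch the name `removeRedundantSketch` is hypothetical, the characters are written as Unicode escapes to avoid bidirectional display issues, and removing a whole run of trailing marks (rather than just one) is an assumption.

```typescript
// Hypothetical sketch, not the library's implementation.
function removeRedundantSketch(text: string): string {
  // Group 1: Arabic question mark (U+061F) or "!".
  // Dropped: any following periods or Arabic commas (U+060C).
  return text.replace(/([\u061F!])[.\u060C]+/g, "$1");
}

console.log(removeRedundantSketch("A\u061F.") === "A\u061F"); // true
```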

normalizeSpaces

Reduces multiple spaces or tabs to a single space.
Parameter: text (string) - The input text containing extra spaces
Returns: string - The modified text with reduced spaces.
normalizeSpaces("This   is a   text"); // "This is a text"

normalizeSlashInReferences

Removes unnecessary spaces around slashes in references.
Parameter: text (string) - The input text containing references
Returns: string - The modified text with spaces removed around slashes.
normalizeSlashInReferences("127 / 11"); // "127/11"
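The two spacing rules above, normalizeSpaces and normalizeSlashInReferences, can each be approximated with one replacement. Both function names below are hypothetical, and treating only slashes between digits (Western or Arabic-Indic) as "references" is an assumption of this sketch.

```typescript
// Hypothetical sketches, not the library's implementation.
function normalizeSpacesSketch(text: string): string {
  // Any run of spaces and/or tabs becomes one space.
  return text.replace(/[ \t]+/g, " ");
}

function normalizeSlashSketch(text: string): string {
  // Only a slash with a digit on each side is tightened.
  return text.replace(/([\d\u0660-\u0669])\s*\/\s*([\d\u0660-\u0669])/g, "$1/$2");
}

console.log(normalizeSpacesSketch("This   is a   text")); // This is a text
console.log(normalizeSlashSketch("127 / 11")); // 127/11
```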

Condensing and cleaning

condenseAsterisks

Condenses multiple asterisks (*) into a single one.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed asterisks.
condenseAsterisks("***"); // "*"

condenseColons

Replaces a colon surrounded by periods (e.g., '.:.') with a single colon.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed colons.
condenseColons("This.:. is a test"); // "This: is a test"

condenseDashes

Condenses two or more consecutive dashes (--) into a single dash (-).
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed dashes.
condenseDashes("This is some ---- text"); // "This is some - text"

condenseEllipsis

Replaces sequences of two or more periods (e.g., '...') with an ellipsis character (…).
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with ellipses condensed.
condenseEllipsis("This is a test..."); // "This is a test…"

condensePeriods

Condenses multiple periods separated by spaces (e.g., '. . .') into a single period.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed periods.
condensePeriods("This . . . is a test"); // "This. is a test"
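condenseEllipsis and condensePeriods look similar but target different inputs: adjacent periods versus periods separated by spaces. The sketch below contrasts the two; the function names are hypothetical and the exact whitespace handling is an assumption chosen to reproduce the examples above.

```typescript
// Hypothetical sketches, not the library's implementation.
function condenseEllipsisSketch(text: string): string {
  // Two or more adjacent periods become the single character U+2026.
  return text.replace(/\.{2,}/g, "\u2026");
}

function condensePeriodsSketch(text: string): string {
  // A period followed by one or more space-separated periods becomes ".",
  // and any spaces before the first period are absorbed.
  return text.replace(/[ \t]*\.(?:[ \t]+\.)+/g, ".");
}

console.log(condenseEllipsisSketch("This is a test...")); // This is a test…
console.log(condensePeriodsSketch("This . . . is a test")); // This. is a test
```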

condenseUnderscores

Condenses multiple underscores (__) or Arabic Tatweel characters (ـــــ) into a single underscore or Tatweel.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed underscores.
condenseUnderscores("This is ـــ some text __"); // "This is ـ some text _"
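Several rules in this group (asterisks, dashes, underscores, Tatweel) share one pattern: collapse a run of the same character to a single occurrence. A backreference expresses that compactly. `condenseRunsSketch` is a hypothetical combined helper for illustration only; the library exposes these as separate functions.

```typescript
// Hypothetical sketch, not the library's implementation.
function condenseRunsSketch(text: string): string {
  // Group 1 captures one of: "*", "-", "_", Tatweel (U+0640).
  // \1+ then matches one or more repeats of that same character.
  return text.replace(/([*\-_\u0640])\1+/g, "$1");
}

console.log(condenseRunsSketch("***")); // *
console.log(condenseRunsSketch("This is some ---- text")); // This is some - text
```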

Quotes and brackets

applySmartQuotes

Turns regular double quotes surrounding a body of text into smart quotes. Also fixes incorrect starting quotes by ensuring the string starts with an opening quote if needed.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with smart quotes applied.
applySmartQuotes('The "quick brown" fox'); // 'The “quick brown” fox'
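A minimal sketch of the pairing step, assuming straight quotes always come in balanced pairs. `smartQuotesSketch` is a hypothetical name, and the second behavior described above (repairing an incorrect starting quote) is deliberately not covered here.

```typescript
// Hypothetical sketch, not the library's implementation.
function smartQuotesSketch(text: string): string {
  // Each balanced pair of straight quotes becomes U+201C ... U+201D.
  return text.replace(/"([^"]*)"/g, "\u201C$1\u201D");
}

console.log(smartQuotesSketch('The "quick brown" fox')); // The “quick brown” fox
```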

doubleToSingleBrackets

Replaces double parentheses or brackets with single ones.
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with condensed brackets.
doubleToSingleBrackets("((text))"); // "(text)"

ensureSpaceBeforeBrackets

Ensures exactly one space separates a word from the opening bracket that follows it: adds a space if there is none, or reduces multiple spaces to one.
Parameter: text (string) - The input text to modify
Returns: string - The modified text with proper spacing before brackets.
ensureSpaceBeforeBrackets("word(content)"); // "word (content)"

ensureSpaceBeforeQuotes

Ensures exactly one space separates a word from the Arabic opening quotation mark («) that follows it: adds a space if there is none, or reduces multiple spaces to one.
Parameter: text (string) - The input text to modify
Returns: string - The modified text with proper spacing before Arabic quotes.
ensureSpaceBeforeQuotes("word«content»"); // "word «content»"
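Both spacing rules follow the same shape: capture the last character of the preceding word, swallow any existing spaces, and re-emit exactly one. The sketch below assumes only `(` and `«` (U+00AB) act as openers; the function names are hypothetical.

```typescript
// Hypothetical sketches, not the library's implementation.
function spaceBeforeBracketsSketch(text: string): string {
  // [^\s(] = last character of the word (not whitespace, not another "(").
  return text.replace(/([^\s(])[ \t]*\(/g, "$1 (");
}

function spaceBeforeQuotesSketch(text: string): string {
  // Same idea for the Arabic opening quotation mark (U+00AB).
  return text.replace(/([^\s\u00AB])[ \t]*\u00AB/g, "$1 \u00AB");
}

console.log(spaceBeforeBracketsSketch("word(content)")); // word (content)
```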

fixBracketTypos

Fixes common bracket and quotation mark typos in text. Corrects malformed patterns such as "(«" and "»)", as well as mistyped brackets around digits, e.g. ")123)".
Parameter: text (string) - Input text that may contain bracket typos
Returns: string - Text with corrected bracket and quotation mark combinations.
fixBracketTypos("(«text»)"); // "«text»"
fixBracketTypos(")123)"); // "(123)"

fixCurlyBraces

Fixes mismatched curly braces by converting incorrect bracket/brace combinations to proper curly braces.
Parameter: text (string) - Input text that may contain mismatched curly braces
Returns: string - Text with corrected curly brace pairs.
fixCurlyBraces("(content}"); // "{content}"
fixCurlyBraces("{content)"); // "{content}"

fixMismatchedQuotationMarks

Fixes mismatched quotation marks in Arabic text by converting various incorrect bracket/quote combinations to proper Arabic quotation marks (« »).
Parameter: text (string) - Input text that may contain mismatched quotation marks
Returns: string - Text with corrected Arabic quotation marks.
fixMismatchedQuotationMarks("«text)"); // "«text»"
fixMismatchedQuotationMarks("(text»"); // "«text»"
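fixCurlyBraces and fixMismatchedQuotationMarks both repair a pair whose opener and closer disagree. A sketch of that idea follows, assuming the only wrong partner is a round parenthesis and that the enclosed content contains no other delimiters; both function names are hypothetical.

```typescript
// Hypothetical sketches, not the library's implementation.
function fixCurlySketch(text: string): string {
  return text
    .replace(/\(([^(){}]*)\}/g, "{$1}") // "(content}" -> "{content}"
    .replace(/\{([^(){}]*)\)/g, "{$1}"); // "{content)" -> "{content}"
}

function fixGuillemetsSketch(text: string): string {
  // U+00AB and U+00BB are the Arabic opening and closing quotation marks.
  return text
    .replace(/\u00AB([^()\u00AB\u00BB]*)\)/g, "\u00AB$1\u00BB")
    .replace(/\(([^()\u00AB\u00BB]*)\u00BB/g, "\u00AB$1\u00BB");
}

console.log(fixCurlySketch("(content}")); // {content}
```

Correctly matched pairs such as `(ok)` or `{ok}` pass through unchanged, because each pattern requires one delimiter of each kind.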

removeSpaceInsideBrackets

Removes leading and trailing spaces inside brackets, parentheses, or square brackets; spaces between words are kept.
Parameter: text (string) - The input text with spaces inside brackets
Returns: string - The modified text with spaces removed inside brackets.
removeSpaceInsideBrackets("( a b )"); // "(a b)"
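A short sketch of the trimming step, assuming only round and square brackets are affected. `trimInsideBracketsSketch` is a hypothetical name.

```typescript
// Hypothetical sketch, not the library's implementation.
function trimInsideBracketsSketch(text: string): string {
  return text
    .replace(/([(\[])[ \t]+/g, "$1") // spaces right after an opener
    .replace(/[ \t]+([)\]])/g, "$1"); // spaces right before a closer
}

console.log(trimInsideBracketsSketch("( a b )")); // (a b)
```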

replaceDoubleBracketsWithArrows

Replaces double parentheses with Arabic quotation marks (« », described here as arrows).
Parameter: text (string) - The input text to apply the rule to
Returns: string - The modified text with double parentheses replaced by Arabic quotation marks.
replaceDoubleBracketsWithArrows("((text))"); // "«text»"
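This page documents two different treatments of `((text))`: doubleToSingleBrackets collapses it to `(text)`, while replaceDoubleBracketsWithArrows turns it into `«text»`. The sketch below puts the two side by side; both function names are hypothetical and the patterns are assumptions that reproduce the documented examples.

```typescript
// Hypothetical sketches, not the library's implementation.
function doubleToSingleSketch(text: string): string {
  // A repeated bracket character collapses to one occurrence.
  return text.replace(/([()\[\]])\1+/g, "$1");
}

function doubleParensToGuillemetsSketch(text: string): string {
  // "((" ... "))" becomes U+00AB ... U+00BB.
  return text.replace(/\(\(([^()]*)\)\)/g, "\u00AB$1\u00BB");
}

console.log(doubleToSingleSketch("((text))")); // (text)
console.log(doubleParensToGuillemetsSketch("((text))")); // «text»
```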

trimSpaceInsideQuotes

Removes unnecessary spaces inside quotes.
Parameter: text (string) - The input text with spaces inside quotes
Returns: string - The modified text with spaces removed inside quotes.
trimSpaceInsideQuotes('" Text "'); // '"Text"'
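A minimal sketch of the quote-trimming rule, assuming it applies to balanced straight double quotes only. `trimInsideQuotesSketch` is a hypothetical name.

```typescript
// Hypothetical sketch, not the library's implementation.
function trimInsideQuotesSketch(text: string): string {
  // The lazy group captures the quoted content without its
  // leading and trailing whitespace.
  return text.replace(/"\s*([^"]*?)\s*"/g, '"$1"');
}

console.log(trimInsideQuotesSketch('" Text "')); // "Text"
```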

Text detection and analysis

hasWordInSingleLine

Detects whether any line of the text consists of a single word.
Parameter: text (string) - The text to check
Returns: boolean - True if at least one line in the text consists of a single word.
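No example is given for this function, so the following is only a guess at the intended check: split the text into lines and test whether any trimmed line is one unbroken run of non-whitespace characters. `hasLoneWordSketch` and its definition of "word" are assumptions.

```typescript
// Hypothetical sketch, not the library's implementation.
function hasLoneWordSketch(text: string): boolean {
  // A line counts as a lone word if, once trimmed, it has no inner whitespace.
  return text.split(/\r?\n/).some((line) => /^\S+$/.test(line.trim()));
}

console.log(hasLoneWordSketch("intro line\nword\nmore text here")); // true
console.log(hasLoneWordSketch("two words\nthree more words")); // false
```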

isOnlyPunctuation

Checks if the input string consists of only punctuation characters.
Parameter: text (string) - The input text to check
Returns: boolean - True if the string contains only punctuation, false otherwise.
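One possible reading of "only punctuation" uses the Unicode Punctuation category, which also covers Arabic marks such as ؟ and ،. This sketch is an assumption about the check, not the library's definition: it treats an empty string as false and does not count symbols such as `+` or `$` as punctuation. `isOnlyPunctuationSketch` is a hypothetical name.

```typescript
// Hypothetical sketch, not the library's implementation.
// The pattern is built with the RegExp constructor so the Unicode property
// escape \p{P} is interpreted at runtime (needs an ES2018+ engine).
const ONLY_PUNCTUATION = new RegExp("^\\p{P}+$", "u");

function isOnlyPunctuationSketch(text: string): boolean {
  return ONLY_PUNCTUATION.test(text);
}

console.log(isOnlyPunctuationSketch("?!.")); // true
console.log(isOnlyPunctuationSketch("a.")); // false
```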

isAllUppercase

Detects if text is entirely in uppercase letters.
Parameter: text (string) - The text to check
Returns: boolean - True if the text contains at least one letter and all alphabetic characters are uppercase, false otherwise.
isAllUppercase("HELLO WORLD"); // true
isAllUppercase("Hello World"); // false
isAllUppercase("123"); // false (no letters)
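The third example ("123" returns false) implies that the text must contain at least one cased letter. A sketch that reproduces all three examples with plain string comparison; `isAllUppercaseSketch` is a hypothetical name and this is not necessarily how the library implements the check.

```typescript
// Hypothetical sketch, not the library's implementation.
function isAllUppercaseSketch(text: string): boolean {
  // If upper- and lower-casing give the same string, there are no cased letters.
  const hasCasedLetter = text.toUpperCase() !== text.toLowerCase();
  // All letters are uppercase when upper-casing changes nothing.
  return hasCasedLetter && text === text.toUpperCase();
}

console.log(isAllUppercaseSketch("HELLO WORLD")); // true
console.log(isAllUppercaseSketch("Hello World")); // false
console.log(isAllUppercaseSketch("123")); // false
```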

Sentence and paragraph formatting

formatStringBySentence

Formats a multiline string by joining sentences and maintaining footnotes on their own lines. Footnotes are identified by Arabic and English numerals.
Parameter: input (string) - The input text containing sentences and footnotes
Returns: string - The formatted text.
formatStringBySentence("Sentence one.\n(1) A footnote.\nSentence two.");
// Maintains footnotes on their own lines while joining regular sentences

Styling and case conversion

stripBoldStyling

Removes bold styling from text by normalizing the string and removing stylistic characters.
Parameter: text (string) - The input text containing bold characters
Returns: string - The modified text with bold styling removed.

stripItalicsStyling

Removes italicized characters by replacing italic Unicode characters with their normal counterparts.
Parameter: text (string) - The input text containing italicized characters
Returns: string - The modified text with italics removed.
stripItalicsStyling("𝘼𝘽𝘾"); // "ABC"

stripStyling

Removes all bold and italic styling from the input text.
Parameter: text (string) - The input text to remove styling from
Returns: string - The modified text with all styling removed.
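Styled letters like those in the stripItalicsStyling example come from the Unicode Mathematical Alphanumeric Symbols block (U+1D400 to U+1D7FF), and compatibility normalization (NFKD) maps them back to plain letters. The sketch below assumes that approach and limits it to that block so other text, including Arabic, is left untouched. `stripMathStylingSketch` is a hypothetical name and the library may work differently.

```typescript
// Hypothetical sketch, not the library's implementation.
function stripMathStylingSketch(text: string): string {
  // U+1D400..U+1D7FF is exactly the surrogate pairs D835 + DC00..DFFF,
  // so the block can be matched without the "u" flag.
  return text.replace(/\uD835[\uDC00-\uDFFF]/g, (ch: string) => ch.normalize("NFKD"));
}

console.log(stripMathStylingSketch("𝘼𝘽𝘾")); // ABC
```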

toTitleCase

Converts a string to title case (first letter of each word capitalized).
Parameter: str (string) - The input string to convert
Returns: string - String with each word’s first letter capitalized.
toTitleCase("hello world"); // "Hello World"
toTitleCase("the quick brown fox"); // "The Quick Brown Fox"
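A minimal sketch of title-casing that uppercases the first character of each whitespace-separated word and leaves the remaining characters as they are. Whether the library also lowercases the rest of each word is not stated here, so that is left out; `toTitleCaseSketch` is a hypothetical name.

```typescript
// Hypothetical sketch, not the library's implementation.
function toTitleCaseSketch(str: string): string {
  // sep = start of string or a whitespace character; first = first character of the word.
  return str.replace(
    /(^|\s)(\S)/g,
    (_match: string, sep: string, first: string) => sep + first.toUpperCase()
  );
}

console.log(toTitleCaseSketch("the quick brown fox")); // The Quick Brown Fox
```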
