MarkdownView uses swift-cmark — Apple’s Swift wrapper around the reference CommonMark implementation — to parse markdown into an Abstract Syntax Tree (AST). This ensures spec-compliant parsing with full GitHub Flavored Markdown (GFM) support.

Parser initialization

The MarkdownParser class creates a cmark parser with GFM extensions enabled:
Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift
func withParser<T>(_ block: (UnsafeMutablePointer<cmark_parser>) -> T) -> T {
    let parser = cmark_parser_new(CMARK_OPT_DEFAULT)!
    cmark_gfm_core_extensions_ensure_registered()
    let extensionNames = [
        "autolink",
        "strikethrough",
        "tagfilter",
        "tasklist",
        "table",
    ]
    for extensionName in extensionNames {
        guard let syntaxExtension = cmark_find_syntax_extension(extensionName) else {
            assertionFailure()
            continue
        }
        cmark_parser_attach_syntax_extension(parser, syntaxExtension)
    }
    defer { cmark_parser_free(parser) }
    return block(parser)
}
The parser is created on-demand and freed after use to minimize memory footprint. This design allows parsing to be fully synchronous while keeping overhead low.
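The create, lend, free shape of withParser is a general scoped-resource pattern. The following is a minimal sketch of that pattern only: withScratchBuffer is a hypothetical helper that stands in for the cmark calls so the example runs without swift-cmark.

```swift
// Hypothetical helper (not part of MarkdownView): allocates a buffer,
// lends it to `body`, and releases it on every exit path.
func withScratchBuffer<T>(capacity: Int, _ body: (UnsafeMutablePointer<UInt8>) -> T) -> T {
    let buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: capacity)
    buffer.initialize(repeating: 0, count: capacity)
    // `defer` runs after `body` has returned, playing the role that
    // cmark_parser_free plays in withParser.
    defer {
        buffer.deinitialize(count: capacity)
        buffer.deallocate()
    }
    return body(buffer)
}

// The pointer must not escape the closure; only the computed value leaves.
let sum = withScratchBuffer(capacity: 4) { buffer -> Int in
    for i in 0..<4 { buffer[i] = UInt8(i + 1) }
    return (0..<4).reduce(0) { $0 + Int(buffer[$1]) }
}
print(sum)
```

Because the resource never outlives the call, callers cannot forget to free it, which is what makes the on-demand parser safe to use from a plain synchronous function.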

GFM extensions

MarkdownView enables five GitHub Flavored Markdown extensions:
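autolink: Turns bare URLs and email addresses into links without requiring angle brackets.
strikethrough: Adds support for ~~deleted text~~ syntax:
~~This text is crossed out~~
tagfilter: Escapes a fixed set of potentially dangerous raw HTML tags (such as <script> and <iframe>) for security.
tasklist: Enables checkbox-style task lists:
- [x] Completed task
- [ ] Incomplete task
table: Parses pipe tables with column alignment:
| Left | Center | Right |
|:-----|:------:|------:|
| A    | B      | C     |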

Basic parsing

The parse(_:) method is the primary entry point:
Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift
public func parse(_ markdown: String) -> ParseResult {
    let math = MathContext(preprocessText: markdown)
    math.process()
    let markdown = math.indexedContent ?? markdown
    let nodes = withParser { parser in
        markdown.withCString { str in
            cmark_parser_feed(parser, str, strlen(str))
            return cmark_parser_finish(parser)
        }
    }
    var blocks = dumpBlocks(root: nodes)
    blocks = finalizeMathBlocks(blocks, mathContext: math)
    return .init(document: blocks, mathContext: math.contents)
}
The parsing pipeline has three stages:
  1. Math preprocessing — Extract LaTeX expressions before cmark sees them
  2. Swift-cmark parsing — Convert markdown to cmark AST
  3. AST conversion — Transform cmark nodes into Swift types

Parse result

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift
public struct ParseResult {
    public let document: [MarkdownBlockNode]
    public let mathContext: [Int: String]
}
  • document: Array of top-level block nodes (headings, paragraphs, lists, etc.)
  • mathContext: Dictionary mapping replacement identifiers to original LaTeX strings

Math preprocessing

LaTeX expressions are extracted before cmark parsing to prevent markdown syntax inside math from being interpreted:
Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift
private let mathPattern: NSRegularExpression? = {
    let patterns = [
        ###"\$\$([\s\S]*?)\$\$"###, // Block: $$ ... $$
        ###"\\\\\[([\s\S]*?)\\\\\]"###, // Block: \\[ ... \\]
        ###"\\\\\(([\s\S]*?)\\\\\)"###, // Inline: \\( ... \\)
        ###"\\\[ ([\s\S]*?) \\\]"###, // Block: \[ ... \]
        ###"\\\( ([^`\n]*?) \\\)"###, // Inline: \( ... \)
    ]
    let pattern = patterns.joined(separator: "|")
    return try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive, .allowCommentsAndWhitespace])
}()
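To see how the alternation behaves, here is a minimal sketch that runs a reduced copy of the pattern list over a sample string. Keeping only two alternatives and using single-# raw strings are choices made for this example, not the library's code.

```swift
import Foundation

// Reduced copy of the alternation for illustration: $$ ... $$ and \( ... \).
let patterns = [
    #"\$\$([\s\S]*?)\$\$"#,      // Block: $$ ... $$
    #"\\\( ([^`\n]*?) \\\)"#,    // Inline: \( ... \)
]
// .allowCommentsAndWhitespace makes the spaces inside the patterns insignificant.
let regex = try! NSRegularExpression(
    pattern: patterns.joined(separator: "|"),
    options: [.caseInsensitive, .allowCommentsAndWhitespace]
)

let input = #"Inline \(a+b\) and block $$x^2$$ here."#
let whole = NSRange(input.startIndex..., in: input)

// Each alternative has its own capture group; only the group belonging to
// the alternative that matched has a valid range.
var bodies: [String] = []
for match in regex.matches(in: input, options: [], range: whole) {
    for group in 1..<match.numberOfRanges {
        if let range = Range(match.range(at: group), in: input) {
            bodies.append(String(input[range]))
        }
    }
}
print(bodies)
```

Because the alternatives are joined with |, the capture group index tells you which delimiter style matched, which is useful if block and inline math need different handling later.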

Preprocessing workflow

1. Find all math expressions

Use regex to match LaTeX delimiters in the input string.

2. Replace with identifiers

Substitute each math expression with a placeholder like `__MATH__0__`.
"The equation $E = mc^2$ is famous"
// becomes
"The equation `__MATH__0__` is famous"

3. Store original content

Save the LaTeX string in mathContext dictionary with the identifier as key.
mathContext[0] = "E = mc^2"

4. Parse modified markdown

Pass the placeholder-filled string to cmark. The backticks ensure math appears as inline code.

5. Restore math nodes

After parsing, convert inline code nodes with math identifiers back to .math() nodes.
See the full implementation in Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift:33.
This preprocessing approach ensures that markdown syntax characters inside LaTeX (like *, _, [, ]) don’t interfere with parsing.
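The substitution and storage steps of the workflow can be sketched as follows. This is a simplified stand-in, not the MathContext implementation: it handles only $$ ... $$, and the function name extractMath and its tuple return shape are assumptions made for the example.

```swift
import Foundation

// Illustrative only: replaces each $$ ... $$ span with a backtick-wrapped
// placeholder and records the original LaTeX body under the same index.
func extractMath(from markdown: String) -> (text: String, context: [Int: String]) {
    let regex = try! NSRegularExpression(pattern: #"\$\$([\s\S]*?)\$\$"#)
    let source = markdown as NSString
    var output = ""
    var context: [Int: String] = [:]
    var cursor = 0

    let all = NSRange(location: 0, length: source.length)
    for (index, match) in regex.matches(in: markdown, options: [], range: all).enumerated() {
        // Copy the untouched text between the previous match and this one.
        output += source.substring(with: NSRange(location: cursor, length: match.range.location - cursor))
        // Store the LaTeX body and emit the placeholder in its place.
        context[index] = source.substring(with: match.range(at: 1))
        output += "`__MATH__\(index)__`"
        cursor = match.range.location + match.range.length
    }
    // Copy whatever follows the last match.
    output += source.substring(from: cursor)
    return (output, context)
}

let (text, context) = extractMath(from: "Energy: $$E = mc^2$$ and $$a_i * b_i$$.")
print(text)
print(context)
```

Note that the second expression contains * and _, which cmark would otherwise be free to read as emphasis markers. After substitution the parser only ever sees an inline code span.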

Inline math detection

After the main parse, a second regex pass detects simple inline math that wasn’t caught in preprocessing:
Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift
private let mathPatternWithinBlock: NSRegularExpression? = {
    let patterns = [
        ###"\\\( ([^\r\n]+?) \\\)"###, // \(...\)
        ###"\$ ([^\r\n]+?) \$"###, // $ ... $
    ]
    let pattern = patterns.joined(separator: "|")
    return try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive, .allowCommentsAndWhitespace])
}()
This catches inline expressions like $x^2$ and \( \alpha \) within text nodes.
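As a rough illustration of what a second pass over a text node might produce, the sketch below splits a string into text and math segments. The Segment type and splitInlineMath function are hypothetical, and only the $ ... $ form is handled.

```swift
import Foundation

// Hypothetical segment type; MarkdownView's real inline node enum is richer.
enum Segment: Equatable {
    case text(String)
    case math(String)
}

func splitInlineMath(_ text: String) -> [Segment] {
    // Same shape as the $ ... $ alternative, written without free-spacing mode.
    let regex = try! NSRegularExpression(pattern: #"\$([^\r\n]+?)\$"#)
    let source = text as NSString
    var segments: [Segment] = []
    var cursor = 0
    let all = NSRange(location: 0, length: source.length)
    for match in regex.matches(in: text, options: [], range: all) {
        // Emit any plain text that precedes this match.
        if match.range.location > cursor {
            let gap = NSRange(location: cursor, length: match.range.location - cursor)
            segments.append(.text(source.substring(with: gap)))
        }
        segments.append(.math(source.substring(with: match.range(at: 1))))
        cursor = match.range.location + match.range.length
    }
    // Emit the trailing plain text, if any.
    if cursor < source.length {
        segments.append(.text(source.substring(from: cursor)))
    }
    return segments
}

let segments = splitInlineMath("Area is $\\pi r^2$ exactly.")
print(segments)
```

A pattern this permissive also matches ordinary prose that happens to contain two dollar signs on one line, such as a pair of prices, so any real use of it needs to account for that.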

AST conversion

The parser transforms cmark’s C-based AST into Swift enum types. Here’s how block nodes are extracted:
Sources/MarkdownParser/MarkdownParser/MarkdownParser+Node.swift
func dumpBlocks(root: UnsafeNode?) -> [MarkdownBlockNode] {
    guard let root else { return [] }
    var blocks: [MarkdownBlockNode] = []
    for node in root.children {
        guard let block = MarkdownBlockNode(unsafeNode: node) else { continue }
        blocks.append(block)
    }
    return blocks
}
Each cmark node type maps to a Swift enum case:
| cmark type | Swift type |
|:-----------|:-----------|
| CMARK_NODE_HEADING | .heading(level:content:) |
| CMARK_NODE_PARAGRAPH | .paragraph(content:) |
| CMARK_NODE_BLOCK_QUOTE | .blockquote(children:) |
| CMARK_NODE_CODE_BLOCK | .codeBlock(fenceInfo:content:) |
| CMARK_NODE_LIST | .bulletedList / .numberedList / .taskList |
| Extension: table | .table(columnAlignments:rows:) |
The AST conversion is recursive. Container blocks like blockquotes and list items contain child blocks, and paragraphs contain inline nodes.
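The recursive shape described above can be illustrated with a trimmed-down pair of enums. Block, Inline, and plainText(of:) are hypothetical names for this sketch and carry far fewer cases than MarkdownBlockNode.

```swift
// Hypothetical, reduced node types for illustration only.
// Arrays provide the indirection, so `indirect` is not needed.
enum Block {
    case paragraph(content: [Inline])
    case blockquote(children: [Block])
    case bulletedList(items: [[Block]])
}

enum Inline {
    case text(String)
    case code(String)
}

// Walks the tree recursively: containers descend into their child blocks,
// paragraphs flatten their inline nodes.
func plainText(of block: Block) -> String {
    switch block {
    case .paragraph(let content):
        return content.map { inline -> String in
            switch inline {
            case .text(let s), .code(let s): return s
            }
        }.joined()
    case .blockquote(let children):
        return children.map(plainText(of:)).joined(separator: "\n")
    case .bulletedList(let items):
        return items
            .map { $0.map(plainText(of:)).joined(separator: "\n") }
            .joined(separator: "\n")
    }
}

let quote = Block.blockquote(children: [
    .paragraph(content: [.text("Use "), .code("parse(_:)")]),
    .bulletedList(items: [[.paragraph(content: [.text("nested item")])]]),
])
print(plainText(of: quote))
```

Any renderer or analysis pass over the parsed document follows the same structure: one switch over block cases, with a recursive call wherever a case holds child blocks.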

Incremental parsing

For live editing and streaming scenarios, MarkdownView supports incremental parsing that reuses work from previous parses:
Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift
public func parseIncremental(
    previousMarkdown: String,
    newMarkdown: String,
    previousBlocks: [MarkdownBlockNode],
    previousRanges: [RootBlockRange]? = nil
) -> IncrementalParseResult?

Incremental parsing strategy

1. Detect stable prefix

Compare previousMarkdown and newMarkdown to find the longest common prefix.

2. Reuse prefix blocks

Keep the AST nodes for the unchanged prefix without reparsing them.

3. Parse tail only

Parse the changed suffix starting from the stable boundary.

4. Merge results

Concatenate stable prefix blocks with newly parsed tail blocks.
let result = parser.parseIncremental(
    previousMarkdown: oldText,
    newMarkdown: newText,
    previousBlocks: oldBlocks
)

if let result = result {
    // Reuse first N blocks, append new tail
    let merged = oldBlocks.prefix(result.stablePrefixBlockCount) + result.tailResult.document
}
Incremental parsing provides a ~5ms speedup for 300-block documents when only the end is modified. See the performance benchmarks.
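The prefix-detection step of the strategy can be sketched on plain strings. This is not the library's algorithm: stablePrefix is a hypothetical helper, and treating a blank line as a safe block boundary is a simplifying assumption that ignores fenced code blocks, lists, and other constructs that span blank lines.

```swift
import Foundation

// Returns the part of `new` that is both unchanged and ends on a blank line.
func stablePrefix(previous: String, new: String) -> Substring {
    // Longest common prefix, compared character by character.
    var end = new.startIndex
    for (old, current) in zip(previous, new) {
        guard old == current else { break }
        end = new.index(after: end)
    }
    let common = new[..<end]
    // Snap back to the last blank line so a half-edited block is reparsed.
    guard let boundary = common.range(of: "\n\n", options: .backwards) else {
        return new[..<new.startIndex]   // nothing is safely reusable
    }
    return new[..<boundary.upperBound]
}

let old = "# Title\n\nFirst paragraph.\n\nSecond para"
let new = "# Title\n\nFirst paragraph.\n\nSecond paragraph, now longer."
let prefix = stablePrefix(previous: old, new: new)
let tail = new[prefix.endIndex...]
print(tail)
```

Only the tail would be handed to the parser. The last block is always treated as unstable, which suits streaming input where the final paragraph is still growing.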

Block range tracking

The parser can report the source location of each root-level block:
Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift
public struct RootBlockRange {
    public let type: MarkdownNodeType
    public let startIndex: String.Index
    public let endIndex: String.Index
    public let outputBlockCount: Int
}

public func parseBlockRange(_ markdown: String) -> [RootBlockRange]
This enables:
  • Mapping rendered content back to source positions
  • Syntax highlighting the active block in an editor
  • Efficient partial re-rendering
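As an example of the editor use case, the sketch below finds the block that contains a cursor position. BlockRange is a stand-in for RootBlockRange with only the two index fields, the ranges are built by hand rather than by parseBlockRange(_:), and treating endIndex as exclusive is an assumption.

```swift
// Stand-in for RootBlockRange with only the fields this sketch needs.
struct BlockRange {
    let startIndex: String.Index
    let endIndex: String.Index   // assumed exclusive
}

// Returns the position of the block whose half-open range contains `cursor`.
func activeBlock(at cursor: String.Index, in ranges: [BlockRange]) -> Int? {
    ranges.firstIndex { $0.startIndex <= cursor && cursor < $0.endIndex }
}

let source = "# Title\n\nBody text\n"
// Hand-built ranges: the heading (plus its blank line) and the paragraph.
let bodyStart = source.firstIndex(of: "B")!
let ranges = [
    BlockRange(startIndex: source.startIndex, endIndex: bodyStart),
    BlockRange(startIndex: bodyStart, endIndex: source.endIndex),
]

let cursor = source.index(bodyStart, offsetBy: 3)
print(activeBlock(at: cursor, in: ranges) as Any)
```

Since root block ranges are ordered and non-overlapping, a binary search would work equally well for long documents.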

Usage example

Here’s a complete parsing example:
import MarkdownParser

let markdown = """
# Math Example

The quadratic formula is $x = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}$.

```swift
let result = parser.parse(text)
```

- [x] Parse block math
- [ ] Parse inline math
"""

let parser = MarkdownParser()
let result = parser.parse(markdown)
print("Blocks: \(result.document.count)")
print("Math expressions: \(result.mathContext.count)")
// Output:
// Blocks: 4 (heading, paragraph, code block, task list)
// Math expressions: 1 (the quadratic formula)

The parsed AST is now ready for rendering by the MarkdownView module.
