Parsing

MarkdownView uses swift-cmark — Apple’s Swift wrapper around the reference CommonMark implementation — to parse markdown into an Abstract Syntax Tree (AST). This ensures spec-compliant parsing with full GitHub Flavored Markdown (GFM) support.

Parser initialization

The MarkdownParser class creates a cmark parser with GFM extensions enabled:

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift

func withParser<T>(_ block: (UnsafeMutablePointer<cmark_parser>) -> T) -> T {
    let parser = cmark_parser_new(CMARK_OPT_DEFAULT)!
    cmark_gfm_core_extensions_ensure_registered()
    let extensionNames = [
        "autolink",
        "strikethrough",
        "tagfilter",
        "tasklist",
        "table",
    ]
    for extensionName in extensionNames {
        guard let syntaxExtension = cmark_find_syntax_extension(extensionName) else {
            assertionFailure()
            continue
        }
        cmark_parser_attach_syntax_extension(parser, syntaxExtension)
    }
    defer { cmark_parser_free(parser) }
    return block(parser)
}

The parser is created on-demand and freed after use to minimize memory footprint. This design allows parsing to be fully synchronous while keeping overhead low.

GFM extensions

MarkdownView enables five GitHub Flavored Markdown extensions:

autolink

Automatically converts URLs into clickable links without requiring angle brackets:

https://example.com becomes a link

strikethrough

Adds support for ~~deleted text~~ syntax:

~~This text is crossed out~~

tagfilter

Filters potentially dangerous HTML tags for security.
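
For context, the GFM specification describes this extension under "Disallowed Raw HTML": the leading `<` of nine specific tag names is replaced with `&lt;`. The sketch below imitates that rule with a regular expression to show the effect. It is an illustration only, not how cmark implements the filter, and `filterDisallowedTags` is a hypothetical name.

```swift
import Foundation

// Tag names listed under "Disallowed Raw HTML" in the GFM spec.
let disallowedTags = ["title", "textarea", "style", "xmp", "iframe",
                      "noembed", "noframes", "script", "plaintext"]

/// Hypothetical helper: escapes the opening `<` of any disallowed tag.
func filterDisallowedTags(_ html: String) -> String {
    let names = disallowedTags.joined(separator: "|")
    // Match "<" or "</" followed by a disallowed name and a tag boundary,
    // so that a tag such as <styled> is left alone.
    let pattern = "<(/?(?:\(names))(?=[\\s/>]))"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return html
    }
    let range = NSRange(html.startIndex..., in: html)
    return regex.stringByReplacingMatches(in: html, range: range, withTemplate: "&lt;$1")
}
```

Other tags such as `<strong>` pass through unchanged; only the listed names are neutralized.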

tasklist

Enables checkbox-style task lists:

- [x] Completed task
- [ ] Incomplete task

table

Parses pipe tables with column alignment:

| Left | Center | Right |
|:-----|:------:|------:|
| A    | B      | C     |
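
To illustrate how the delimiter row encodes alignment, here is a minimal sketch that reads the colons in each cell. The real work happens inside cmark's table extension; `ColumnAlignment` and `alignments(fromDelimiterRow:)` are hypothetical names, and the sketch ignores edge cases such as escaped pipes.

```swift
import Foundation

// Hypothetical alignment type for this sketch.
enum ColumnAlignment {
    case unspecified, left, center, right
}

/// Reads a delimiter row such as "|:---|:---:|---:|" and returns one alignment per column.
func alignments(fromDelimiterRow row: String) -> [ColumnAlignment] {
    return row.split(separator: "|")
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
        .map { (cell: String) -> ColumnAlignment in
            // A leading colon means left, a trailing colon means right, both mean center.
            switch (cell.hasPrefix(":"), cell.hasSuffix(":")) {
            case (true, true): return .center
            case (true, false): return .left
            case (false, true): return .right
            case (false, false): return .unspecified
            }
        }
}
```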

Basic parsing

The parse(_:) method is the primary entry point:

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift

public func parse(_ markdown: String) -> ParseResult {
    let math = MathContext(preprocessText: markdown)
    math.process()
    let markdown = math.indexedContent ?? markdown
    let nodes = withParser { parser in
        markdown.withCString { str in
            cmark_parser_feed(parser, str, strlen(str))
            return cmark_parser_finish(parser)
        }
    }
    var blocks = dumpBlocks(root: nodes)
    blocks = finalizeMathBlocks(blocks, mathContext: math)
    return .init(document: blocks, mathContext: math.contents)
}

The parsing pipeline has three stages:

1. Math preprocessing: extract LaTeX expressions before cmark sees them.
2. swift-cmark parsing: convert markdown to a cmark AST.
3. AST conversion: transform cmark nodes into Swift types.

Parse result

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift

public struct ParseResult {
    public let document: [MarkdownBlockNode]
    public let mathContext: [Int: String]
}

document: Array of top-level block nodes (headings, paragraphs, lists, etc.)
mathContext: Dictionary mapping replacement identifiers to original LaTeX strings

Math preprocessing

LaTeX expressions are extracted before cmark parsing to prevent markdown syntax inside math from being interpreted:

Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift

private let mathPattern: NSRegularExpression? = {
    let patterns = [
        ###"\$\$([\s\S]*?)\$\$"###, // Block: $$ ... $$
        ###"\\\\\[([\s\S]*?)\\\\\]"###, // Block: \\[ ... \\]
        ###"\\\\\(([\s\S]*?)\\\\\)"###, // Inline: \\( ... \\)
        ###"\\\[ ([\s\S]*?) \\\]"###, // Block: \[ ... \]
        ###"\\\( ([^`\n]*?) \\\)"###, // Inline: \( ... \)
    ]
    let pattern = patterns.joined(separator: "|")
    return try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive, .allowCommentsAndWhitespace])
}()

Preprocessing workflow

Find all math expressions

Use regex to match LaTeX delimiters in the input string.

Replace with identifiers

Substitute each math expression with a placeholder like `__MATH__0__`.

"The equation $E = mc^2$ is famous"
// becomes
"The equation `__MATH__0__` is famous"

Store original content

Save the LaTeX string in mathContext dictionary with the identifier as key.

mathContext[0] = "E = mc^2"

Parse modified markdown

Pass the placeholder-filled string to cmark. The backticks ensure math appears as inline code.

Restore math nodes

After parsing, convert inline code nodes with math identifiers back to .math() nodes.

See the full implementation in Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift:33.

This preprocessing approach ensures that markdown syntax characters inside LaTeX (like *, _, [, ]) don’t interfere with parsing.
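
The extract, replace, and store steps can be sketched in a few lines. This is a simplified illustration that handles only the `$$ ... $$` delimiter from the pattern list, not the library's actual `MathContext`; the function name `extractBlockMath` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: swaps each `$$ ... $$` span for a backticked placeholder
/// and returns the rewritten text plus the extracted LaTeX keyed by index.
func extractBlockMath(from text: String) -> (text: String, context: [Int: String]) {
    guard let regex = try? NSRegularExpression(pattern: #"\$\$([\s\S]*?)\$\$"#) else {
        return (text, [:])
    }
    let matches = regex.matches(in: text, range: NSRange(text.startIndex..., in: text))
    var context: [Int: String] = [:]
    var output = ""
    var cursor = text.startIndex
    for (index, match) in matches.enumerated() {
        guard let whole = Range(match.range, in: text),
              let inner = Range(match.range(at: 1), in: text) else { continue }
        // Copy the untouched text before the match, then emit the placeholder.
        output.append(contentsOf: text[cursor..<whole.lowerBound])
        output.append("`__MATH__\(index)__`")
        // Remember the original LaTeX so it can be restored after parsing.
        context[index] = String(text[inner])
        cursor = whole.upperBound
    }
    output.append(contentsOf: text[cursor...])
    return (output, context)
}
```

Because the placeholder is wrapped in backticks, cmark treats it as an inline code span, which makes it easy to find again during the restore step.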

Inline math detection

After the main parse, a second regex pass detects simple inline math that wasn’t caught in preprocessing:

Sources/MarkdownParser/MarkdownParser/MarkdownParser+MathContext.swift

private let mathPatternWithinBlock: NSRegularExpression? = {
    let patterns = [
        ###"\\\( ([^\r\n]+?) \\\)"###, // \(...\)
        ###"\$ ([^\r\n]+?) \$"###, // $ ... $
    ]
    let pattern = patterns.joined(separator: "|")
    return try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive, .allowCommentsAndWhitespace])
}()

This catches inline expressions like $x^2$ and $ \alpha $ within text nodes.
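
As a rough illustration of this second pass, the sketch below splits a text node into plain-text and math segments using the `$ ... $` pattern. `InlineSegment` and `splitInlineMath` are hypothetical names, and trimming the captured expression is a choice made for this example, not confirmed library behavior.

```swift
import Foundation

// Hypothetical segment type for this sketch.
enum InlineSegment: Equatable {
    case text(String)
    case math(String)
}

/// Splits a text node into alternating text and math segments.
func splitInlineMath(_ text: String) -> [InlineSegment] {
    guard let regex = try? NSRegularExpression(pattern: #"\$([^\r\n]+?)\$"#) else {
        return [.text(text)]
    }
    var segments: [InlineSegment] = []
    var cursor = text.startIndex
    for match in regex.matches(in: text, range: NSRange(text.startIndex..., in: text)) {
        guard let whole = Range(match.range, in: text),
              let inner = Range(match.range(at: 1), in: text) else { continue }
        // Emit any plain text that precedes this math span.
        if cursor < whole.lowerBound {
            segments.append(.text(String(text[cursor..<whole.lowerBound])))
        }
        segments.append(.math(text[inner].trimmingCharacters(in: .whitespaces)))
        cursor = whole.upperBound
    }
    // Emit whatever plain text remains after the last match.
    if cursor < text.endIndex {
        segments.append(.text(String(text[cursor...])))
    }
    return segments
}
```

A pattern this simple also matches ordinary text containing two dollar signs on one line, such as prices, so a production implementation would likely need extra guards.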

AST conversion

The parser transforms cmark’s C-based AST into Swift enum types. Here’s how block nodes are extracted:

Sources/MarkdownParser/MarkdownParser/MarkdownParser+Node.swift

func dumpBlocks(root: UnsafeNode?) -> [MarkdownBlockNode] {
    guard let root else { return [] }
    var blocks: [MarkdownBlockNode] = []
    for node in root.children {
        guard let block = MarkdownBlockNode(unsafeNode: node) else { continue }
        blocks.append(block)
    }
    return blocks
}

Each cmark node type maps to a Swift enum case:

| cmark type | Swift type |
|:-----------|:-----------|
| `CMARK_NODE_HEADING` | `.heading(level:content:)` |
| `CMARK_NODE_PARAGRAPH` | `.paragraph(content:)` |
| `CMARK_NODE_BLOCK_QUOTE` | `.blockquote(children:)` |
| `CMARK_NODE_CODE_BLOCK` | `.codeBlock(fenceInfo:content:)` |
| `CMARK_NODE_LIST` | `.bulletedList` / `.numberedList` / `.taskList` |
| Extension: table | `.table(columnAlignments:rows:)` |

The AST conversion is recursive. Container blocks like blockquotes and list items contain child blocks, and paragraphs contain inline nodes.
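
To show what that recursion looks like from the consumer's side, here is a sketch that walks a deliberately simplified tree and collects its text. The `Block` enum is a hypothetical stand-in, much smaller than the real `MarkdownBlockNode`, and it stores plain strings where the real type holds inline nodes.

```swift
// Hypothetical, simplified block type for this sketch.
enum Block {
    case heading(level: Int, text: String)
    case paragraph(text: String)
    case blockquote(children: [Block])
    case list(items: [[Block]])
}

/// Collects the text of every leaf block, descending into containers.
func plainText(of blocks: [Block]) -> [String] {
    return blocks.flatMap { (block: Block) -> [String] in
        switch block {
        case let .heading(_, text), let .paragraph(text):
            // Leaf blocks contribute their own text.
            return [text]
        case let .blockquote(children):
            // Container blocks recurse into their children.
            return plainText(of: children)
        case let .list(items):
            // Each list item is itself a sequence of blocks.
            return items.flatMap { plainText(of: $0) }
        }
    }
}
```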

Incremental parsing

For live editing and streaming scenarios, MarkdownView supports incremental parsing that reuses work from previous parses:

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift

public func parseIncremental(
    previousMarkdown: String,
    newMarkdown: String,
    previousBlocks: [MarkdownBlockNode],
    previousRanges: [RootBlockRange]? = nil
) -> IncrementalParseResult?

Incremental parsing strategy

Detect stable prefix

Compare previousMarkdown and newMarkdown to find the longest common prefix.

Reuse prefix blocks

Keep the AST nodes for the unchanged prefix without reparsing them.

Parse tail only

Parse the changed suffix starting from the stable boundary.

Merge results

Concatenate stable prefix blocks with newly parsed tail blocks.

let result = parser.parseIncremental(
    previousMarkdown: oldText,
    newMarkdown: newText,
    previousBlocks: oldBlocks
)

if let result = result {
    // Reuse first N blocks, append new tail
    let merged = oldBlocks.prefix(result.stablePrefixBlockCount) + result.tailResult.document
}

Incremental parsing provides a ~5ms speedup for 300-block documents when only the end is modified. See the performance benchmarks.
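
The prefix detection step can be sketched as follows. This is an assumption about the general approach rather than the library's code: `BlockSpan`, `commonPrefixLength`, and `stableBlockCount` are hypothetical names, and spans use character offsets for simplicity where the real `RootBlockRange` uses `String.Index`.

```swift
/// Hypothetical stand-in for a root block's extent, as character offsets.
struct BlockSpan {
    let start: Int
    let end: Int  // exclusive
}

/// Counts the leading characters that both strings share.
func commonPrefixLength(_ old: String, _ new: String) -> Int {
    var count = 0
    for (a, b) in zip(old, new) {
        if a != b { break }
        count += 1
    }
    return count
}

/// Returns how many leading blocks can be reused without reparsing.
func stableBlockCount(old: String, new: String, spans: [BlockSpan]) -> Int {
    let prefix = commonPrefixLength(old, new)
    // A block is reused only if it ends strictly before the first changed
    // character; the block touching the edit is always reparsed, because
    // appended text may extend it.
    return spans.prefix(while: { $0.end < prefix }).count
}
```

The strict comparison is deliberately conservative: when text is appended directly after the last block, that block is reparsed along with the new tail.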

Block range tracking

The parser can report the source location of each root-level block:

Sources/MarkdownParser/MarkdownParser/MarkdownParser.swift

public struct RootBlockRange {
    public let type: MarkdownNodeType
    public let startIndex: String.Index
    public let endIndex: String.Index
    public let outputBlockCount: Int
}

public func parseBlockRange(_ markdown: String) -> [RootBlockRange]

This enables:

Mapping rendered content back to source positions
Syntax highlighting the active block in an editor
Efficient partial re-rendering
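
As an example of the first two uses, the sketch below finds the block that contains a cursor position. `SourceRange` is a hypothetical, trimmed-down stand-in for `RootBlockRange`, and `block(containing:in:)` is not part of the library.

```swift
/// Hypothetical stand-in for RootBlockRange, keeping only a label and a range.
struct SourceRange {
    let name: String
    let range: Range<String.Index>
}

/// Returns the first block whose source range contains the given position.
func block(containing position: String.Index, in ranges: [SourceRange]) -> SourceRange? {
    return ranges.first { $0.range.contains(position) }
}

// Usage: build two ranges by hand and look up a cursor inside the paragraph.
let source = "# Title\n\nBody text"
let headingEnd = source.firstIndex(of: "\n")!
let bodyStart = source.index(headingEnd, offsetBy: 2)
let ranges = [
    SourceRange(name: "heading", range: source.startIndex..<headingEnd),
    SourceRange(name: "paragraph", range: bodyStart..<source.endIndex),
]
let cursor = source.index(source.startIndex, offsetBy: 12)
let active = block(containing: cursor, in: ranges)
```

A position that falls in the blank lines between blocks matches no range, so callers need to handle the `nil` case.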

Usage example

Here’s a complete parsing example:

import MarkdownParser

let markdown = """
# Math Example

The quadratic formula is $x = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}$.

```swift
let result = parser.parse(text)
```

- [x] Parse block math
- [ ] Parse inline math
"""

let parser = MarkdownParser()
let result = parser.parse(markdown)
print("Blocks: \(result.document.count)")
print("Math expressions: \(result.mathContext.count)")
// Output:
// Blocks: 4 (heading, paragraph, code block, task list)
// Math expressions: 1 (the quadratic formula)


The parsed AST is now ready for rendering by the MarkdownView module.
