Overview

MarkdownView uses tree-sitter for native, high-performance syntax highlighting. Unlike JavaScript-based highlighters such as highlight.js, tree-sitter runs natively in C with no JavaScriptCore overhead and produces semantic color ranges directly from parse trees.
Language parsers are initialized lazily: only the languages that actually appear in your markdown are loaded, which keeps the memory footprint small.

Supported Languages

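MarkdownView supports 19 programming languages with full syntax highlighting. Each language is selected by any of the identifiers listed next to it:

| Language | Identifiers |
| --- | --- |
| Swift | swift |
| Python | python, py, python3 |
| JavaScript | javascript, js, jsx |
| TypeScript | typescript, ts |
| TSX | tsx |
| Go | go, golang |
| Rust | rust, rs |
| C | c, h |
| C++ | cpp, c++, cc, cxx, hpp |
| C# | csharp, c#, cs |
| Java | java |
| Kotlin | kotlin, kt, kts |
| Ruby | ruby, rb |
| Bash | bash, sh, shell, zsh |
| SQL | sql |
| YAML | yaml, yml |
| JSON | json, jsonc |
| HTML | html, htm |
| CSS | css |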

How It Works

Highlighting a code block goes through seven stages:

1. Parse Request

When markdown contains a code block with a language identifier, MarkdownView generates a cache key:
let key = CodeHighlighter.current.key(for: code, language: "python")
2. Check Cache

The highlighter checks if this exact code + language combination has been highlighted before:
if let cached = renderCache.object(forKey: NSNumber(value: key)) {
    return cached.value
}
3. Lazy Language Loading

If not cached, the language configuration is loaded on-demand:
guard let config = Self.languageConfiguration(for: "python") else {
    return [:]
}
Only the Python parser is initialized — Swift, JavaScript, etc. remain unloaded.
4. Parse with Tree-Sitter

The code is parsed into an AST using tree-sitter’s C parser:
let parser = Parser()
try parser.setLanguage(config.language)
guard let tree = parser.parse(content) else { return [:] }
5. Extract Semantic Tokens

The highlights query extracts semantic tokens (keywords, strings, functions, etc.):
let cursor = query.execute(in: tree)
let highlights = cursor.resolve(with: context).highlights()
6. Map to Colors

Each semantic token is mapped to a color based on its type:
for highlight in highlights {
    if let color = Self.color(forCapture: highlight.name) {
        map[highlight.range] = color
    }
}
7. Cache and Return

The result is cached and returned as a map of ranges to colors:
renderCache.setObject(HighlightMapBox(map), forKey: NSNumber(value: key))
return map
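
The seven stages above can be sketched end to end without tree-sitter. The following is a minimal, self-contained illustration, not MarkdownView's implementation: `SketchHighlighter`, `MapBox`, the string color names, and the toy "parser" that only recognizes the word `return` are all hypothetical stand-ins. Only the cache-key function and the overall flow (cache check, lazy language loading, parse, map to colors, cache) follow the snippets shown in the steps.

```swift
import Foundation

// Hypothetical stand-ins for illustration; these are not MarkdownView's real types.
typealias ColorName = String
typealias HighlightMap = [Range<Int>: ColorName]

/// NSCache stores objects, so the value-type map is wrapped in a class.
final class MapBox {
    let value: HighlightMap
    init(_ value: HighlightMap) { self.value = value }
}

final class SketchHighlighter {
    private let cache = NSCache<NSNumber, MapBox>()
    private let supported: Set<String> = ["swift", "python"]
    /// Languages that have been "loaded" so far (stage 3).
    private(set) var loadedLanguages: Set<String> = []
    /// Number of real parses, so cache hits are observable.
    private(set) var parseCount = 0

    // Stage 1: derive a cache key from content and lowercased language.
    func key(for content: String, language: String?) -> Int {
        var hasher = Hasher()
        hasher.combine(content)
        hasher.combine(language?.lowercased() ?? "")
        return hasher.finalize()
    }

    func highlight(content: String, language: String) -> HighlightMap {
        let nsKey = NSNumber(value: key(for: content, language: language))
        // Stage 2: reuse the cached map for a content/language pair seen before.
        if let cached = cache.object(forKey: nsKey) { return cached.value }

        // Stage 3: load the language on first use; unknown languages get no colors.
        let name = language.lowercased()
        guard supported.contains(name) else { return [:] }
        loadedLanguages.insert(name)

        // Stages 4-6: a toy "parser" that marks each `return` as a keyword.
        parseCount += 1
        let chars = Array(content)
        let word = Array("return")
        var map: HighlightMap = [:]
        var i = 0
        while i + word.count <= chars.count {
            let candidate = i..<(i + word.count)
            if chars[candidate].elementsEqual(word) {
                map[candidate] = "keyword"
                i += word.count
            } else {
                i += 1
            }
        }

        // Stage 7: cache the result and return it.
        cache.setObject(MapBox(map), forKey: nsKey)
        return map
    }
}

let sketch = SketchHighlighter()
let first = sketch.highlight(content: "return 1", language: "Swift")
let second = sketch.highlight(content: "return 1", language: "swift")
```

Because the key lowercases the language, both calls share one cache entry: the second call should be served from the cache, and only the `swift` entry is ever marked as loaded.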

Performance

Tree-sitter’s native implementation provides exceptional performance:
| Benchmark | Time |
| --- | --- |
| Highlight 50 lines | ~2 ms |
| Highlight 500 lines | ~21 ms |
| Parse 500 blocks | ~5 ms |
| Parse + preprocess 300 blocks | ~3 ms |
The plain-text streaming fast path lets the view skip reparsing when an appended token is safe, meaning it introduces no new markdown syntax. Such updates take under 0.1 ms.
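
The source does not show how a "safe" append is detected. One plausible approach, offered purely as an assumption, is a conservative character check: treat a chunk as safe only if it contains no newline and no character that could open or close markdown syntax. The function name `isSafePlainTextAppend` and the character set below are hypothetical and deliberately over-cautious.

```swift
/// Characters that can begin or end markdown constructs (hypothetical, not exhaustive).
let markdownSignificant: Set<Character> = [
    "*", "_", "`", "#", "[", "]", "(", ")", "!", "<", ">",
    "|", "~", "\\", "-", "+", "=", "&",
]

/// Returns true when appending `chunk` cannot, by itself, introduce markdown syntax.
/// `isNewline` is checked separately because "\r\n" is a single Character in Swift.
func isSafePlainTextAppend(_ chunk: String) -> Bool {
    guard !chunk.isEmpty else { return false }
    return !chunk.contains { $0.isNewline || markdownSignificant.contains($0) }
}

let canSkipReparse = isSafePlainTextAppend(" world")
let mustReparse = !isSafePlainTextAppend("**bold")
```

A check like this trades precision for speed: it rejects some harmless chunks (a hyphen inside a word, for example) so that the common case of plain words can bypass the parser. A real implementation would also have to consider the state of the text already rendered.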

Language Aliases

Many languages have multiple recognized aliases:
All of these work for Python:
```python
```py
```python3

All of these work for JavaScript:
```javascript
```js
```jsx

All of these work for Bash:
```bash
```sh
```shell
```zsh
The language identifier is case-insensitive:
```Swift  ← works
```SWIFT  ← works
```swift  ← works

Color Scheme

MarkdownView uses a carefully designed color scheme with automatic dark mode support:

Semantic Token Types

Keywords and control flow:
if, else, return, func, class, struct, enum
  • Light mode: Purple rgb(170, 13, 145)
  • Dark mode: Pink rgb(252, 95, 165)

Dark Mode Support

Colors adapt automatically through dynamic UIColor/NSColor values:
let keyword = dynamicColor(
    light: PlatformColor(red: 0.667, green: 0.051, blue: 0.569, alpha: 1),
    dark: PlatformColor(red: 0.988, green: 0.373, blue: 0.647, alpha: 1)
)
The colors switch instantly when the system appearance changes, with no manual intervention required.
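
The fractional components in the snippet above are the 8-bit values from the Semantic Token Types list divided by 255 (for example, 170 / 255 ≈ 0.667). The platform-neutral sketch below checks that arithmetic and shows the light/dark selection idea. `RGB`, `Appearance`, and `DynamicColor` are hypothetical types for illustration. The body of `dynamicColor` is not shown in the source; on Apple platforms it would presumably build on the system's dynamic-provider initializers for `UIColor` and `NSColor`.

```swift
/// Hypothetical platform-neutral color with components in 0...1.
struct RGB: Equatable {
    let red: Double
    let green: Double
    let blue: Double

    /// Converts 8-bit channel values (0...255) to unit components.
    init(r: Int, g: Int, b: Int) {
        red = Double(r) / 255
        green = Double(g) / 255
        blue = Double(b) / 255
    }

    /// Components rounded to three decimals, as written in the snippet above.
    var rounded: [Double] {
        [red, green, blue].map { ($0 * 1000).rounded() / 1000 }
    }
}

enum Appearance { case light, dark }

/// A pair of colors resolved against the current appearance.
struct DynamicColor {
    let light: RGB
    let dark: RGB

    func resolved(for appearance: Appearance) -> RGB {
        appearance == .dark ? dark : light
    }
}

let keywordColor = DynamicColor(
    light: RGB(r: 170, g: 13, b: 145),  // purple
    dark: RGB(r: 252, g: 95, b: 165)    // pink
)

let lightComponents = keywordColor.resolved(for: .light).rounded
let darkComponents = keywordColor.resolved(for: .dark).rounded
```

Resolving at draw time rather than storing a fixed color is what lets the same token switch appearance without re-highlighting the code.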

Advanced Usage

Manual Highlighting

You can use the highlighter directly without markdown parsing:
let highlighter = CodeHighlighter.current

let code = """
func greet(name: String) {
    print("Hello, \(name)!")
}
"""

// Generate cache key
let key = highlighter.key(for: code, language: "swift")

// Highlight the code
let colorMap = highlighter.highlight(
    key: key,
    content: code,
    language: "swift"
)

// Apply colors to attributed string
let theme = MarkdownTheme.default
let attributedString = colorMap.apply(to: code, with: theme)

Cache Behavior

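The highlighter keeps results in an NSCache capped at 256 entries. NSCache evicts entries on its own (for example, under memory pressure), and its eviction order is not guaranteed to be strictly LRU: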
private var renderCache: NSCache<NSNumber, HighlightMapBox> = {
    let cache = NSCache<NSNumber, HighlightMapBox>()
    cache.countLimit = 256
    return cache
}()
Cache keys are generated from content and language:
func key(for content: String, language: String?) -> Int {
    var hasher = Hasher()
    hasher.combine(content)
    hasher.combine(language?.lowercased() ?? "")
    return hasher.finalize()
}
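
Two properties of this key function are worth seeing in action. The sketch below copies the function as shown and exercises it with made-up inputs. It also relies on a documented property of Swift's `Hasher`: it is seeded randomly per process, so these keys are only meaningful within a single run and should not be persisted.

```swift
/// Same logic as the key function shown above.
func key(for content: String, language: String?) -> Int {
    var hasher = Hasher()
    hasher.combine(content)
    hasher.combine(language?.lowercased() ?? "")
    return hasher.finalize()
}

let code = "print(\"hi\")"

// The language is lowercased before hashing, so casing does not split the cache.
let upperKey = key(for: code, language: "Python")
let lowerKey = key(for: code, language: "python")

// A missing language hashes exactly like an empty language string.
let nilKey = key(for: code, language: nil)
let emptyKey = key(for: code, language: "")
```

Note that aliases are not folded by this function: `py` and `python` would produce different keys unless they are normalized before the call. Distinct inputs can also collide in principle, since the key is a single `Int` hash.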

Unsupported Languages

When a language is not supported, highlighting gracefully falls back to plain text:
```brainfuck
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
```

The code block renders with default code styling but no syntax highlighting.

Tree-Sitter Integration Details

Package Dependencies

MarkdownView includes tree-sitter parsers as Swift Package Manager dependencies:

```swift
dependencies: [
    .package(url: "https://github.com/tree-sitter/swift-tree-sitter", from: "0.9.0"),
    .package(url: "https://github.com/tree-sitter/tree-sitter-python", "0.23.0"..<"0.25.0"),
    // ... 17 more language packages
]
```

Language Configuration

Each language has a configuration that includes:
  1. Parser: The C function that creates the tree-sitter language (e.g., tree_sitter_swift())
  2. Queries: Semantic query files that define which AST nodes map to which token types
  3. Bundle: Resource bundle containing the query files
private static func makeConfig(
    _ tsLanguage: OpaquePointer,
    name: String,
    bundleName: String? = nil,
    queriesSubpath: String? = nil
) throws -> LanguageConfiguration

Query Files

Tree-sitter uses query files (usually highlights.scm) to map AST nodes to semantic tokens:
; Example from Swift highlights query
(keyword) @keyword
(string_literal) @string
(comment) @comment
(call_expression function: (identifier) @function)
These queries are compiled and executed against the parse tree to extract colored ranges.

Customizing Syntax Colors

Currently, syntax colors are hardcoded in CodeHighlighter. Custom color themes for syntax highlighting are not yet exposed in the public API.
The color mapping is defined in the captureColorMap dictionary:
private static let captureColorMap: [String: PlatformColor] = [
    "keyword": keyword,
    "string": string,
    "comment": comment,
    "type": type,
    "function": function,
    "number": number,
    "property": property,
    // ... more mappings
]
Future versions may expose this as part of MarkdownTheme for full customization.
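
Tree-sitter highlight queries often use dotted capture names such as `keyword.control` or `function.call`, while a color table like the one above has only base names. A common way to bridge the two is to drop trailing components until a match is found. Whether MarkdownView's `color(forCapture:)` does this is not stated in the source. The sketch below is an assumption, with hypothetical string color names standing in for `PlatformColor` values.

```swift
/// Hypothetical color names in place of PlatformColor values.
let captureColors: [String: String] = [
    "keyword": "purple",
    "string": "red",
    "comment": "gray",
    "function": "blue",
]

/// Tries "a.b.c", then "a.b", then "a"; returns nil when no prefix matches.
func colorName(forCapture name: String) -> String? {
    var parts = name.split(separator: ".")
    while !parts.isEmpty {
        if let color = captureColors[parts.joined(separator: ".")] {
            return color
        }
        parts.removeLast()
    }
    return nil
}

let controlFlow = colorName(forCapture: "keyword.control.flow")
let call = colorName(forCapture: "function.call")
let bracket = colorName(forCapture: "punctuation.bracket")
```

With this kind of fallback, a query can use fine-grained capture names without the color table having to enumerate every variant. Captures with no matching prefix simply stay uncolored.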

Best Practices

Use Language Aliases

Provide language identifiers in your markdown code blocks for proper highlighting:
```swift  ← Highlighted
```       ← Plain text (no highlighting)

Let It Cache

The highlighter caches results automatically. Don’t worry about performance for repeated code blocks.

Lazy Loading

Only use languages you need. Unused parsers never load, keeping memory usage low.

Trust the Parser

Tree-sitter parsers are error-tolerant and handle malformed code gracefully without crashing.

Limitations

  • No custom syntax colors yet: The color scheme is built-in and cannot be customized through MarkdownTheme
  • 19 languages only: Other languages fall back to plain text (no highlighting)
  • No inline language mixing: Each code block can only use one language

Future Enhancements

Potential future improvements:
  • Expose syntax colors in MarkdownTheme for full customization
  • Add more languages (PHP, Perl, Lua, etc.)
  • Support for custom tree-sitter parsers
  • Line-level language switching for polyglot examples
