What is Lossless Parsing?
A lossless parser ensures:
- Perfect Round-Trip: You can parse code to an AST and convert it back to get the exact original code
- Style Preservation: Your formatting choices, indentation, and spacing are maintained
- Comment Retention: All comments (single-line and multi-line) are kept with proper positioning
- Token Fidelity: Every character in your source code is accounted for in the parsed representation
With Full Moon, you can modify the AST directly and export it back to Lua while preserving the style in which the code was written.
Why Lossless Parsing Matters
Lossless parsing is essential for tools that need to:
- Format code automatically (like rustfmt or Prettier)
- Refactor code en masse (like jscodeshift)
- Provide accurate LSP features (go-to-definition, hover information)
- Perform static analysis without losing context (like Luacheck)
- Add or modify code while respecting existing style
How It Works
Full Moon achieves lossless parsing through a two-layer approach:
1. Tokenization Layer
The tokenizer converts source code into tokens while preserving trivia:
- Whitespace (spaces, tabs, newlines)
- Single-line comments (-- comment)
- Multi-line comments (--[[ comment ]])
2. AST Layer
The AST uses TokenReference objects instead of plain strings, ensuring every node carries its formatting information.
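The tokenization idea above can be sketched with a toy lexer. This is a minimal illustration, not Full Moon's tokenizer: `Kind`, `Token`, `tokenize`, and `to_source` are hypothetical names, and the lexer deliberately ignores details such as string literals and long-bracket levels. The point it shows is that when whitespace and comments are emitted as tokens rather than skipped, concatenating the token texts reproduces the input exactly.

```rust
/// Token kinds for a toy lexer. The first three are trivia.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Whitespace,
    SingleLineComment,
    MultiLineComment,
    Word,   // identifiers, keywords, numbers
    Symbol, // any other single character
}

#[derive(Debug, Clone, PartialEq)]
struct Token {
    kind: Kind,
    text: String, // the exact source text this token covers
}

/// True if `pat` occurs in `chars` starting at index `at`.
fn starts_with(chars: &[char], at: usize, pat: &str) -> bool {
    let pat: Vec<char> = pat.chars().collect();
    chars.len() >= at + pat.len() && chars[at..at + pat.len()] == pat[..]
}

/// Split `src` into tokens without discarding any character.
fn tokenize(src: &str) -> Vec<Token> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let start = i;
        let c = chars[i];
        let kind = if c.is_whitespace() {
            while i < chars.len() && chars[i].is_whitespace() {
                i += 1;
            }
            Kind::Whitespace
        } else if starts_with(&chars, i, "--[[") {
            // Runs to the closing "]]", or to end of input if unterminated.
            i += 4;
            while i < chars.len() && !starts_with(&chars, i, "]]") {
                i += 1;
            }
            i = (i + 2).min(chars.len());
            Kind::MultiLineComment
        } else if starts_with(&chars, i, "--") {
            // Runs to the end of the line; the newline itself is whitespace.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
            Kind::SingleLineComment
        } else if c.is_alphanumeric() || c == '_' {
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            Kind::Word
        } else {
            i += 1;
            Kind::Symbol
        };
        tokens.push(Token { kind, text: chars[start..i].iter().collect() });
    }
    tokens
}

/// Rebuild the source by concatenating every token's text.
fn to_source(tokens: &[Token]) -> String {
    tokens.iter().map(|t| t.text.as_str()).collect()
}
```

Calling `to_source(&tokenize(src))` returns a string equal to `src` for any input, because every branch of the lexer advances over at least one character and stores exactly the characters it consumed.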
Example: Preserving Style
Consider Lua code written with deliberate formatting. Full Moon preserves:
- The exact spacing around =
- Each comment and its position
- Any run of multiple spaces within a line
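As a rough illustration of how a node built from token references keeps this kind of formatting through an edit, here is a small sketch. The types are hypothetical stand-ins (`TokenRef`, `LocalAssignment`, and `sample` are not Full Moon's API), and the node is built by hand for the Lua line `local  x   =  1 -- answer` rather than produced by a parser.

```rust
/// Hypothetical stand-in for a token reference: one token plus its trivia.
#[derive(Debug, Clone)]
struct TokenRef {
    leading: String,  // whitespace and comments before the token
    text: String,     // the token itself
    trailing: String, // whitespace and comments after the token
}

impl TokenRef {
    fn new(leading: &str, text: &str, trailing: &str) -> Self {
        TokenRef {
            leading: leading.to_string(),
            text: text.to_string(),
            trailing: trailing.to_string(),
        }
    }
}

impl std::fmt::Display for TokenRef {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}{}{}", self.leading, self.text, self.trailing)
    }
}

/// Toy node for `local <name> = <value>`; every field keeps its own trivia.
struct LocalAssignment {
    local_token: TokenRef,
    name: TokenRef,
    equal_token: TokenRef,
    value: TokenRef,
}

impl std::fmt::Display for LocalAssignment {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}{}{}{}", self.local_token, self.name, self.equal_token, self.value)
    }
}

/// Hand-built node for the source "local  x   =  1 -- answer\n".
fn sample() -> LocalAssignment {
    LocalAssignment {
        local_token: TokenRef::new("", "local", "  "),
        name: TokenRef::new("", "x", "   "),
        equal_token: TokenRef::new("", "=", "  "),
        value: TokenRef::new("", "1", " -- answer\n"),
    }
}
```

Printing `sample()` with `to_string()` yields the original line, and setting `name.text` to another identifier changes only that identifier: the irregular spacing and the trailing comment are stored on neighbouring fields, so they come back out untouched.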
Parsing with Full Moon
See the parsing entry point (src/lib.rs:79-94).
Trivia Types
Full Moon tracks trivia types, including whitespace, single-line comments, and multi-line comments (src/tokenizer/structs.rs:242-323).
Position Tracking
Every token includes precise position information (src/tokenizer/structs.rs:852-875). This allows tools to:
- Report error locations accurately
- Implement “go to definition” features
- Create precise source maps
- Track changes across edits
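One way to picture position tracking is as a running cursor that advances over each token's text. The sketch below is an assumption-laden illustration, not Full Moon's implementation: `Position`, `advance`, and `locate` are hypothetical names, and the choice of 1-based lines and columns is made here for the example only.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Position {
    bytes: usize,  // byte offset from the start of the source
    line: usize,   // 1-based line number
    column: usize, // 1-based column, counted in characters
}

/// Advance `pos` past `text`, starting a new line at each '\n'.
fn advance(mut pos: Position, text: &str) -> Position {
    for c in text.chars() {
        pos.bytes += c.len_utf8();
        if c == '\n' {
            pos.line += 1;
            pos.column = 1;
        } else {
            pos.column += 1;
        }
    }
    pos
}

/// Pair each token's text with its start and end position.
fn locate<'a>(tokens: &[&'a str]) -> Vec<(&'a str, Position, Position)> {
    let mut pos = Position { bytes: 0, line: 1, column: 1 };
    let mut out = Vec::new();
    for &text in tokens {
        let end = advance(pos, text);
        out.push((text, pos, end));
        pos = end;
    }
    out
}
```

Because trivia tokens are part of the input to `locate`, the positions of later tokens account for every space, newline, and comment before them, which is what makes the reported locations line up with the original file.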
Full Moon is heavily inspired by benjamn’s recast for JavaScript, which pioneered the lossless parsing approach for code transformation tools.
Use Cases
Lossless parsing enables these applications:
- Code Formatters: Reformat code while preserving intentional style choices
- Refactoring Tools: Rename variables or restructure code without breaking formatting
- Static Analysis: Analyze code and report issues with exact locations
- Code Generation: Insert generated code that matches surrounding style
- Documentation Tools: Extract comments and associate them with code elements
Comparison: Lossy vs Lossless
| Aspect | Lossy Parser | Full Moon (Lossless) |
|---|---|---|
| Comments | Lost | Preserved |
| Whitespace | Normalized | Exact |
| Round-trip | Impossible | Perfect |
| Style | Standardized | Maintained |
| Use for formatting | No | Yes |
| Use for analysis | Limited | Full |
Next Steps
- AST Structure: Learn about the Abstract Syntax Tree structure
- Tokenization: Deep dive into the tokenization process