Full-moon is a lossless parser, meaning it preserves all formatting information including whitespace, comments, and style choices. This makes it ideal for building code formatters that can modify code structure while maintaining the original formatting intent.

Understanding Lossless Parsing

Unlike traditional AST parsers that discard formatting details, full-moon stores everything:
  • Whitespace: Spaces, tabs, and newlines are stored as trivia on token references
  • Comments: Single-line and multi-line comments are stored as trivia alongside whitespace
  • Style choices: Parentheses, quote types, and other stylistic elements are maintained
This means you can parse code, make modifications, and print it back without losing the developer’s formatting preferences.
When you parse code with full-moon and immediately print it back using .to_string(), you get exactly the original code, byte-for-byte.
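To make the idea concrete, here is a toy sketch in plain Rust. It is not how full-moon is implemented; `ToyToken`, `toy_tokenize`, and `toy_print` are hypothetical names invented for this illustration. Each token keeps the whitespace that precedes it as leading trivia, so printing the tokens back reproduces the input exactly.

```rust
/// A toy token that carries the whitespace preceding it.
/// Hypothetical type for illustration, not part of full-moon.
#[derive(Debug, Clone, PartialEq)]
struct ToyToken {
    leading_trivia: String,
    text: String,
}

/// Splits `source` into whitespace-separated words and keeps every
/// whitespace run as trivia. Whitespace at the very end of the input
/// is returned separately so it is not lost.
fn toy_tokenize(source: &str) -> (Vec<ToyToken>, String) {
    let mut tokens = Vec::new();
    let mut trivia = String::new();
    let mut text = String::new();

    for ch in source.chars() {
        if ch.is_whitespace() {
            // A whitespace character ends the word in progress, if any.
            if !text.is_empty() {
                tokens.push(ToyToken {
                    leading_trivia: std::mem::take(&mut trivia),
                    text: std::mem::take(&mut text),
                });
            }
            trivia.push(ch);
        } else {
            text.push(ch);
        }
    }
    if !text.is_empty() {
        tokens.push(ToyToken {
            leading_trivia: std::mem::take(&mut trivia),
            text: std::mem::take(&mut text),
        });
    }
    (tokens, trivia)
}

/// Prints the tokens back. Nothing was discarded during tokenizing,
/// so the result equals the original input.
fn toy_print(tokens: &[ToyToken], trailing: &str) -> String {
    let mut out = String::new();
    for token in tokens {
        out.push_str(&token.leading_trivia);
        out.push_str(&token.text);
    }
    out.push_str(trailing);
    out
}
```

Calling `toy_tokenize` on a string and passing the result to `toy_print` returns the same string, including odd spacing and tabs. A formatter built on this model changes only the trivia it cares about and leaves the rest alone.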

Example: Indentation Normalizer

This formatter normalizes indentation while preserving everything else.
use full_moon::{
    ast,
    parse,
    tokenizer::{Token, TokenReference, TokenType},
    visitors::VisitorMut,
};

struct IndentationFormatter {
    // Number of blocks currently open; the top-level block counts as 1
    depth: usize,
    indent_string: String,
}

impl IndentationFormatter {
    fn new(spaces: usize) -> Self {
        Self {
            depth: 0,
            indent_string: " ".repeat(spaces),
        }
    }

    // Indentation for a statement in the current block (none at the top level)
    fn current_indent(&self) -> String {
        self.indent_string.repeat(self.depth.saturating_sub(1))
    }

    // Replaces the indentation in front of a token.
    // Assumes the token is the first one on its line.
    fn add_indent_to_token(&self, token: TokenReference) -> TokenReference {
        let mut trivia: Vec<Token> = token.leading_trivia().cloned().collect();

        // The newline that ends the previous line belongs to the previous
        // token's trailing trivia, so the old indentation is the newline-free
        // whitespace at the end of this token's leading trivia. Drop it.
        while matches!(
            trivia.last().map(Token::token_type),
            Some(TokenType::Whitespace { characters }) if !characters.contains('\n')
        ) {
            trivia.pop();
        }

        // Comments and blank lines before the token are kept as they are
        let indent = self.current_indent();
        if !indent.is_empty() {
            trivia.push(Token::new(TokenType::Whitespace {
                characters: indent.into(),
            }));
        }

        token.with_leading_trivia(trivia)
    }
}

impl VisitorMut for IndentationFormatter {
    // The traversal calls visit_block before a block's statements and
    // visit_block_end after them, so these two hooks track nesting depth.
    // Do not call visit_mut on a node from inside its own hook: the
    // traversal visits children itself, and doing so would recurse forever.
    fn visit_block(&mut self, block: ast::Block) -> ast::Block {
        self.depth += 1;
        block
    }

    fn visit_block_end(&mut self, block: ast::Block) -> ast::Block {
        self.depth -= 1;
        block
    }

    fn visit_function_declaration(&mut self, func: ast::FunctionDeclaration) -> ast::FunctionDeclaration {
        let function_token = self.add_indent_to_token(func.function_token().clone());
        func.with_function_token(function_token)
    }

    // The closing `end` of a function lines up with its `function` keyword
    fn visit_function_body(&mut self, body: ast::FunctionBody) -> ast::FunctionBody {
        let end_token = self.add_indent_to_token(body.end_token().clone());
        body.with_end_token(end_token)
    }

    fn visit_local_assignment(&mut self, local: ast::LocalAssignment) -> ast::LocalAssignment {
        let local_token = self.add_indent_to_token(local.local_token().clone());
        local.with_local_token(local_token)
    }

    // `if`, `else`, and `end` share the indentation of the enclosing block
    // (`elseif` branches are not handled in this example)
    fn visit_if(&mut self, if_stmt: ast::If) -> ast::If {
        let if_token = self.add_indent_to_token(if_stmt.if_token().clone());
        let else_token = if_stmt
            .else_token()
            .map(|token| self.add_indent_to_token(token.clone()));
        let end_token = self.add_indent_to_token(if_stmt.end_token().clone());
        if_stmt
            .with_if_token(if_token)
            .with_else_token(else_token)
            .with_end_token(end_token)
    }

    fn visit_return(&mut self, ret: ast::Return) -> ast::Return {
        let token = self.add_indent_to_token(ret.token().clone());
        ret.with_token(token)
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let code = r#"
function calculate(x)
if x > 10 then
return x * 2
else
return x
end
end
    "#;

    println!("Original code:\n{}", code);
    
    let ast = parse(code)?;
    let mut formatter = IndentationFormatter::new(4);
    let formatted_ast = formatter.visit_ast(ast);
    
    println!("\nFormatted code:\n{}", formatted_ast.to_string());
    
    Ok(())
}

Example: String Quote Normalizer

This formatter converts single-quoted strings to double quotes. Strings that already use double quotes, and long bracket strings, are left unchanged.
use full_moon::{
    parse,
    tokenizer::{StringLiteralQuoteType, Token, TokenType},
    visitors::VisitorMut,
};

struct StringQuoteFormatter;

impl VisitorMut for StringQuoteFormatter {
    fn visit_string_literal(&mut self, token: Token) -> Token {
        match token.token_type() {
            // `literal` holds the string contents without the surrounding
            // quotes, so the quote style is read from `quote_type`.
            TokenType::StringLiteral {
                literal,
                quote_type: StringLiteralQuoteType::Single,
                ..
            } => {
                // Simple text replacement: a double quote that is already
                // escaped (\") would be escaped a second time.
                let escaped = literal
                    .replace("\\'", "'") // Unescape single quotes
                    .replace('"', "\\\""); // Escape double quotes

                // Field names follow the original example and may differ
                // between full-moon releases.
                Token::new(TokenType::StringLiteral {
                    literal: escaped.into(),
                    multi_line: None,
                    quote_type: StringLiteralQuoteType::Double,
                })
            }
            _ => token,
        }
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let code = r#"
        local name = 'John'
        local message = 'Hello, world!'
        local mixed = "Already double"
        print('Single quoted')
    "#;

    let ast = parse(code)?;
    let mut formatter = StringQuoteFormatter;
    let formatted = formatter.visit_ast(ast);
    
    println!("Formatted code:\n{}", formatted);
    
    Ok(())
}
Output:
local name = "John"
local message = "Hello, world!"
local mixed = "Already double"
print("Single quoted")
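The text replacement used in the example can mangle strings that already contain escape sequences such as `\"` or `\\`. One possible refinement, sketched here in plain Rust, is to walk the contents one escape sequence at a time. The helper name `requote_single_to_double` is hypothetical and not part of full-moon; it assumes its input is the contents of a single-quoted Lua string with the quotes already removed.

```rust
/// Rewrites the contents of a single-quoted Lua string so that they are
/// valid between double quotes. Hypothetical helper, not part of full-moon.
fn requote_single_to_double(contents: &str) -> String {
    let mut out = String::with_capacity(contents.len());
    let mut chars = contents.chars();

    while let Some(ch) = chars.next() {
        match ch {
            // Consume the backslash together with the character it escapes,
            // so sequences like `\\` and `\"` are never split apart.
            '\\' => match chars.next() {
                // `\'` does not need its backslash between double quotes.
                Some('\'') => out.push('\''),
                // Any other escape is copied through unchanged.
                Some(next) => {
                    out.push('\\');
                    out.push(next);
                }
                // A lone trailing backslash is copied as-is.
                None => out.push('\\'),
            },
            // A bare double quote must now be escaped.
            '"' => out.push_str("\\\""),
            other => out.push(other),
        }
    }
    out
}
```

In a visitor like the one above, the result of this helper could be used as the new `literal` in place of the chained `replace` calls.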

Example: Comprehensive Code Formatter

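A configurable formatter driven by a config struct. As written it applies one rule, the spacing around =; the remaining options are placeholders for further rules.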
use full_moon::{
    ast,
    parse,
    tokenizer::TokenReference,
    visitors::VisitorMut,
};

struct CodeFormatter {
    config: FormatterConfig,
}

struct FormatterConfig {
    spaces_around_equals: bool,
    // The two options below are placeholders; this example does not
    // implement them yet.
    space_after_comma: bool,
    trailing_comma: bool,
}

impl Default for FormatterConfig {
    fn default() -> Self {
        Self {
            spaces_around_equals: true,
            space_after_comma: true,
            trailing_comma: false,
        }
    }
}

impl CodeFormatter {
    fn new(config: FormatterConfig) -> Self {
        Self { config }
    }

    // Builds a fresh `=` token with or without surrounding spaces.
    // Replacing a token discards any trivia (whitespace or comments)
    // attached to the original, and code that is already spaced may end
    // up with doubled spaces because neighbouring tokens keep their trivia.
    fn equal_token(&self) -> TokenReference {
        let text = if self.config.spaces_around_equals {
            " = "
        } else {
            "="
        };
        TokenReference::symbol(text).expect("`=` is a valid symbol")
    }
}

impl VisitorMut for CodeFormatter {
    fn visit_local_assignment(&mut self, local: ast::LocalAssignment) -> ast::LocalAssignment {
        // A declaration such as `local x` has no equal token; leave it alone
        if local.equal_token().is_some() {
            let new_equal = self.equal_token();
            local.with_equal_token(Some(new_equal))
        } else {
            local
        }
    }

    fn visit_assignment(&mut self, assignment: ast::Assignment) -> ast::Assignment {
        let new_equal = self.equal_token();
        assignment.with_equal_token(new_equal)
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let code = r#"
        local x=1
        local y,z=2,3
        name="Alice"
    "#;

    println!("Before formatting:\n{}", code);
    
    let ast = parse(code)?;
    let mut formatter = CodeFormatter::new(FormatterConfig::default());
    let formatted = formatter.visit_ast(ast);
    
    println!("\nAfter formatting:\n{}", formatted);
    
    Ok(())
}
Output (leading whitespace omitted; commas are untouched because only the = rule is implemented):
local x = 1
local y,z = 2,3
name = "Alice"

Working with Trivia

Trivia (whitespace and comments) is stored in token references. Here’s how to work with it:
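use full_moon::{
    node::Node, // Provides tokens() on AST nodes
    tokenizer::{Token, TokenType},
};

// Reading trivia (`node` is any AST node, for example a statement)
for token in node.tokens() {
    // Leading trivia (whitespace/comments before the token)
    for trivia in token.leading_trivia() {
        match trivia.token_type() {
            TokenType::Whitespace { characters } => {
                println!("Whitespace: {:?}", characters);
            }
            TokenType::SingleLineComment { comment } => {
                // `comment` holds the text after the leading "--"
                println!("Comment: --{}", comment);
            }
            _ => {}
        }
    }

    // Trailing trivia (whitespace/comments after the token, up to the end of the line)
    for trivia in token.trailing_trivia() {
        // Process trailing trivia
    }
}

// Modifying trivia (`token` is a TokenReference)
let new_token = token.with_leading_trivia(vec![
    Token::new(TokenType::Whitespace {
        characters: "    ".into(),
    }),
    // The "--" prefix is added when the comment is printed
    Token::new(TokenType::SingleLineComment {
        comment: " formatted".into(),
    }),
    // End the comment line so the token itself is not commented out
    Token::new(TokenType::Whitespace {
        characters: "\n".into(),
    }),
]);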

Key Concepts

1. Use VisitorMut for modifications
Unlike Visitor, the VisitorMut trait returns modified nodes, allowing you to transform the AST.

2. Preserve what you don't change
Only modify the specific tokens/nodes you want to format. Everything else remains unchanged.

3. Work with trivia carefully
Comments and whitespace are in trivia. Use with_leading_trivia() and with_trailing_trivia() to modify them.

4. Test round-trip formatting
Parse, format, print, parse again to ensure your formatter produces valid Lua.
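
The round-trip check in the last step can be wrapped in a small helper. The sketch below is plain Rust and makes two assumptions: `parse` and `format` are closures standing in for your parser call and your formatter, and the name `check_round_trip` is hypothetical. It also adds an idempotency check (formatting twice gives the same text), which goes beyond the step above but tends to catch formatters that keep shifting whitespace.

```rust
/// Runs a formatter on one input and reports the first problem found.
/// `parse` and `format` are stand-ins supplied by the caller.
fn check_round_trip<T, E>(
    source: &str,
    parse: impl Fn(&str) -> Result<T, E>,
    format: impl Fn(&str) -> String,
) -> Result<(), String> {
    // The input itself must be valid, otherwise the test says nothing.
    if parse(source).is_err() {
        return Err("input does not parse".to_string());
    }

    // Formatting must not break the code.
    let once = format(source);
    if parse(&once).is_err() {
        return Err(format!("formatted output no longer parses:\n{once}"));
    }

    // Formatting already formatted code should change nothing.
    let twice = format(&once);
    if once != twice {
        return Err("formatter is not idempotent".to_string());
    }

    Ok(())
}
```

With full-moon, `parse` could wrap the library's parse function and `format` could parse, run the visitor, and print the result; running the helper over a folder of real Lua files makes a simple regression suite.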

Best Practices

  • Preserve comments: Never discard comments unless explicitly intended
  • Maintain blank lines: Respect the developer’s use of blank lines for code organization
  • Handle edge cases: Consider multi-line strings, nested structures, and edge whitespace
  • Make it configurable: Use a config struct to allow users to customize formatting rules
  • Test thoroughly: Format real-world code and verify the output is syntactically correct
Performance tip: The VisitorMut pattern creates new nodes as it traverses. For large files, consider whether you need to format everything or just specific sections.

Next Steps

  • Static Analysis: Analyze code without modification using the Visitor pattern
  • AST Transformation: Learn advanced transformation techniques
