The Position type provides precise location information for tokens in source code, tracking bytes, lines, and character positions.

Position

Used to represent exact positions of tokens in code.
pub struct Position {
    pub(crate) bytes: usize,
    pub(crate) line: usize,
    pub(crate) character: usize,
}

Methods

bytes

Byte offset of this position from the start of the source, counted without regard to line breaks.
pub fn bytes(self) -> usize

character

Index of the character on the line for this position.
pub fn character(self) -> usize

line

Line the position lies on.
pub fn line(self) -> usize

Usage Example

Positions are automatically tracked by the tokenizer and attached to every token:
use full_moon::tokenizer::Lexer;
use full_moon::LuaVersion;

let source = "local x = 5";
let lexer = Lexer::new(source, LuaVersion::lua51());
let tokens = lexer.collect().unwrap();

for token in tokens {
    let start = token.start_position();
    let end = token.end_position();
    
    println!(
        "Token at line {}, characters {}-{}: {:?}",
        start.line(),
        start.character(),
        end.character(),
        token.token_type()
    );
}

Position Tracking

Positions are tracked in three ways:
  1. Bytes: Total byte offset from the start of the source code
  2. Line: Line number (1-indexed)
  3. Character: Character position within the line (1-indexed)
Together, these three values support:
  • Accurate error reporting
  • Source code navigation
  • Precise token location for IDE features
  • Byte-level operations on UTF-8 source code
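The three counters can be sketched with the standard library alone. This is a minimal illustration, not full_moon's implementation: `Pos` and `advance_over` are hypothetical names, and the sketch assumes the character index resets to 1 after each newline.

```rust
/// Hypothetical stand-in for a tracked position; not full_moon's `Position`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Pos {
    pub bytes: usize,     // byte offset from the start of the source
    pub line: usize,      // 1-indexed line number
    pub character: usize, // 1-indexed character index within the line
}

/// Returns the position reached after consuming all of `text`,
/// starting from the beginning of a source (byte 0, line 1, character 1).
pub fn advance_over(text: &str) -> Pos {
    let mut pos = Pos { bytes: 0, line: 1, character: 1 };
    for ch in text.chars() {
        // Bytes grow by the UTF-8 length of the character (1 to 4).
        pos.bytes += ch.len_utf8();
        if ch == '\n' {
            // Assumption of this sketch: a newline starts a new line at character 1.
            pos.line += 1;
            pos.character = 1;
        } else {
            pos.character += 1;
        }
    }
    pos
}
```

Under these assumptions, `advance_over("a\nbc")` ends at byte 4, line 2, character 3.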

Comparison and Ordering

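Positions implement Ord and are ordered by their byte position alone; line and character take no part in the comparison:
use full_moon::tokenizer::Lexer;
use full_moon::LuaVersion;

// Position's fields are crate-private, so positions are read from tokens
// rather than built with a struct literal.
let tokens = Lexer::new("local x", LuaVersion::lua51()).collect().unwrap();
let pos1 = tokens[0].start_position(); // start of `local`
let pos2 = tokens[0].end_position(); // end of `local`

assert!(pos1 < pos2);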
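One practical use of byte-based ordering is checking whether a position falls inside a span. The sketch below is std-only and uses a hypothetical `OrderedPos` type and `span_contains` helper, neither of which is part of full_moon. It defines equality on bytes as well, purely so that equality and ordering agree within the sketch.

```rust
use std::cmp::Ordering;

/// Hypothetical position type ordered by byte offset only; not full_moon's `Position`.
#[derive(Clone, Copy, Debug)]
pub struct OrderedPos {
    pub bytes: usize,
    pub line: usize,
    pub character: usize,
}

// Equality is defined on bytes here so that it stays consistent with `Ord`.
// This is a choice made for the sketch.
impl PartialEq for OrderedPos {
    fn eq(&self, other: &Self) -> bool {
        self.bytes == other.bytes
    }
}

impl Eq for OrderedPos {}

impl Ord for OrderedPos {
    fn cmp(&self, other: &Self) -> Ordering {
        // Only the byte offset decides the order.
        self.bytes.cmp(&other.bytes)
    }
}

impl PartialOrd for OrderedPos {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Returns true if `pos` lies in the half-open span `[start, end)`.
pub fn span_contains(start: OrderedPos, end: OrderedPos, pos: OrderedPos) -> bool {
    start <= pos && pos < end
}
```

Because the type implements `Ord`, standard helpers such as `min`, `max`, and `sort` work on it without extra code.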

Default Position

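Position implements Default, which sets all three fields to zero. Because tracked lines and characters are 1-indexed, a default position acts as a placeholder rather than a real location in the source: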
use full_moon::tokenizer::Position;

let default_pos = Position::default();
assert_eq!(default_pos.bytes(), 0);
assert_eq!(default_pos.line(), 0);
assert_eq!(default_pos.character(), 0);

Usage with Tokens

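Every token has both a start and end position. A token built by hand with Token::new is not tied to any source, so both of its positions are zeroed; tokens produced by the tokenizer carry real locations: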
use full_moon::tokenizer::{Token, TokenType};
use full_moon::ShortString;

let token = Token::new(TokenType::Identifier {
    identifier: ShortString::new("hello"),
});

let start = token.start_position();
let end = token.end_position();

// Positions can be used to extract source text or report errors
println!("Token spans from byte {} to byte {}", start.bytes(), end.bytes());
println!(
    "Located at line {}, characters {}-{}",
    start.line(),
    start.character(),
    end.character()
);
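The comment above mentions extracting source text from positions. A minimal std-only sketch of that idea follows. The helper name `slice_by_bytes` is hypothetical, and the sketch assumes the end offset is exclusive (one past the last byte of the token), which should be checked against the tokenizer's actual behavior.

```rust
/// Returns the source text between two byte offsets, or `None` if the range
/// is reversed, out of bounds, or does not fall on UTF-8 character boundaries.
/// Assumption of this sketch: `end` is exclusive.
pub fn slice_by_bytes(source: &str, start: usize, end: usize) -> Option<&str> {
    // `str::get` performs the bounds and char-boundary checks without panicking.
    source.get(start..end)
}
```

With real tokens, the two offsets would come from `token.start_position().bytes()` and `token.end_position().bytes()`. Using `get` instead of direct indexing avoids a panic if an offset lands in the middle of a multi-byte character.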

Multi-byte Characters

The tokenizer correctly handles UTF-8 multi-byte characters:
  • Bytes increment by the actual UTF-8 byte length of each character
  • Character increments by 1 for each Unicode character (regardless of byte length)
  • Line increments on newline characters
This ensures accurate position tracking even with Unicode source code:
use full_moon::tokenizer::Lexer;
use full_moon::LuaVersion;

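// Multi-byte Chinese characters inside a string literal, which is valid in every Lua version
let source = "local s = \"变量\"";
let lexer = Lexer::new(source, LuaVersion::lua51());
let tokens = lexer.collect().unwrap();

// Each Chinese character adds 3 to the byte offset but only 1 to the character index
for token in tokens {
    let end = token.end_position();
    println!("{:?} ends at byte {}, character {}", token.token_type(), end.bytes(), end.character());
}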
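To make the difference between byte offsets and character indices concrete, here is a std-only sketch that locates a character in a single line of text and reports both values. `locate` is a hypothetical helper and does not use full_moon. It assumes a 1-indexed character column, matching the convention described above.

```rust
/// Returns the byte offset and the 1-indexed character column of the first
/// occurrence of `needle` in a single-line `source`, or `None` if it is absent.
pub fn locate(source: &str, needle: char) -> Option<(usize, usize)> {
    // `find` returns a byte offset, which always lies on a char boundary.
    let byte_offset = source.find(needle)?;
    // Count the characters before that offset, then shift to 1-indexed.
    let column = source[..byte_offset].chars().count() + 1;
    Some((byte_offset, column))
}
```

For the line `变量 = 5`, `locate` finds `=` at byte offset 7 but character column 4, because each of the two Chinese characters occupies 3 bytes in UTF-8.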
