Overview
The Lexer class performs lexical analysis on AXON source code, converting raw text into a sequence of typed tokens. It is the first phase of the compilation pipeline.
Class: Lexer
Constructor
The AXON source code to tokenize
The source file name for error messages
Methods
tokenize() -> list[Token]
Scan the entire source and return all tokens.
Returns: List of Token objects, ending with an EOF token
Raises: AxonLexerError if invalid syntax is encountered
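The contract described above (a flat list of tokens that always ends with an EOF token) can be illustrated with a toy scanner. This is a minimal sketch, not the AXON implementation: the `Token` fields, the `IDENTIFIER` type, and the standalone `tokenize` function are assumptions made for illustration, and only keywords and bare words are handled.

```python
import re
from dataclasses import dataclass


# Hypothetical stand-in for the real Token class; field names are assumptions.
@dataclass
class Token:
    type: str
    value: str


KEYWORDS = {"persona", "context", "anchor", "flow", "step",
            "reason", "validate", "memory", "tool", "run"}


def tokenize(source: str) -> list[Token]:
    """Toy scanner: recognizes keywords and bare words, then appends EOF."""
    tokens = []
    for word in re.findall(r"[A-Za-z_]\w*", source):
        # IDENTIFIER is an assumed type name for non-keyword words.
        kind = word.upper() if word in KEYWORDS else "IDENTIFIER"
        tokens.append(Token(kind, word))
    # The list always ends with an EOF sentinel, even for empty input.
    tokens.append(Token("EOF", ""))
    return tokens


# A consumer can loop until it sees EOF instead of checking list bounds.
position = 0
stream = tokenize("flow greet step hello")
while stream[position].type != "EOF":
    print(stream[position].type, stream[position].value)
    position += 1
```

Ending the list with a sentinel lets a parser peek at the current token without a separate end-of-input check.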
Token Types
The lexer recognizes the following token types:
Keywords
| Token | Example |
|---|---|
| PERSONA | persona |
| CONTEXT | context |
| ANCHOR | anchor |
| FLOW | flow |
| STEP | step |
| REASON | reason |
| VALIDATE | validate |
| MEMORY | memory |
| TOOL | tool |
| RUN | run |
Literals
| Token | Example | Description |
|---|---|---|
| STRING | "Hello world" | Double-quoted strings with escape sequences |
| INTEGER | 42 | Integer literals |
| FLOAT | 3.14 | Floating-point literals |
| DURATION | 30s, 5m, 1h | Duration literals with units |
| BOOL | true, false | Boolean literals |
Operators & Punctuation
| Token | Symbol | Description |
|---|---|---|
| ARROW | -> | Arrow operator |
| DOTDOT | .. | Range operator |
| EQ | == | Equality comparison |
| NEQ | != | Not equal comparison |
| LT | < | Less than |
| GT | > | Greater than |
| LTE | <= | Less than or equal |
| GTE | >= | Greater than or equal |
| LBRACE | { | Left brace |
| RBRACE | } | Right brace |
| LPAREN | ( | Left parenthesis |
| RPAREN | ) | Right parenthesis |
| LBRACKET | [ | Left bracket |
| RBRACKET | ] | Right bracket |
| COLON | : | Colon |
| COMMA | , | Comma |
| DOT | . | Dot |
| QUESTION | ? | Question mark (optional type) |
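Several symbols in this table share a prefix (`<` and `<=`, `.` and `..`), so a scanner has to prefer the longer match. The sketch below shows one common way to do that, by trying the two-character symbol first. The `scan_operators` helper is hypothetical and handles only operators and whitespace; how the AXON lexer resolves these cases internally is not stated here.

```python
# Symbol-to-token mapping copied from the table above.
OPERATORS = {
    "->": "ARROW", "..": "DOTDOT", "==": "EQ", "!=": "NEQ",
    "<=": "LTE", ">=": "GTE", "<": "LT", ">": "GT",
    "{": "LBRACE", "}": "RBRACE", "(": "LPAREN", ")": "RPAREN",
    "[": "LBRACKET", "]": "RBRACKET", ":": "COLON", ",": "COMMA",
    ".": "DOT", "?": "QUESTION",
}


def scan_operators(text: str) -> list[str]:
    """Return token names for a string made only of operators and spaces."""
    names = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        # Try the two-character symbol first so "<=" is not read as "<".
        two = text[i:i + 2]
        if len(two) == 2 and two in OPERATORS:
            names.append(OPERATORS[two])
            i += 2
        elif text[i] in OPERATORS:
            names.append(OPERATORS[text[i]])
            i += 1
        else:
            raise ValueError(f"unexpected character {text[i]!r} at offset {i}")
    return names


print(scan_operators("<= < -> .. . ?"))
# ['LTE', 'LT', 'ARROW', 'DOTDOT', 'DOT', 'QUESTION']
```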
Token Structure
Features
String Escape Sequences
The lexer supports standard escape sequences in string literals:
- \n: newline
- \t: tab
- \\: backslash
- \": double quote
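To make the four escapes concrete, here is a minimal sketch of how a double-quoted literal could be decoded into its runtime value. The `decode_string_literal` helper is hypothetical, and rejecting unknown escapes with an error is an assumption; the behavior of the AXON lexer on sequences outside this list is not documented here.

```python
# The four escape sequences listed above, keyed by the character after "\".
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode_string_literal(literal: str) -> str:
    """Decode a double-quoted literal, passed with its surrounding quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string literal must be wrapped in double quotes")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash at end of string")
            code = body[i + 1]
            if code not in ESCAPES:
                # Assumption: unknown escapes are errors rather than literals.
                raise ValueError(f"unknown escape sequence \\{code}")
            out.append(ESCAPES[code])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(decode_string_literal(r'"line1\nline2"')))
# 'line1\nline2'
```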
Comment Stripping
Both line comments (//) and block comments (/* */) are automatically removed.
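The effect of comment stripping can be sketched as a standalone pass over the text. This is an illustration under assumptions, not the lexer's actual code path: the `strip_comments` helper is hypothetical, block comments are treated as non-nesting, and an unterminated block comment is reported as an error.

```python
def strip_comments(source: str) -> str:
    """Remove // and /* */ comments while leaving string literals intact."""
    out = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch == '"':
            # Copy string literals verbatim so "//" inside quotes survives.
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("//", i):
            # Line comment: skip up to the newline, which is kept.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif source.startswith("/*", i):
            # Block comment: skip past the closing "*/" (no nesting assumed).
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(strip_comments('step a // note\nrun /* x */ b')))
# 'step a \nrun  b'
```

Keeping the newline after a line comment preserves line numbers for later error reporting.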
Duration Literals
Duration values with time units are recognized as single tokens. Supported units:
- s (seconds)
- ms (milliseconds)
- m (minutes)
- h (hours)
- d (days)
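One way to recognize these literals is a regular expression that lists the longer unit first, so that `ms` is not split into `m` plus a stray `s`. The pattern and the `duration_to_milliseconds` helper below are a hypothetical sketch; it assumes integer amounts only, since this page does not say whether fractional durations are accepted.

```python
import re

# "ms" comes before "m" and "s" in the alternation so it wins the match.
DURATION_RE = re.compile(r"(\d+)(ms|s|m|h|d)")

# Milliseconds per unit, kept as integers to avoid floating-point rounding.
UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def duration_to_milliseconds(text: str) -> int:
    """Convert a literal such as '30s' or '250ms' to milliseconds."""
    match = DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a duration literal: {text!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_MS[unit]


for literal in ("30s", "5m", "1h", "250ms"):
    print(literal, duration_to_milliseconds(literal))
```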
Error Handling
AxonLexerError
Raised when invalid syntax is encountered. Attributes:
- message: str - Human-readable error description
- line: int - Line number where error occurred
- column: int - Column number where error occurred
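The sketch below shows how the three documented fields might be read when handling an error. The exception class here is a local stand-in that mirrors the documented attributes, not the library's own class, and the `check_string_terminated` helper, the error message text, and the 1-based line and column numbering are all assumptions.

```python
class AxonLexerError(Exception):
    """Stand-in mirroring the documented fields; not the library's class."""

    def __init__(self, message: str, line: int, column: int):
        super().__init__(f"{message} (line {line}, column {column})")
        self.message = message
        self.line = line
        self.column = column


def check_string_terminated(source: str) -> None:
    """Raise if a double-quoted string is still open at the end of a line."""
    for line_no, text in enumerate(source.splitlines(), start=1):
        in_string = False
        start_col = 0
        col = 0
        while col < len(text):
            ch = text[col]
            if ch == "\\" and in_string:
                col += 2  # skip the escaped character
                continue
            if ch == '"':
                in_string = not in_string
                start_col = col + 1  # 1-based column (assumed convention)
            col += 1
        if in_string:
            raise AxonLexerError("Unterminated string", line_no, start_col)


try:
    check_string_terminated('persona "Bot')
except AxonLexerError as err:
    print(err.line, err.column, err.message)
# 1 9 Unterminated string
```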
Common Errors
Example: Full Tokenization
Performance
The lexer is a single-pass scanner with O(n) time complexity, where n is the length of the source code. It’s suitable for files up to several MB in size.
Next Steps
Parser API
Learn how to parse tokens into an abstract syntax tree
