Vocabulary

Overview

The Game Grammar vocabulary consists of 74 tokens organized into 9 categories. Each token is assigned a unique integer ID from 0 to 73.
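As a rough illustration of how the ID space is partitioned, the sketch below maps a token ID to its category using the ID ranges documented in the sections that follow. `CATEGORY_RANGES` and `token_category` are hypothetical names introduced for this example; they are not part of `game_grammar.vocab`.

```python
# Hypothetical lookup table: (category, first ID, last ID), copied from
# the ID ranges documented on this page.
CATEGORY_RANGES = [
    ("structural", 0, 3),
    ("entity", 4, 6),
    ("direction", 7, 10),
    ("input", 11, 14),
    ("position_x", 15, 24),
    ("position_y", 25, 34),
    ("event", 35, 41),
    ("value", 42, 52),
    ("length", 53, 73),
]


def token_category(token_id: int) -> str:
    """Return the category name for a token ID in the range 0-73."""
    for name, first, last in CATEGORY_RANGES:
        if first <= token_id <= last:
            return name
    raise ValueError(f"token ID out of range: {token_id}")


# The category sizes add up to the documented vocabulary size.
total = sum(last - first + 1 for _, first, last in CATEGORY_RANGES)
print(total)               # 74
print(token_category(35))  # event
```

Because the ranges are contiguous and non-overlapping, a linear scan is enough here; nothing about this layout is guaranteed to stay fixed if the vocabulary changes.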

Token Categories

Structural Tokens (4 tokens)

Mark sequence boundaries, tick boundaries, and state snapshots.

BOS - Beginning of sequence (ID: 0)
EOS - End of sequence (ID: 1)
TICK - Marks a game tick boundary (ID: 2)
SNAP - Marks a state snapshot (ID: 3)

Entity Tokens (3 tokens)

Identify game entities in snapshots and events.

PLAYER - The snake entity (ID: 4)
FOOD - Food entity (ID: 5)
WALL - Wall boundary (ID: 6)

Direction Tokens (4 tokens)

Represent snake heading direction.

DIR_U - Direction up (ID: 7)
DIR_D - Direction down (ID: 8)
DIR_L - Direction left (ID: 9)
DIR_R - Direction right (ID: 10)

Input Tokens (4 tokens)

Represent player input actions.

INPUT_U - Input up (ID: 11)
INPUT_D - Input down (ID: 12)
INPUT_L - Input left (ID: 13)
INPUT_R - Input right (ID: 14)

Position X Tokens (10 tokens)

X-coordinates on the 10×10 grid.

X0 through X9 (IDs: 15-24)

Example:

VOCAB["X0"]  # 15
VOCAB["X5"]  # 20
VOCAB["X9"]  # 24

Position Y Tokens (10 tokens)

Y-coordinates on the 10×10 grid.

Y0 through Y9 (IDs: 25-34)

Example:

VOCAB["Y0"]  # 25
VOCAB["Y5"]  # 30
VOCAB["Y9"]  # 34

Event Type Tokens (7 tokens)

Game events and actions.

MOVE - Snake moves to new position (ID: 35)
EAT - Snake eats food (ID: 36)
GROW - Snake grows in length (ID: 37)
DIE_WALL - Snake hits wall (ID: 38)
DIE_SELF - Snake hits itself (ID: 39)
FOOD_SPAWN - Food spawns at new location (ID: 40)
SCORE - Score update marker (ID: 41)

Value Tokens (11 tokens)

Numerical values (scores, counts).

V0 through V10 (IDs: 42-52)
Used for scores 0-10 (capped at V10 for scores > 10)

Example:

VOCAB["V0"]   # 42 (score: 0)
VOCAB["V5"]   # 47 (score: 5)
VOCAB["V10"]  # 52 (score: 10+)
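The capping rule above can be sketched as a small helper. This is a minimal example that assumes scores are non-negative integers and relies on the documented ID range (V0 = 42 through V10 = 52); `score_to_token_id` is a hypothetical name, not a function provided by `game_grammar`.

```python
V_BASE = 42  # ID of V0, per the documented range 42-52
V_MAX = 10   # the highest value token is V10


def score_to_token_id(score: int) -> int:
    """Map a non-negative score to a value-token ID, capping at V10."""
    if score < 0:
        raise ValueError("score must be non-negative")
    return V_BASE + min(score, V_MAX)


print(score_to_token_id(0))   # 42
print(score_to_token_id(5))   # 47
print(score_to_token_id(37))  # 52
```

Note that the cap is lossy: any score above 10 encodes to the same token, so the exact score cannot be recovered from the token alone.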

Length Tokens (21 tokens)

Snake body length.

LEN1 through LEN20 (IDs: 53-72)
LEN_LONG - For lengths > 20 (ID: 73)

Example:

VOCAB["LEN1"]     # 53 (initial length)
VOCAB["LEN10"]    # 62
VOCAB["LEN_LONG"] # 73 (length > 20)
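Length tokens start at LEN1 rather than LEN0, so the offset arithmetic differs by one from the value tokens. The sketch below shows one way to handle that, along with the LEN_LONG overflow bucket; `length_to_token_id` is a hypothetical helper based only on the IDs listed above.

```python
LEN_BASE = 53       # ID of LEN1
LEN_LONG_ID = 73    # ID of LEN_LONG
MAX_EXACT_LEN = 20  # longest length with its own token (LEN20)


def length_to_token_id(length: int) -> int:
    """Map a snake length (>= 1) to a length-token ID."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if length > MAX_EXACT_LEN:
        return LEN_LONG_ID
    # LEN1 sits at LEN_BASE, so subtract 1 before adding the offset.
    return LEN_BASE + (length - 1)


print(length_to_token_id(1))   # 53
print(length_to_token_id(10))  # 62
print(length_to_token_id(25))  # 73
```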

Constants

VOCAB

VOCAB: dict[str, int]

Dictionary mapping token names (strings) to token IDs (integers). Usage:

from game_grammar.vocab import VOCAB

token_id = VOCAB["MOVE"]     # 35
token_id = VOCAB["X5"]       # 20
token_id = VOCAB["INPUT_R"]  # 14

ID_TO_TOKEN

ID_TO_TOKEN: dict[int, str]

Reverse mapping from token IDs (integers) to token names (strings). Usage:

from game_grammar.vocab import ID_TO_TOKEN

token_name = ID_TO_TOKEN[35]  # "MOVE"
token_name = ID_TO_TOKEN[20]  # "X5"
token_name = ID_TO_TOKEN[14]  # "INPUT_R"

VOCAB_SIZE

VOCAB_SIZE: int = 74

Total number of tokens in the vocabulary. Usage:

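from torch import nn  # assumes PyTorch

from game_grammar.vocab import VOCAB_SIZE

# Use for model configuration
embedding_dim = 64  # illustrative value; choose one that suits your model
embedding_layer = nn.Embedding(VOCAB_SIZE, embedding_dim)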

Complete Token List

All 74 tokens in order by ID:

# IDs 0-3: Structural
["BOS", "EOS", "TICK", "SNAP"]

# IDs 4-6: Entity
["PLAYER", "FOOD", "WALL"]

# IDs 7-10: Direction
["DIR_U", "DIR_D", "DIR_L", "DIR_R"]

# IDs 11-14: Input
["INPUT_U", "INPUT_D", "INPUT_L", "INPUT_R"]

# IDs 15-24: Position X
["X0", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9"]

# IDs 25-34: Position Y
["Y0", "Y1", "Y2", "Y3", "Y4", "Y5", "Y6", "Y7", "Y8", "Y9"]

# IDs 35-41: Event types
["MOVE", "EAT", "GROW", "DIE_WALL", "DIE_SELF", "FOOD_SPAWN", "SCORE"]

# IDs 42-52: Values
["V0", "V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"]

# IDs 53-73: Lengths
["LEN1", "LEN2", "LEN3", ..., "LEN20", "LEN_LONG"]
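To show how the lists above relate to `VOCAB` and `ID_TO_TOKEN`, here is a sketch that rebuilds equivalent mappings by concatenating the categories in ID order. It is a reconstruction from the tables on this page, not necessarily how `game_grammar.vocab` builds them; the lowercase names `vocab` and `id_to_token` are used to avoid implying they are the library's own objects.

```python
# Token names grouped by category, in ascending ID order.
CATEGORIES = [
    ["BOS", "EOS", "TICK", "SNAP"],
    ["PLAYER", "FOOD", "WALL"],
    ["DIR_U", "DIR_D", "DIR_L", "DIR_R"],
    ["INPUT_U", "INPUT_D", "INPUT_L", "INPUT_R"],
    [f"X{i}" for i in range(10)],
    [f"Y{i}" for i in range(10)],
    ["MOVE", "EAT", "GROW", "DIE_WALL", "DIE_SELF", "FOOD_SPAWN", "SCORE"],
    [f"V{i}" for i in range(11)],
    [f"LEN{i}" for i in range(1, 21)] + ["LEN_LONG"],
]

# Flatten the groups; each token's position in the flat list is its ID.
TOKENS = [name for group in CATEGORIES for name in group]
vocab = {name: token_id for token_id, name in enumerate(TOKENS)}
id_to_token = {token_id: name for name, token_id in vocab.items()}

print(len(vocab))                        # 74
print(vocab["MOVE"], vocab["LEN_LONG"])  # 35 73
print(id_to_token[20])                   # X5
```

Deriving IDs from list position keeps the two mappings consistent by construction, at the cost that inserting a token mid-list shifts every later ID.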

Examples

Encoding a position

from game_grammar.vocab import VOCAB

x, y = 3, 7
x_token = VOCAB[f"X{x}"]  # 18
y_token = VOCAB[f"Y{y}"]  # 32
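The reverse direction, recovering grid coordinates from a pair of position tokens, can be sketched with plain offset arithmetic. The helpers below are hypothetical and assume the documented ranges (X0 = 15, Y0 = 25) and the 10×10 grid; they add bounds checks that the lookup above does not perform.

```python
GRID_SIZE = 10
X_BASE = 15  # ID of X0
Y_BASE = 25  # ID of Y0


def encode_position(x: int, y: int) -> tuple[int, int]:
    """Return (x_token, y_token) for a cell on the 10x10 grid."""
    if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
        raise ValueError(f"position off grid: ({x}, {y})")
    return X_BASE + x, Y_BASE + y


def decode_position(x_token: int, y_token: int) -> tuple[int, int]:
    """Return (x, y) for a pair of position token IDs."""
    x, y = x_token - X_BASE, y_token - Y_BASE
    if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
        raise ValueError(f"not position tokens: ({x_token}, {y_token})")
    return x, y


print(encode_position(3, 7))    # (18, 32)
print(decode_position(18, 32))  # (3, 7)
```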

Decoding a token sequence

from game_grammar.vocab import ID_TO_TOKEN

tokens = [0, 2, 11, 35, 18, 32, 1]  # Token IDs
names = [ID_TO_TOKEN[t] for t in tokens]
# ["BOS", "TICK", "INPUT_U", "MOVE", "X3", "Y7", "EOS"]

Building event sequences

from game_grammar.vocab import VOCAB

# Encode: TICK INPUT_R MOVE X4 Y5
event_tokens = [
    VOCAB["TICK"],
    VOCAB["INPUT_R"],
    VOCAB["MOVE"],
    VOCAB["X4"],
    VOCAB["Y5"],
]
# [2, 14, 35, 19, 30]
