AI/ML Research Project

Learning game rules from event sequences

A causal transformer trained on gameplay traces learns the grammar of video games (physics, rules, and player behaviors) through next-token prediction on event streams. Built in pure Python with zero dependencies.


Quick Start

Get up and running with Game Grammar in three steps

1. Generate gameplay episodes

Run the agent mix to produce tokenized gameplay traces from Random, Greedy, and WallFollower agents.
python scripts/generate.py
This creates episodes.json with 200 tokenized gameplay sequences.
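Once generated, the episodes can be loaded for inspection. The sketch below assumes `episodes.json` is a JSON list of episodes, each a list of token IDs; the actual schema produced by `scripts/generate.py` may differ.

```python
import json

def load_episodes(path="episodes.json"):
    """Load tokenized gameplay traces produced by scripts/generate.py.

    Assumes the file is a JSON list of episodes, each a list of token IDs;
    the real schema may differ.
    """
    with open(path) as f:
        return json.load(f)

# episodes = load_episodes()
# print(len(episodes), "episodes")  # expect 200 with the default settings
```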
2. Train the transformer

Train a 2-layer, 32-dim, 4-head causal transformer on the generated episodes.
python scripts/train.py
The model learns game grammar through next-token prediction. Training output shows loss decreasing from ~4.47 to ~0.25 over 5000 steps.
The model uses a custom autograd implementation in pure Python — no PyTorch, TensorFlow, or external dependencies.
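The project's autograd engine is not reproduced here, but the core idea behind a pure-Python reverse-mode autograd can be sketched with a minimal scalar `Value` node (names and structure are illustrative, not the project's actual implementation):

```python
class Value:
    """Minimal scalar autograd node: tracks data, gradient, and parents."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaves have no backward step

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Topologically sort the graph, then accumulate gradients in reverse.
        order, seen = [], set()
        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()
```

Each operation records how to push gradients back to its inputs; calling `backward()` on the loss replays those closures in reverse topological order, which is all a transformer's training loop needs from the engine.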
3. Sample and validate

Generate novel gameplay sequences and validate them against physical and rule constraints.
python scripts/sample.py
The validation system checks three tiers: structural validity (BOS/EOS markers), physical validity (adjacent moves, in-bounds positions), and rule validity (EAT→GROW+FOOD_SPAWN, DIE→EOS).
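The structural and rule tiers can be sketched directly from the constraints above; the physical tier is omitted here because it requires decoding positions. Token names (`BOS`, `EAT`, `GROW`, etc.) are illustrative stand-ins for entries in the real vocabulary:

```python
def validate_structure(tokens):
    """Tier 1: sequence must start with BOS and end with EOS."""
    return len(tokens) >= 2 and tokens[0] == "BOS" and tokens[-1] == "EOS"

def validate_rules(tokens):
    """Tier 3: EAT must be followed by GROW then FOOD_SPAWN,
    and DIE must be immediately followed by EOS."""
    for i, tok in enumerate(tokens):
        if tok == "EAT" and tokens[i + 1:i + 3] != ["GROW", "FOOD_SPAWN"]:
            return False
        if tok == "DIE" and (i + 1 >= len(tokens) or tokens[i + 1] != "EOS"):
            return False
    return True
```

A generated sequence passes only if every tier accepts it, so validity rates give a direct measure of how much of the game's grammar the model has internalized.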

Core Concepts

Understand the theoretical foundation and architecture

Wittgensteinian Theory

Events define meaning through use. Collision-defined semantics and grammar as the structure of what can follow what.

Event Streams

Game-agnostic interface protocol with salience levels, entity identity, and tick bundling.
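One way to picture that interface is an event record carrying a tick, an entity identity, and a salience level, with a helper that bundles events per tick. Field and class names here are illustrative, not the project's exact protocol:

```python
from dataclasses import dataclass
from enum import IntEnum

class Salience(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2

@dataclass(frozen=True)
class Event:
    """Game-agnostic event record: what happened, to which entity, when."""
    tick: int
    kind: str                              # e.g. "MOVE", "EAT", "DIE"
    entity: int                            # stable identity across ticks
    salience: Salience = Salience.MEDIUM

def bundle_by_tick(events):
    """Group an event stream into per-tick bundles (tick bundling)."""
    bundles = {}
    for ev in events:
        bundles.setdefault(ev.tick, []).append(ev)
    return [bundles[t] for t in sorted(bundles)]
```

Because the interface speaks only in events, any game that can emit such records plugs into the same tokenization and training pipeline.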

Tokenization

Hybrid snapshot+delta encoding using a 74-token vocabulary for Snake. Five approaches analyzed.
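The snapshot+delta idea can be sketched for a single moving entity: emit a full position token periodically and a compact direction-delta token otherwise. Token names and the snapshot interval below are illustrative; the real Snake codec defines its own 74-token vocabulary:

```python
def encode_hybrid(positions, snapshot_every=16):
    """Hybrid snapshot+delta encoding sketch for a sequence of (x, y) positions."""
    deltas = {(1, 0): "D_RIGHT", (-1, 0): "D_LEFT",
              (0, 1): "D_UP", (0, -1): "D_DOWN"}
    tokens, prev = [], None
    for i, (x, y) in enumerate(positions):
        if i % snapshot_every == 0 or prev is None:
            tokens.append(f"POS_{x}_{y}")        # full snapshot token
        else:
            dx, dy = x - prev[0], y - prev[1]
            tokens.append(deltas[(dx, dy)])      # one-step delta token
        prev = (x, y)
    return tokens
```

Deltas keep sequences short, while periodic snapshots let a decoder (or the model) re-anchor absolute state without replaying the whole history.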

Transformer Architecture

31K parameters, 2 layers, 64-token context window. Custom autograd in pure Python.
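The stated size can be sanity-checked with a rough parameter count. This sketch assumes a GPT-style block with a 4x MLP, LayerNorms, and biases; the exact total depends on those choices and on whether the output head shares weights with the token embedding:

```python
def count_params(vocab=74, d_model=32, n_layers=2, n_ctx=64, tied_head=True):
    """Rough parameter count for a 2-layer, 32-dim, 74-vocab, 64-context
    GPT-style model. Assumes biases and a 4x MLP; details may differ."""
    d_mlp = 4 * d_model
    emb = vocab * d_model + n_ctx * d_model          # token + position embeddings
    attn = 3 * d_model * d_model + 3 * d_model       # QKV projections (+bias)
    attn += d_model * d_model + d_model              # output projection (+bias)
    mlp = d_model * d_mlp + d_mlp + d_mlp * d_model + d_model
    ln = 2 * (2 * d_model)                           # two LayerNorms per block
    head = 0 if tied_head else d_model * vocab + vocab
    return emb + n_layers * (attn + mlp + ln) + 2 * d_model + head
```

Under these assumptions the total lands around 30-32K depending on weight tying, consistent with the quoted ~31K.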

Key Features

What makes Game Grammar unique

Zero Dependencies

Built entirely in pure Python with custom autograd. No PyTorch, TensorFlow, or external frameworks required.

Game-Agnostic

Event stream abstraction works across any game. Tokenization layer handles game-specific encoding.

Three-Tier Validation

Validates structural correctness, physical plausibility, and rule consistency of generated sequences.

Learned Archetypes

Player behaviors emerge as statistical regularities without explicit labels or supervision.

Explore the API

Deep dive into the implementation

GameGPT Model

Causal transformer with custom autograd

EventCodec

Hybrid tokenization encoder/decoder

Snake Game

Event-stream Snake implementation

Agent Types

Random, Greedy, WallFollower agents

Data Pipeline

Episode collection and tokenization

Validation

Three-tier validity checking

Ready to explore?

Start training your own transformer or dive into the theoretical foundation