This section documents the complete reverse engineering process used to achieve behavioral parity between the original Crimsonland 1.9.93 binary and the Python reimplementation.

Approach

The reverse engineering effort combines three complementary methodologies:
  1. Static Analysis - Ghidra-based decompilation and symbol recovery
  2. Runtime Validation - Frida and WinDbg instrumentation for ground truth
  3. Differential Testing - Automated parity verification between original and rewrite

Project Goals

The aim is behavioral parity: timings, RNG sequences, float32 math, UI layout quirks, asset decoding, and gameplay rules must match the original as closely as practical. This includes:
  • Deterministic simulation with float32 precision contracts
  • Exact asset format decoding (PAQ archives, JAZ textures)
  • Pixel-perfect UI rendering and layout
  • Matching weapon damage, projectile physics, and creature AI
  • Compatible save file and config formats

Target Binary

crimsonland.exe

Build: 2011-02-01 07:13:37 UTC
Compiler: Visual Studio 2003 (VC++ 7.1 SP1)
Base: 0x00400000 (fixed)
Size: ~500KB code + 378KB BSS

grim.dll

Build: 2011-02-01 07:25:24 UTC
Compiler: Visual Studio 2003 (VC++ 7.1 SP1)
Base: 0x10000000 (relocatable)
Export: GRIM__GetInterface (Grim2D vtable)

Key Challenges

No Debug Symbols

  • PDB stripped, no RTTI
  • All function and variable names must be inferred from usage
  • Class structures reconstructed from vtable analysis

Float32 Precision

Deterministic gameplay requires matching the original’s float32 math exactly, which depends on:
  • x87 FPU instruction sequences
  • Rounding behavior in damage calculations
  • Accumulation order in physics updates
See Float Parity Policy for implementation details.
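As a rough illustration of why accumulation order matters, the sketch below rounds to float32 after every addition using only the standard library. The helper names f32 and accumulate_f32 are hypothetical, and rounding after each operation is an assumption made for this example: it does not model x87 extended-precision intermediates, and the project's actual rules are the ones in the Float Parity Policy.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate_f32(values):
    """Sum left to right, rounding to float32 after every addition.

    Assumption for this sketch: each intermediate result is stored as
    float32, so the rounding happens once per step.
    """
    total = f32(0.0)
    for v in values:
        total = f32(total + f32(v))
    return total


# Same three terms, two different orders.
forward = accumulate_f32([1.0e8, 1.0, -1.0e8])    # the 1.0 is lost when added to 1e8
reordered = accumulate_f32([1.0e8, -1.0e8, 1.0])  # the 1.0 survives

print(forward, reordered)
```

The two sums differ even though the inputs are identical, which is why a rewrite that reorders physics updates can drift from the original within a few ticks.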

Custom Asset Formats

Proprietary formats with no public documentation:
  • PAQ: Simple concatenated archive format
  • JAZ: JPEG + RLE alpha texture compression
  • Binary config and save files with obfuscation

Analysis Artifacts

All reverse engineering artifacts live in analysis/:
analysis/
├── ghidra/
│   ├── maps/              # Source of truth: name_map.json, data_map.json
│   ├── raw/               # Decompiled C output from Ghidra
│   └── derived/hotspots/  # Focused function extractions
├── frida/
│   ├── raw/              # Runtime capture logs (JSONL)
│   └── facts.jsonl       # Normalized evidence
└── windbg/
    └── sessions/         # Debugger capture notes
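A minimal sketch of how a JSONL evidence file such as analysis/frida/facts.jsonl could be loaded and grouped for inspection. The record fields used here (kind, tick, value) are hypothetical placeholders; the real schema is not described on this page.

```python
import json
from collections import defaultdict


def parse_jsonl(lines):
    """Yield one dict per non-empty line, reporting malformed lines by number."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: {exc}") from exc


def group_by(records, key):
    """Group records by the value of one field (None if the field is missing)."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get(key)].append(record)
    return dict(groups)


# Hypothetical sample records; with a real file, pass an open file object
# to parse_jsonl instead of this list.
sample = [
    '{"kind": "rng_seed", "tick": 0, "value": 12345}',
    '',
    '{"kind": "damage", "tick": 3, "value": 17.5}',
    '{"kind": "damage", "tick": 4, "value": 9.0}',
]
facts = group_by(parse_jsonl(sample), "kind")
print(sorted(facts))
```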

Workflow Summary

  1. Static Decompilation - Use Ghidra to decompile binaries and identify function boundaries, data structures, and call graphs.
  2. Symbol Recovery - Manually name functions and globals based on usage patterns. Store names in analysis/ghidra/maps/name_map.json.
  3. Runtime Capture - Use Frida to hook key functions and capture actual runtime values (RNG seeds, damage calculations, state snapshots).
  4. Struct Mapping - Cross-reference static analysis with runtime traces to recover complete struct layouts (player, creature, projectile pools).
  5. Differential Testing - Run original and rewrite side-by-side with identical inputs, and compare state at each tick to verify parity.
  6. Iteration - Fix divergences, capture new evidence, update maps, regenerate decompiles.
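
The differential testing step can be sketched as a search for the first tick at which two state traces disagree. This is an illustrative sketch, not the project's harness: the trace shape (one dict per tick) and the field names rng and player_x are assumptions, and floats are compared by their float32 bit patterns so that even a one-ulp difference counts as a divergence.

```python
import struct
from itertools import zip_longest


def f32_bits(x: float) -> int:
    """Return the raw 32-bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def diff_state(expected: dict, actual: dict):
    """List (field, expected, actual) for every field that differs."""
    diffs = []
    for key in sorted(set(expected) | set(actual)):
        e, a = expected.get(key), actual.get(key)
        if isinstance(e, float) and isinstance(a, float):
            same = f32_bits(e) == f32_bits(a)  # bit-exact, so 0.0 != -0.0
        else:
            same = e == a
        if not same:
            diffs.append((key, e, a))
    return diffs


def first_divergence(original_trace, rewrite_trace):
    """Return (tick, diffs) for the first mismatching tick, or None."""
    pairs = zip_longest(original_trace, rewrite_trace)
    for tick, (e, a) in enumerate(pairs):
        if e is None or a is None:
            return tick, [("<trace length>", e, a)]
        diffs = diff_state(e, a)
        if diffs:
            return tick, diffs
    return None


# Hypothetical per-tick snapshots from the original and the rewrite.
original = [{"rng": 1, "player_x": 10.0}, {"rng": 2, "player_x": 10.5},
            {"rng": 3, "player_x": 11.0}]
rewrite = [{"rng": 1, "player_x": 10.0}, {"rng": 2, "player_x": 10.5},
           {"rng": 3, "player_x": 11.25}]

result = first_divergence(original, rewrite)
print(result)
```

Reporting only the first divergent tick keeps the output small, since later ticks usually differ as a consequence of the first mismatch rather than as independent bugs.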

Documentation Structure

  • Methodology - Step-by-step reverse engineering process
  • Static Analysis - Ghidra decompilation and symbol recovery
  • Runtime Tools - Frida and WinDbg instrumentation
  • File Formats - PAQ, JAZ, config, and save formats
  • Data Structures - Recovered struct layouts and pools

The reverse engineering process prioritized measured parity over “looks right”. Every claim is backed by runtime captures, decompiled code, or differential test results.
