Measure replay throughput (ticks per second) with optional profiling and render telemetry.

Usage

crimson replay benchmark <replay_file> [OPTIONS]

Arguments

replay_file
Path
required
Replay file path (.crd). If a filename is given without a path, base-dir/replays/ is also searched.

Options

--runs
int
Number of measured benchmark runs. Default:
  • headless: 5
  • render: 1
Must be ≥ 1.
--warmup-runs
int
Warmup runs before measured timing. Default:
  • headless: 1
  • render: 0
Must be ≥ 0.
--mode
headless|render
Default: headless
Benchmark mode:
  • headless — Simulation only (no rendering)
  • render — Full rendering pipeline
--rtx
flag
Enable non-canonical RTX render mode. Render mode only.
--max-ticks
int
Stop after N ticks. Default: full replay
--trace-rng
flag
Enable replay RNG trace mode during simulation.
--profile
flag
Run one cProfile pass and include hotspot summary.
--profile-sort
cumtime|tottime
Default: cumtime
Hotspot sort key:
  • cumtime — Cumulative time (includes subcalls)
  • tottime — Total time (function only)
--top
int
Default: 20
Maximum hotspot rows to include. Must be ≥ 1.
--profile-out
Path
Optional cProfile .pstats output path. Used only with --profile.
--render-telemetry
flag
Collect per-tick render telemetry. Render mode only. Records frame timing, draw calls, and pass breakdown.
--render-telemetry-out
Path
Optional output path for the full render telemetry JSON. Render mode only.
--render-charts-out-dir
Path
Optional output directory for render telemetry SVG charts. Render mode only. Requires the altair package.
--format
human|json
Default: human
Output format:
  • human — Human-readable text
  • json — Machine-readable JSON
--json-out
Path
Optional JSON output path for benchmark payload.
--base-dir
Path
Base path for runtime files. Default: per-user OS data directory
--runtime-dir
Path
Alias for --base-dir.

Output Format

Human Format (default)

crimson replay benchmark run.crd
Output:
ok: mode=headless runs=5 warmup_runs=1 ticks=36000 wall_ms_p50=1842.345 tps_p50=19542.18 realtime_x_p50=325.70
wall_ms min=1821.234 p50=1842.345 mean=1848.123 p95=1876.543 max=1891.234 stdev=24.567
throughput_tps min=19012.34 p50=19542.18 mean=19478.92 p95=19876.54 max=19998.21 stdev=321.45 | realtime_x min=316.87 p50=325.70 mean=324.65 p95=331.28 max=333.30 stdev=5.36
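For scripting, --format json is the robust option, but the headline numbers can also be pulled from the human summary line mechanically. A minimal sketch, parsing the `ok:` line shown above into typed values:

```python
import re

# Example summary line as printed by the human format (from the output above).
line = (
    "ok: mode=headless runs=5 warmup_runs=1 ticks=36000 "
    "wall_ms_p50=1842.345 tps_p50=19542.18 realtime_x_p50=325.70"
)

# Collect key=value pairs; convert numeric values, keep the rest as strings.
fields = {}
for key, value in re.findall(r"(\w+)=(\S+)", line):
    try:
        fields[key] = float(value) if "." in value else int(value)
    except ValueError:
        fields[key] = value

print(fields["tps_p50"], fields["mode"])  # 19542.18 headless
```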

JSON Format

crimson replay benchmark run.crd --format json > bench.json
See the schema in _ReplayBenchmarkPayload (source: cli/replay.py:389).

Metrics

wall_ms
float
Wall-clock milliseconds per run.
ticks_per_second
float
Simulation throughput (ticks/sec). Higher is better. Target: ~60 tps = 1x realtime
realtime_x
float
Realtime multiplier.
  • 1.0x = realtime
  • 10.0x = 10x faster than realtime
  • 0.5x = 2x slower than realtime
All metrics include: min, p50, mean, p95, max, stdev.
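The two throughput metrics follow directly from wall time and tick count. A sketch of the relationship, assuming the fixed ~60 ticks-per-realtime-second rate stated above:

```python
# tps = ticks / wall-clock seconds; ~60 tps corresponds to 1x realtime.
SIM_TICK_RATE = 60.0  # assumed fixed simulation rate (ticks per realtime second)

def throughput_metrics(ticks: int, wall_ms: float) -> tuple[float, float]:
    """Return (ticks_per_second, realtime_x) for a single run."""
    tps = ticks / (wall_ms / 1000.0)
    return tps, tps / SIM_TICK_RATE

# Using the run from the example output: 36000 ticks in ~1842 ms.
tps, realtime_x = throughput_metrics(ticks=36000, wall_ms=1842.345)
print(f"{tps:.2f} tps, {realtime_x:.2f}x realtime")
```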

Examples

Basic Headless Benchmark

crimson replay benchmark run.crd
# 5 runs + 1 warmup

Single Run

crimson replay benchmark run.crd --runs 1 --warmup-runs 0

Render Mode Benchmark

crimson replay benchmark run.crd --mode render
# 1 run + 0 warmup (slower)

With Profiling

crimson replay benchmark run.crd --profile --profile-out profile.pstats
Output includes hotspot table:
profile: sort=cumtime source=cProfile top=20
hotspots:
  01 cum=1.234567s tot=0.123456s calls=36000/36000 src/crimson/sim/tick.py:45::update_tick
  02 cum=0.987654s tot=0.456789s calls=72000/72000 src/crimson/entities/creature.py:123::update
  ...

Render Telemetry

crimson replay benchmark run.crd \
  --mode render \
  --render-telemetry \
  --render-telemetry-out telemetry.json \
  --render-charts-out-dir charts/
Generates:
  • telemetry.json — Per-frame timing data
  • charts/frame_timing.svg — Frame timing chart
  • charts/draw_calls.svg — Draw call chart
  • charts/pass_timing_stacked.svg — Render pass breakdown
  • charts/report.md — Summary report

JSON Output

crimson replay benchmark run.crd --format json --json-out result.json

Short Replay Segment

crimson replay benchmark run.crd --max-ticks 6000
# Benchmark first 100 seconds only
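Picking a --max-ticks value is simple arithmetic at the ~60 ticks-per-realtime-second rate noted in the Metrics section (assumed fixed here):

```python
def seconds_to_ticks(seconds: float, tick_rate: float = 60.0) -> int:
    """Convert a desired replay duration in seconds to a --max-ticks value."""
    return int(seconds * tick_rate)

print(seconds_to_ticks(100))  # 6000, matching --max-ticks 6000 above
```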

Profiling

With --profile, runs one extra profiled pass using Python’s cProfile:
crimson replay benchmark run.crd --profile --top 10
Output shows top hotspots:
hotspots:
  01 cum=1.23s tot=0.12s calls=36000/36000 sim/tick.py:45::update_tick
  02 cum=0.98s tot=0.45s calls=72000/72000 entities/creature.py:123::update
  03 cum=0.76s tot=0.23s calls=36000/36000 weapons/handler.py:67::fire
  ...
Export full pstats:
crimson replay benchmark run.crd --profile --profile-out bench.pstats

# Analyze with pstats
python -m pstats bench.pstats
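The exported file is a standard cProfile stats dump, so it can also be inspected programmatically with the stdlib pstats module. A self-contained sketch (it profiles a stand-in workload; with --profile-out you would load the benchmark's bench.pstats instead):

```python
import cProfile
import pstats

# Stand-in workload so this sketch runs on its own; the benchmark's own
# --profile pass produces the same kind of .pstats file.
profiler = cProfile.Profile()
profiler.enable()
sum(i * i for i in range(100_000))
profiler.disable()
profiler.dump_stats("bench.pstats")

# Load and inspect, equivalent to `python -m pstats bench.pstats`.
stats = pstats.Stats("bench.pstats")
stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
```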

Render Telemetry

With --render-telemetry, collects per-frame metrics:
frame_ms
float
Total frame time (update + draw).
update_ms
float
Simulation update time.
draw_ms
float
Rendering time.
draw_calls_total
int
Total draw calls per frame.
draw_calls_by_api
dict
Draw calls grouped by raylib API.
draw_calls_by_pass
dict
Draw calls grouped by render pass.
pass_ms
dict
Time spent in each render pass.
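Per-frame records with these fields lend themselves to simple post-processing, e.g. spotting frame spikes. A sketch using hypothetical records shaped like the fields above (the actual --render-telemetry-out JSON layout may differ):

```python
import statistics

# Hypothetical per-frame records; field names mirror the metrics listed above.
frames = [
    {"frame_ms": 16.4, "update_ms": 4.1, "draw_ms": 12.3, "draw_calls_total": 310},
    {"frame_ms": 17.1, "update_ms": 4.3, "draw_ms": 12.8, "draw_calls_total": 295},
    {"frame_ms": 33.2, "update_ms": 4.2, "draw_ms": 29.0, "draw_calls_total": 512},
    {"frame_ms": 16.7, "update_ms": 4.0, "draw_ms": 12.7, "draw_calls_total": 301},
]

frame_times = [f["frame_ms"] for f in frames]
mean_ms = statistics.mean(frame_times)
worst_ms = max(frame_times)

# Flag frames well above the mean as candidate spikes.
spikes = [i for i, ms in enumerate(frame_times) if ms > 1.5 * mean_ms]
print(f"mean={mean_ms:.1f}ms worst={worst_ms:.1f}ms spikes_at={spikes}")
```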

Charts

With --render-charts-out-dir, generates SVG visualizations:
crimson replay benchmark run.crd \
  --mode render \
  --render-telemetry \
  --render-charts-out-dir ./charts
Outputs:
  • frame_timing.svg — Frame time over ticks
  • draw_calls.svg — Draw call count over ticks
  • pass_timing_stacked.svg — Stacked pass timing
  • report.md — Markdown summary

Performance Tips

Headless vs Render

Headless is dramatically faster, since render mode is capped by vsync:
# Fast: 20000 tps
crimson replay benchmark run.crd --mode headless

# Slow: 60-120 tps (limited by vsync)
crimson replay benchmark run.crd --mode render

Reducing Variance

For stable benchmarks:
  1. Close background apps
  2. Use --runs 10 for better statistics
  3. Use --warmup-runs 2 to warm caches
  4. Run on AC power (laptops)

Short Replays

For quick iteration:
crimson replay benchmark run.crd --max-ticks 3600  # 1 minute

Use Cases

Regression Testing

Detect performance regressions:
# Baseline
crimson replay benchmark test.crd --format json > baseline.json

# After changes
crimson replay benchmark test.crd --format json > current.json

# Compare
python scripts/compare_benchmarks.py baseline.json current.json
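A minimal sketch of the kind of comparison such a script might perform. The "throughput_tps"/"p50" keys are assumptions here; consult the actual _ReplayBenchmarkPayload schema for the real field names:

```python
# Flag a regression when current p50 throughput drops more than `threshold`
# below the baseline. Payload keys are hypothetical.
def regressed(baseline: dict, current: dict, threshold: float = 0.05) -> bool:
    base = baseline["throughput_tps"]["p50"]
    curr = current["throughput_tps"]["p50"]
    return curr < base * (1.0 - threshold)

baseline = {"throughput_tps": {"p50": 19542.18}}
current = {"throughput_tps": {"p50": 18100.00}}
print(regressed(baseline, current))  # True: more than 5% slower than baseline
```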

Profiling Bottlenecks

crimson replay benchmark slow.crd --profile --top 30

Render Optimization

crimson replay benchmark complex.crd \
  --mode render \
  --render-telemetry \
  --render-charts-out-dir optimization/
Analyze charts to identify:
  • Frame spikes
  • Draw call bottlenecks
  • Expensive render passes

CI Performance Tracking

# .github/workflows/benchmark.yml
- name: Benchmark replays
  run: |
    crimson replay benchmark tests/bench.crd \
      --format json \
      --json-out bench-${{ github.sha }}.json
