The machine supports multiple execution engines that can run in parallel within a single instruction bundle. Each engine has specific slot limits defined in SLOT_LIMITS.
Slot Limits
Each engine can execute a limited number of operations per cycle:
| Engine | Slots per Cycle |
|---|---|
| alu | 12 |
| valu | 6 |
| load | 2 |
| store | 2 |
| flow | 1 |
| debug | 64 |

From problem.py:48-55: the SLOT_LIMITS dictionary defines the maximum parallel operations per engine.
ALU Engine
The Arithmetic Logic Unit performs scalar operations on 32-bit words.
Operations
| Operation | Format | Description |
|---|---|---|
| + | ("+", dest, a, b) | Addition: dest = a + b |
| - | ("-", dest, a, b) | Subtraction: dest = a - b |
| * | ("*", dest, a, b) | Multiplication: dest = a * b |
| // | ("//", dest, a, b) | Integer division: dest = a // b |
| cdiv | ("cdiv", dest, a, b) | Ceiling division: dest = (a + b - 1) // b |
| ^ | ("^", dest, a, b) | Bitwise XOR: dest = a ^ b |
| & | ("&", dest, a, b) | Bitwise AND: dest = a & b |
| \| | ("\|", dest, a, b) | Bitwise OR: dest = a \| b |
| << | ("<<", dest, a, b) | Left shift: dest = a << b |
| >> | (">>", dest, a, b) | Right shift: dest = a >> b |
| % | ("%", dest, a, b) | Modulo: dest = a % b |
| < | ("<", dest, a, b) | Less than: dest = 1 if a < b else 0 |
| == | ("==", dest, a, b) | Equality: dest = 1 if a == b else 0 |

All ALU operations wrap results modulo 2^32. See problem.py:219-252.
Example
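The operation table and the modulo 2^32 wrap can be sketched as a standalone Python function. This is an illustration, not the code in problem.py: the names `ALU_OPS`, `WORD_MODULUS`, and `alu_op` are hypothetical, and the function takes operand values directly rather than resolving them from machine state.

```python
import operator

# Hypothetical table mirroring the documented ALU operations.
ALU_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "//": operator.floordiv,
    "cdiv": lambda a, b: (a + b - 1) // b,
    "^": operator.xor,
    "&": operator.and_,
    "|": operator.or_,
    "<<": operator.lshift,
    ">>": operator.rshift,
    "%": operator.mod,
    "<": lambda a, b: int(a < b),
    "==": lambda a, b: int(a == b),
}

WORD_MODULUS = 2**32  # results are kept within 32-bit words


def alu_op(op, a, b):
    """Apply one ALU operation to two operand values and wrap modulo 2^32."""
    return ALU_OPS[op](a, b) % WORD_MODULUS


print(alu_op("+", 2**32 - 1, 1))  # 0 (wraps around)
print(alu_op("-", 0, 1))          # 4294967295
print(alu_op("cdiv", 7, 2))       # 4
```

The wrap matters mostly for `+`, `-`, `*`, and `<<`, where the unwrapped Python integer can fall outside the 32-bit range.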
VALU Engine
The Vector ALU performs SIMD operations on vectors of VLEN=8 elements.
Operations
| Operation | Format | Description |
|---|---|---|
| vbroadcast | ("vbroadcast", dest, src) | Broadcast scalar to vector: dest[i] = src for all i |
| multiply_add | ("multiply_add", dest, a, b, c) | Fused multiply-add: dest[i] = (a[i] * b[i]) + c[i] |
| Vector ops | (op, dest, a, b) | Apply ALU op element-wise: dest[i] = a[i] op b[i] |

Vector operations apply the same ALU operations element-wise across VLEN=8 contiguous scratch addresses. See problem.py:254-267.
Example
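A minimal sketch of the vector semantics, modelling scratch space as a plain Python list. It assumes that `dest`, `a`, `b`, and `c` are base scratch addresses of VLEN contiguous words, that `src` in `vbroadcast` is the scratch address of a scalar, and that `multiply_add` wraps modulo 2^32 like the scalar ALU. The helper `vector_op` is hypothetical; this is not the implementation in problem.py.

```python
VLEN = 8
WORD_MODULUS = 2**32  # assumed: vector results wrap like scalar ALU results


def vbroadcast(scratch, dest, src):
    """Copy the scalar at scratch[src] into VLEN words starting at dest."""
    value = scratch[src]
    for i in range(VLEN):
        scratch[dest + i] = value


def multiply_add(scratch, dest, a, b, c):
    """dest[i] = a[i] * b[i] + c[i]; all inputs are read before any write."""
    results = [
        (scratch[a + i] * scratch[b + i] + scratch[c + i]) % WORD_MODULUS
        for i in range(VLEN)
    ]
    scratch[dest:dest + VLEN] = results


def vector_op(scratch, fn, dest, a, b):
    """Apply a two-operand function element-wise across VLEN words."""
    results = [fn(scratch[a + i], scratch[b + i]) % WORD_MODULUS for i in range(VLEN)]
    scratch[dest:dest + VLEN] = results


scratch = [0] * 48
scratch[0] = 3                      # scalar to broadcast
vbroadcast(scratch, 8, 0)           # scratch[8:16] = [3] * 8
scratch[16:24] = range(8)           # [0, 1, ..., 7]
scratch[24:32] = [1] * VLEN
multiply_add(scratch, 32, 8, 16, 24)
print(scratch[32:40])               # [1, 4, 7, 10, 13, 16, 19, 22]
vector_op(scratch, lambda x, y: x ^ y, 40, 8, 16)
print(scratch[40:48])               # [3, 2, 1, 0, 7, 6, 5, 4]
```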
LOAD Engine
Loads data from main memory into scratch space.
Operations
| Operation | Format | Description |
|---|---|---|
| load | ("load", dest, addr) | Load single word: dest = mem[scratch[addr]] |
| load_offset | ("load_offset", dest, addr, offset) | Load with offset: dest+offset = mem[scratch[addr+offset]] |
| vload | ("vload", dest, addr) | Vector load: Load 8 words from mem[scratch[addr]:scratch[addr]+8] |
| const | ("const", dest, val) | Load immediate: dest = val |

The addr parameter is always a scratch address (indirect). The actual memory address is read from scratch. See problem.py:269-286.
Example
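The indirect addressing described above can be sketched with two Python lists standing in for main memory and scratch space. The sketch assumes that `dest` is a scratch address in every operation and that `load_offset` writes to scratch address `dest + offset`; the function signatures are illustrative, not those of problem.py.

```python
VLEN = 8


def const(scratch, dest, val):
    """Load an immediate value into scratch."""
    scratch[dest] = val


def load(scratch, mem, dest, addr):
    """The memory address is read from scratch[addr], not taken from addr itself."""
    scratch[dest] = mem[scratch[addr]]


def load_offset(scratch, mem, dest, addr, offset):
    """Same as load, with offset applied to both scratch addresses."""
    scratch[dest + offset] = mem[scratch[addr + offset]]


def vload(scratch, mem, dest, addr):
    """Load VLEN consecutive words starting at memory address scratch[addr]."""
    base = scratch[addr]
    scratch[dest:dest + VLEN] = mem[base:base + VLEN]


mem = list(range(100, 132))   # mem[i] == 100 + i
scratch = [0] * 32

const(scratch, 0, 5)          # scratch[0] holds memory address 5
load(scratch, mem, 1, 0)
print(scratch[1])             # 105

const(scratch, 3, 20)         # scratch[3] holds memory address 20
load_offset(scratch, mem, 16, 0, 3)
print(scratch[19])            # 120

vload(scratch, mem, 8, 0)
print(scratch[8:16])          # [105, 106, 107, 108, 109, 110, 111, 112]
```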
STORE Engine
Stores data from scratch space to main memory.
Operations
| Operation | Format | Description |
|---|---|---|
| store | ("store", addr, src) | Store single word: mem[scratch[addr]] = scratch[src] |
| vstore | ("vstore", addr, src) | Vector store: Store 8 words from scratch to mem[scratch[addr]:scratch[addr]+8] |

Store operations write to memory at the end of the cycle after all reads complete. See problem.py:288-298.
Example
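One way to picture the end-of-cycle write is a two-phase model: first collect every store's target address and value, then apply them once all reads for the cycle are done. The sketch below assumes that model; `collect_store_writes` and `commit` are hypothetical names, and the simulator's actual buffering may differ.

```python
VLEN = 8


def collect_store_writes(scratch, slots):
    """Resolve one cycle's store slots into {memory address: value} without writing."""
    pending = {}
    for op, addr, src in slots:
        base = scratch[addr]  # the memory address is read from scratch
        if op == "store":
            pending[base] = scratch[src]
        elif op == "vstore":
            for i in range(VLEN):
                pending[base + i] = scratch[src + i]
        else:
            raise ValueError(f"unknown store op: {op}")
    return pending


def commit(mem, pending):
    """Apply the collected writes at the end of the cycle."""
    for address, value in pending.items():
        mem[address] = value


mem = [0] * 32
scratch = [0] * 32
scratch[0] = 4                     # target address for the scalar store
scratch[1] = 99                    # value to store
scratch[2] = 16                    # target base address for the vector store
scratch[8:16] = range(1, 9)        # vector data

pending = collect_store_writes(scratch, [("store", 0, 1), ("vstore", 2, 8)])
print(mem[4])                      # 0 (memory is unchanged until commit)
commit(mem, pending)
print(mem[4], mem[16:24])          # 99 [1, 2, 3, 4, 5, 6, 7, 8]
```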
FLOW Engine
Controls program flow, conditional operations, and core state.
Operations
| Operation | Format | Description |
|---|---|---|
| select | ("select", dest, cond, a, b) | Conditional: dest = a if cond != 0 else b |
| add_imm | ("add_imm", dest, a, imm) | Add immediate: dest = a + imm |
| vselect | ("vselect", dest, cond, a, b) | Vector select: dest[i] = a[i] if cond[i] != 0 else b[i] |
| halt | ("halt",) | Stop core execution |
| pause | ("pause",) | Pause core (for debugging) |
| trace_write | ("trace_write", val) | Write value to trace buffer |
| jump | ("jump", addr) | Unconditional jump: pc = addr |
| jump_indirect | ("jump_indirect", addr) | Indirect jump: pc = scratch[addr] |
| cond_jump | ("cond_jump", cond, addr) | Conditional jump: pc = addr if scratch[cond] != 0 |
| cond_jump_rel | ("cond_jump_rel", cond, offset) | Relative conditional jump: pc += offset if scratch[cond] != 0 |
| coreid | ("coreid", dest) | Get core ID: dest = core.id |

The flow engine has only 1 slot, so only one flow operation can execute per cycle. Jump instructions take effect immediately. See problem.py:300-335.
Example
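A minimal sketch of a subset of the flow operations, written as one function that takes the current program counter and returns the updated one. It assumes that `dest`, `cond`, `a`, and `b` are scratch addresses, that `imm` is a literal, and that `add_imm` wraps modulo 2^32. Whether `pc` has already advanced past the current bundle when a relative offset is applied is not stated here, so the sketch simply updates whatever `pc` it is given. `flow_step` is a hypothetical name.

```python
WORD_MODULUS = 2**32  # assumed: add_imm wraps like ALU results


def flow_step(scratch, pc, slot):
    """Apply one flow slot and return the (possibly updated) program counter."""
    op, *args = slot
    if op == "select":
        dest, cond, a, b = args
        scratch[dest] = scratch[a] if scratch[cond] != 0 else scratch[b]
    elif op == "add_imm":
        dest, a, imm = args
        scratch[dest] = (scratch[a] + imm) % WORD_MODULUS
    elif op == "jump":
        (addr,) = args
        pc = addr
    elif op == "jump_indirect":
        (addr,) = args
        pc = scratch[addr]
    elif op == "cond_jump":
        cond, addr = args
        if scratch[cond] != 0:
            pc = addr
    elif op == "cond_jump_rel":
        cond, offset = args
        if scratch[cond] != 0:
            pc += offset
    else:
        raise ValueError(f"flow op not covered by this sketch: {op}")
    return pc


scratch = [0] * 8
scratch[0] = 1    # condition (non-zero means taken)
scratch[1] = 10
scratch[2] = 20

flow_step(scratch, 5, ("select", 3, 0, 1, 2))
print(scratch[3])                                        # 10
flow_step(scratch, 5, ("add_imm", 4, 1, 7))
print(scratch[4])                                        # 17
print(flow_step(scratch, 5, ("cond_jump", 0, 2)))        # 2
print(flow_step(scratch, 5, ("cond_jump_rel", 0, -3)))   # 2
print(flow_step(scratch, 5, ("cond_jump", 5, 2)))        # 5 (scratch[5] is 0)
```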
DEBUG Engine
Debugging and assertion operations (not counted as cycles).
Operations
| Operation | Format | Description |
|---|---|---|
| compare | ("compare", loc, key) | Assert scratch[loc] == value_trace[key] |
| vcompare | ("vcompare", loc, keys) | Assert vector matches expected values |
| comment | Any other format | Ignored (for documentation) |

Debug instructions don’t consume cycles and can be disabled with enable_debug=False. See problem.py:366-382.
Example
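A rough sketch of how the debug slots could be checked, assuming `value_trace` is a dictionary of expected values and that `vcompare` compares consecutive scratch words starting at `loc` against one expected value per key. The function `run_debug`, its return value, and the string keys are hypothetical and exist only for illustration.

```python
def run_debug(scratch, value_trace, slots, enable_debug=True):
    """Check debug slots against expected values; return how many were checked."""
    if not enable_debug:
        return 0  # debug slots are skipped entirely when disabled
    checked = 0
    for slot in slots:
        if slot[0] == "compare":
            _, loc, key = slot
            assert scratch[loc] == value_trace[key], f"mismatch at {loc} for {key}"
            checked += 1
        elif slot[0] == "vcompare":
            _, loc, keys = slot
            expected = [value_trace[key] for key in keys]
            actual = scratch[loc:loc + len(keys)]
            assert actual == expected, f"vector mismatch at {loc}"
            checked += 1
        # Any other format is treated as a comment and ignored.
    return checked


scratch = [7, 1, 2, 3]
value_trace = {"x": 7, "v0": 1, "v1": 2, "v2": 3}
slots = [
    ("compare", 0, "x"),
    ("vcompare", 1, ["v0", "v1", "v2"]),
    ("comment", "checkpoint after first round"),
]
print(run_debug(scratch, value_trace, slots))                      # 2
print(run_debug(scratch, value_trace, slots, enable_debug=False))  # 0
```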
Engine Execution Model
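Because all engines in a bundle run in the same cycle, a bundle only fits if no engine is given more operations than it has slots. The check can be sketched as follows, assuming a bundle is represented as a dictionary mapping engine names to lists of slot tuples (an assumed shape for illustration) and using the slot limits listed earlier. `validate_bundle` is a hypothetical helper.

```python
# Slot limits as listed in the Slot Limits table.
SLOT_LIMITS = {"alu": 12, "valu": 6, "load": 2, "store": 2, "flow": 1, "debug": 64}


def validate_bundle(bundle):
    """Return a list of problems; an empty list means the bundle fits the limits."""
    problems = []
    for engine, slots in bundle.items():
        limit = SLOT_LIMITS.get(engine)
        if limit is None:
            problems.append(f"unknown engine: {engine}")
        elif len(slots) > limit:
            problems.append(f"{engine}: {len(slots)} slots exceeds limit of {limit}")
    return problems


ok_bundle = {
    "alu": [("+", 0, 1, 2), ("*", 3, 1, 2)],
    "load": [("load", 4, 5)],
    "flow": [("halt",)],
}
too_many_loads = {"load": [("load", 4, 5)] * 3}

print(validate_bundle(ok_bundle))        # []
print(validate_bundle(too_many_loads))   # ['load: 3 slots exceeds limit of 2']
```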
Related
Instruction Format
Learn how to construct instruction bundles
Architecture Overview
Understand the VLIW SIMD architecture