Prerequisites
Before you begin, make sure you have the following installed on your system:
Python environment
You’ll need Python 3.7 or later with virtual environment support. This is used for running the cocotb testbenches.
Verilog simulator
Icarus Verilog is required to simulate the SystemVerilog hardware designs.
Waveform viewer
GTKWave is used to visualize signal waveforms from your simulations.
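If you want a quick way to confirm the three prerequisites before continuing, a small script along these lines may help. It is only a sketch: the executable names `iverilog` and `gtkwave` are assumptions about how the tools are installed on your system, and `check_prerequisites` is a hypothetical helper, not part of the Tiny TPU repository.

```python
import importlib.util
import shutil
import sys

# Assumed executable names: `iverilog` for Icarus Verilog, `gtkwave` for GTKWave.
REQUIRED_TOOLS = ("iverilog", "gtkwave")


def check_prerequisites(min_python=(3, 7), tools=REQUIRED_TOOLS):
    """Return a dict mapping each prerequisite to True (found) or False (missing)."""
    status = {
        "python>=%d.%d" % min_python: sys.version_info >= min_python,
        "venv module": importlib.util.find_spec("venv") is not None,
    }
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```

Anything reported as missing should be installed before you move on to the Installation steps.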
Installation
- macOS
- Ubuntu/Linux
Build GTKWave
Download and build GTKWave from the official repository.
Clone the repository
Get the Tiny TPU source code.
Run your first simulation
Let’s run a complete TPU simulation that performs forward and backward passes through a neural network.
Set environment variables
The Makefile sets the required environment variables automatically, so you don’t need to set them manually.
Run the TPU test
Execute the complete TPU simulation with make test_tpu. This command:
- Compiles all SystemVerilog source files from src/
- Runs the testbench from test/test_tpu.py
- Generates a waveform file tpu.vcd in the waveforms/ directory
- Verifies that all tests pass
The test performs a complete XOR neural network training cycle with forward pass (matrix multiplication, bias addition, activation) and backward pass (gradient computation).
View the waveforms
Open the generated waveform file, waveforms/tpu.vcd, in GTKWave. You’ll see all the signals flowing through the TPU, including:
- Input data moving horizontally through the systolic array
- Partial sums flowing vertically
- VPU activation functions processing data
- Results being written back to the unified buffer
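To make the first two items easier to recognize in the waveforms, here is a rough software model of that dataflow. It assumes a weight-stationary array in which each processing element holds one preloaded weight and each input row is delayed by one cycle relative to the row above; the actual RTL may register, skew, or load weights differently. `systolic_matmul` is a hypothetical name used only for this sketch.

```python
def systolic_matmul(x, w):
    """Compute x @ w by stepping a weight-stationary PE grid one clock cycle at a time.

    x is M x K (one row per sample), w is K x N. PE (r, c) holds w[r][c].
    """
    m_rows, k, n = len(x), len(w), len(w[0])
    a_reg = [[0] * n for _ in range(k)]  # input registers, shifted left to right
    p_reg = [[0] * n for _ in range(k)]  # partial-sum registers, shifted top to bottom
    y = [[0] * n for _ in range(m_rows)]

    for t in range(m_rows + k + n - 2):
        new_a = [[0] * n for _ in range(k)]
        new_p = [[0] * n for _ in range(k)]
        for r in range(k):
            for c in range(n):
                if c == 0:
                    # Row r is fed feature r of sample (t - r): a skew of r cycles.
                    s = t - r
                    a_in = x[s][r] if 0 <= s < m_rows else 0
                else:
                    a_in = a_reg[r][c - 1]  # input arrives from the PE on the left
                p_in = p_reg[r - 1][c] if r > 0 else 0  # partial sum arrives from above
                new_a[r][c] = a_in
                new_p[r][c] = p_in + a_in * w[r][c]
        a_reg, p_reg = new_a, new_p
        # The bottom PE of column c finishes sample s at cycle s + (k - 1) + c.
        for c in range(n):
            s = t - (k - 1) - c
            if 0 <= s < m_rows:
                y[s][c] = p_reg[k - 1][c]
    return y


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Because of the skew, results leave the bottom of the array staggered by one cycle per column, which is the diagonal pattern you would expect to see in the partial-sum signals.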
Understanding the test
The test_tpu.py testbench implements a complete XOR problem:
- Loads matrices into the unified buffer
- Executes forward pass instructions through the systolic array and VPU
- Computes loss using MSE
- Performs backward pass to calculate gradients
- Updates weights using gradient descent
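The steps above can be mirrored in a floating-point reference model, which is handy when you want expected values to compare against hardware output. The sketch below is not taken from test_tpu.py: the 2-2-1 network shape, the leaky ReLU activation and its slope, the initial weights, and the learning rate are all assumptions chosen for illustration.

```python
# XOR truth table: inputs and targets.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
T = [0.0, 1.0, 1.0, 0.0]

ALPHA = 0.5  # leaky ReLU slope for negative inputs (assumed value)


def leaky_relu(z):
    return z if z > 0 else ALPHA * z


def leaky_relu_grad(z):
    return 1.0 if z > 0 else ALPHA


def forward(params, x):
    """Forward pass: matrix multiply, bias add, activation, for both layers."""
    w1, b1, w2, b2 = params
    z1 = [sum(x[i] * w1[i][j] for i in range(2)) + b1[j] for j in range(2)]
    h = [leaky_relu(z) for z in z1]
    z2 = sum(h[j] * w2[j] for j in range(2)) + b2
    return z1, h, z2, leaky_relu(z2)


def mse_loss(params):
    """Mean squared error over the four XOR samples."""
    return sum((forward(params, x)[3] - t) ** 2 for x, t in zip(X, T)) / len(X)


def train_step(params, lr):
    """One full-batch gradient-descent step; returns the updated parameters."""
    w1, b1, w2, b2 = params
    gw1 = [[0.0, 0.0], [0.0, 0.0]]
    gb1 = [0.0, 0.0]
    gw2 = [0.0, 0.0]
    gb2 = 0.0
    for x, t in zip(X, T):
        z1, h, z2, y = forward(params, x)
        # Backward pass: chain rule from the MSE loss back to each parameter.
        dz2 = 2.0 * (y - t) / len(X) * leaky_relu_grad(z2)
        gb2 += dz2
        for j in range(2):
            gw2[j] += dz2 * h[j]
            dz1 = dz2 * w2[j] * leaky_relu_grad(z1[j])
            gb1[j] += dz1
            for i in range(2):
                gw1[i][j] += dz1 * x[i]
    new_w1 = [[w1[i][j] - lr * gw1[i][j] for j in range(2)] for i in range(2)]
    new_b1 = [b1[j] - lr * gb1[j] for j in range(2)]
    new_w2 = [w2[j] - lr * gw2[j] for j in range(2)]
    return new_w1, new_b1, new_w2, b2 - lr * gb2


# Hypothetical starting weights: (w1, b1, w2, b2).
params = ([[0.5, -0.3], [0.4, 0.6]], [0.1, -0.2], [0.7, -0.5], 0.05)
before = mse_loss(params)
after = mse_loss(train_step(params, lr=0.1))
print(f"loss before: {before:.4f}, after one step: {after:.4f}")
```

The hardware computes in fixed point, so its results will differ slightly from this floating-point model; compare with a tolerance rather than for exact equality.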
All values use 16-bit fixed-point arithmetic with 8 fractional bits, implemented in src/fixedpoint.sv.
Try other modules
Once you’ve run the complete TPU simulation, try testing individual components.
Configure GTKWave for fixed-point viewing
To properly view fixed-point values in GTKWave:
Set fixed-point shift
Navigate to: Data Format → Fixed Point Shift → Specify
Enter 8 (for 8 fractional bits) and click OK
What’s happening in the simulation?
When you run make test_tpu, here’s what occurs:
Simulation
The cocotb testbench (test/test_tpu.py) runs:
- Initializes the TPU with reset
- Loads test data into the unified buffer
- Issues 88-bit instructions to the control unit
- Monitors outputs and verifies correctness
Next steps
Explore the architecture
Learn how the systolic array, VPU, and unified buffer work together
Understand the ISA
Dive into the 88-bit instruction set that controls the TPU
Add a new module
Learn the workflow for developing and testing new hardware modules
Debug with waveforms
Master GTKWave for analyzing signal timing and data flow
Troubleshooting
Command not found: cocotb-config
Make sure your virtual environment is activated, then verify that cocotb is installed.
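One way to check both conditions from Python is sketched below. `in_virtualenv` and `cocotb_status` are hypothetical helper names; the checks rely only on the standard library and on `cocotb-config` being the script that cocotb installs.

```python
import importlib.util
import shutil
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def cocotb_status():
    """Report whether the cocotb package and the cocotb-config script are visible."""
    return {
        "virtualenv active": in_virtualenv(),
        "cocotb importable": importlib.util.find_spec("cocotb") is not None,
        "cocotb-config on PATH": shutil.which("cocotb-config") is not None,
    }


if __name__ == "__main__":
    for check, ok in cocotb_status().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

If cocotb is importable but cocotb-config is not on PATH, the package was most likely installed into a different environment than the one your shell is using.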
Test failures in results.xml
Check the cocotb output for assertion errors. The test will show which expected values didn’t match the actual hardware output.
Common issues:
- Timing problems (data arriving on wrong clock cycle)
- Fixed-point conversion errors
- Incorrect instruction sequencing
GTKWave shows incorrect values
Make sure you’ve configured fixed-point viewing correctly:
- Fixed Point Shift: 8
- Data Format: Signed Decimal
- Fixed Point Shift: ON
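To see why these settings matter, it can help to decode a raw word by hand. The sketch below assumes the 16-bit values are two's-complement with 8 fractional bits and that out-of-range values saturate; the saturation behavior in particular is an assumption, since src/fixedpoint.sv may wrap instead. `to_fixed` and `from_fixed` are hypothetical helpers.

```python
FRAC_BITS = 8
WIDTH = 16


def to_fixed(value):
    """Encode a float as a 16-bit two's-complement word with 8 fractional bits."""
    raw = int(round(value * (1 << FRAC_BITS)))
    lo, hi = -(1 << (WIDTH - 1)), (1 << (WIDTH - 1)) - 1
    raw = max(lo, min(hi, raw))  # saturate (assumption; the hardware may wrap)
    return raw & ((1 << WIDTH) - 1)


def from_fixed(word):
    """Decode a 16-bit word as a signed value shifted right by 8 fractional bits."""
    word &= (1 << WIDTH) - 1
    if word & (1 << (WIDTH - 1)):
        word -= 1 << WIDTH  # reinterpret as a negative two's-complement value
    return word / (1 << FRAC_BITS)


if __name__ == "__main__":
    for value in (1.5, -0.25, 0.1):
        word = to_fixed(value)
        print(f"{value:>6} -> 0x{word:04X} -> {from_fixed(word)}")
```

For example, -0.25 is stored as 0xFFC0. Viewed as an unsigned integer with no shift, that word reads as 65472, which is the kind of value you see when the display format is not configured.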
Waveform file not generated
Check that the waveforms/ directory exists. The Makefile creates this automatically, but if you’ve run make clean, you may need to recreate it.
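If you prefer to handle this from a script, the following sketch recreates the directory and checks whether a waveform was written. The helper names are hypothetical, and the default file name tpu.vcd is taken from the test run described earlier.

```python
from pathlib import Path


def ensure_waveform_dir(project_root="."):
    """Create waveforms/ under project_root if it is missing and return its path."""
    waveform_dir = Path(project_root) / "waveforms"
    waveform_dir.mkdir(parents=True, exist_ok=True)
    return waveform_dir


def waveform_ready(project_root=".", name="tpu.vcd"):
    """True when the expected VCD file exists and is non-empty."""
    vcd = Path(project_root) / "waveforms" / name
    return vcd.is_file() and vcd.stat().st_size > 0


if __name__ == "__main__":
    print("waveform directory:", ensure_waveform_dir())
    print("tpu.vcd present:", waveform_ready())
```

Run it from the repository root before re-running the test, then again afterwards to confirm that tpu.vcd was produced.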