
Prerequisites

Before you begin, make sure you have the following installed on your system:
  • Python 3.7 or later with virtual environment support, used to run the cocotb testbenches
  • Icarus Verilog, used to simulate the SystemVerilog hardware designs
  • GTKWave, used to visualize signal waveforms from your simulations

Installation

1. Create virtual environment

Create and activate a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
2. Install cocotb

Install the cocotb testing framework:
pip install cocotb
3. Install Icarus Verilog

Use Homebrew to install the Verilog simulator:
brew install iverilog
4. Build GTKWave

On macOS, you must build GTKWave from source. Other installation methods currently do not work correctly.
Download and build GTKWave from the official repository.

Clone the repository

Get the Tiny TPU source code:
git clone https://github.com/yourusername/tiny-tpu.git
cd tiny-tpu

Run your first simulation

Let’s run a complete TPU simulation that performs forward and backward passes through a neural network.
1. Set environment variables

The Makefile automatically sets these required environment variables:
export COCOTB_REDUCED_LOG_FMT=1
export LIBPYTHON_LOC=$(cocotb-config --libpython)
export PYTHONPATH=test:$PYTHONPATH
You don’t need to set these manually. They are listed here only so you know what the Makefile configures before it launches the simulation.
2. Run the TPU test

Execute the complete TPU simulation:
make test_tpu
This command:
  • Compiles all SystemVerilog source files from src/
  • Runs the testbench from test/test_tpu.py
  • Generates a waveform file tpu.vcd in the waveforms/ directory
  • Verifies that all tests pass
The test performs a complete XOR neural network training cycle with forward pass (matrix multiplication, bias addition, activation) and backward pass (gradient computation).
3. View the waveforms

Open the generated waveform in GTKWave:
make show_tpu
Or manually with:
gtkwave waveforms/tpu.vcd
You’ll see all the signals flowing through the TPU, including:
  • Input data moving horizontally through the systolic array
  • Partial sums flowing vertically
  • VPU activation functions processing data
  • Results being written back to the unified buffer
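The horizontal and vertical data movement listed above can be illustrated with a small cycle-by-cycle model in Python. This is a conceptual sketch, not a model of the RTL in src/systolic.sv: it assumes a weight-stationary array in which each processing element holds one weight, inputs are skewed by one cycle per row, and every element registers its outputs once per cycle. The function name systolic_matmul is hypothetical.

```python
import numpy as np

def systolic_matmul(X, W):
    """Compute X @ W on a simulated K x N weight-stationary systolic array."""
    M, K = X.shape
    K2, N = W.shape
    assert K == K2, "inner dimensions must match"
    a_reg = np.zeros((K, N))  # inputs registered in each PE, moving left to right
    p_reg = np.zeros((K, N))  # partial sums registered in each PE, moving top to bottom
    out = np.zeros((M, N))
    for t in range(M + K + N - 2):  # cycles needed to drain the array
        new_a = np.zeros_like(a_reg)
        new_p = np.zeros_like(p_reg)
        for i in range(K):
            for j in range(N):
                if j == 0:
                    m = t - i  # row i is fed i cycles late (input skew)
                    a_in = X[m, i] if 0 <= m < M else 0.0
                else:
                    a_in = a_reg[i, j - 1]  # value from the left neighbour
                p_in = p_reg[i - 1, j] if i > 0 else 0.0  # sum from the PE above
                new_a[i, j] = a_in
                new_p[i, j] = p_in + a_in * W[i, j]
        a_reg, p_reg = new_a, new_p
        for j in range(N):
            m = t - (K - 1) - j  # sample whose result leaves column j this cycle
            if 0 <= m < M:
                out[m, j] = p_reg[K - 1, j]
    return out

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
W = np.array([[0.2985, 0.0913], [-0.5792, 0.4234]])
result = systolic_matmul(X, W)
print(result)
```

Comparing result against X @ W is a quick way to confirm that the skewed schedule produces an ordinary matrix product.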

Understanding the test

The test_tpu.py testbench sets up the XOR problem with the following data, initial weights, and hyperparameters:
import numpy as np

# Input data (XOR truth table)
X = np.array([[0., 0.],
              [0., 1.],
              [1., 0.],
              [1., 1.]])

# Expected output
Y = np.array([0, 1, 1, 0])

# First layer weights (2x2 matrix)
W1 = np.array([[0.2985, -0.5792], 
               [0.0913, 0.4234]])

# Second layer weights
W2 = np.array([0.5266, 0.2958])

# Hyperparameters
learning_rate = 0.75
leak_factor = 0.5  # Slope applied to negative inputs by Leaky ReLU
The test:
  1. Loads matrices into the unified buffer
  2. Executes forward pass instructions through the systolic array and VPU
  3. Computes loss using MSE
  4. Performs backward pass to calculate gradients
  5. Updates weights using gradient descent
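The five steps above can be mirrored by a floating-point reference model, which is handy when comparing hardware output against expected values. The sketch below is not the repository's testbench. It assumes zero biases (the excerpt does not list them), a hidden layer computed as X @ W1.T, and Leaky ReLU on both layers. The helper names forward and train_step are hypothetical.

```python
import numpy as np

# Values taken from the testbench excerpt.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([0., 1., 1., 0.])
W1 = np.array([[0.2985, -0.5792], [0.0913, 0.4234]])
W2 = np.array([0.5266, 0.2958])
learning_rate = 0.75
leak_factor = 0.5

# Assumption: bias values are not shown in the excerpt, so zeros are used.
b1 = np.zeros(2)
b2 = 0.0

def leaky_relu(z):
    return np.where(z > 0, z, leak_factor * z)

def leaky_relu_grad(z):
    return np.where(z > 0, 1.0, leak_factor)

def forward(W1, b1, W2, b2):
    z1 = X @ W1.T + b1        # (4, 2) hidden pre-activations
    h1 = leaky_relu(z1)
    z2 = h1 @ W2 + b2         # (4,) output pre-activations
    y_hat = leaky_relu(z2)
    loss = np.mean((y_hat - Y) ** 2)  # mean squared error
    return z1, h1, z2, y_hat, loss

def train_step(W1, b1, W2, b2):
    """One forward pass, backward pass, and gradient-descent update."""
    z1, h1, z2, y_hat, _ = forward(W1, b1, W2, b2)
    dz2 = (2.0 / len(Y)) * (y_hat - Y) * leaky_relu_grad(z2)  # (4,)
    dW2 = h1.T @ dz2                                          # (2,)
    db2 = dz2.sum()
    dz1 = np.outer(dz2, W2) * leaky_relu_grad(z1)             # (4, 2)
    dW1 = dz1.T @ X                                           # (2, 2)
    db1 = dz1.sum(axis=0)
    return (W1 - learning_rate * dW1, b1 - learning_rate * db1,
            W2 - learning_rate * dW2, b2 - learning_rate * db2)

loss_before = forward(W1, b1, W2, b2)[-1]
W1_new, b1_new, W2_new, b2_new = train_step(W1, b1, W2, b2)
loss_after = forward(W1_new, b1_new, W2_new, b2_new)[-1]
print(f"MSE before update: {loss_before:.4f}")
print(f"MSE after update:  {loss_after:.4f}")
```

Because the hardware works in fixed point, its results will differ slightly from this floating-point model, so any comparison needs a tolerance.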
All values use 16-bit fixed-point arithmetic with 8 fractional bits, implemented in src/fixedpoint.sv.
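To get a feel for what 8 fractional bits mean for the weights above, the following sketch quantizes W1 to 16-bit words and measures the round-trip error. The rounding mode and saturation behavior of src/fixedpoint.sv are not described here, so round-to-nearest with clipping is an assumption, and the helpers to_fixed and from_fixed are hypothetical names.

```python
import numpy as np

FRAC_BITS = 8    # fractional bits, as stated above
WORD_BITS = 16   # total word width

def to_fixed(x):
    """Quantize floats to signed 16-bit integers with 8 fractional bits.

    Assumption: round to nearest and saturate at the int16 limits.
    """
    scaled = np.round(np.asarray(x, dtype=np.float64) * (1 << FRAC_BITS))
    lo = -(1 << (WORD_BITS - 1))
    hi = (1 << (WORD_BITS - 1)) - 1
    return np.clip(scaled, lo, hi).astype(np.int16)

def from_fixed(q):
    """Convert signed 16-bit fixed-point words back to floats."""
    return np.asarray(q, dtype=np.int16).astype(np.float64) / (1 << FRAC_BITS)

W1 = np.array([[0.2985, -0.5792], [0.0913, 0.4234]])
W1_q = to_fixed(W1)
W1_back = from_fixed(W1_q)
max_err = np.max(np.abs(W1 - W1_back))
print(W1_q)
print(W1_back)
print(f"largest quantization error: {max_err:.6f}")
```

With round-to-nearest, the error per value is at most half of one step, that is 1/512, which is one source of small mismatches between a floating-point model and the hardware.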

Try other modules

Once you’ve run the complete TPU simulation, try testing individual components:
make test_pe
make show_pe

Configure GTKWave for fixed-point viewing

To properly view fixed-point values in GTKWave:
1. Select signals

Right-click the signals you want to view in fixed-point format
2. Set fixed-point shift

Navigate to Data Format > Fixed Point Shift > Specify, enter 8 (for 8 fractional bits), and click OK
3. Set signed decimal format

Set Data Format > Signed Decimal
4. Enable fixed-point display

Enable Data Format > Fixed Point Shift > On
5. Save configuration

Save your GTKWave configuration with File > Write Save File, saving it as waveforms/tpu.gtkw
The .gtkw file stores your signal selection and formatting. You only need to configure this once — future runs of make show_tpu will load your saved configuration automatically.

What’s happening in the simulation?

When you run make test_tpu, here’s what occurs:
1. Compilation

Icarus Verilog compiles all SystemVerilog modules:
src/pe.sv              # Processing elements
src/systolic.sv        # Systolic array
src/vpu.sv             # Vector processing unit
src/unified_buffer.sv  # Memory buffer
src/control_unit.sv    # Instruction decoder
src/tpu.sv             # Top-level integration
2. Simulation

The cocotb testbench (test/test_tpu.py) then:
  • Initializes the TPU with reset
  • Loads test data into the unified buffer
  • Issues 88-bit instructions to the control unit
  • Monitors outputs and verifies correctness
3. Waveform generation

The dump module (test/dump_tpu.sv) captures all signals:
module dump();
initial begin
  $dumpfile("waveforms/tpu.vcd");
  $dumpvars(0, tpu); 
end
endmodule
4. Verification

The Makefile checks for test failures:
! grep failure results.xml
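grep exits with status 0 when it finds the string failure in results.xml, and the leading ! inverts that status. If any assertions fail, this step therefore returns a non-zero status, and the build stops and reports the error.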
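If you want more than a substring match, the same check can be sketched in Python by parsing the results file. The snippet assumes results.xml uses a JUnit-style layout in which each failed test case contains a failure element. The sample document below is made up for illustration, and count_failures is a hypothetical helper, not part of the Makefile.

```python
import xml.etree.ElementTree as ET

def count_failures(xml_text):
    """Count <failure> elements anywhere in a JUnit-style results document."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter("failure"))

# Illustrative sample only; a real results.xml comes from the simulation run.
sample = """<testsuites name="results">
  <testsuite name="all" package="all">
    <testcase name="test_ok" classname="test_tpu" time="0.1"/>
    <testcase name="test_bad" classname="test_tpu" time="0.1">
      <failure/>
    </testcase>
  </testsuite>
</testsuites>"""

failures = count_failures(sample)
print(f"{failures} failing test case(s)")
```

To check a real run, read the file first, for example with open("results.xml").read(), and exit with a non-zero status when the count is above zero.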

Next steps

  • Explore the architecture: learn how the systolic array, VPU, and unified buffer work together
  • Understand the ISA: dive into the 88-bit instruction set that controls the TPU
  • Add a new module: learn the workflow for developing and testing new hardware modules
  • Debug with waveforms: master GTKWave for analyzing signal timing and data flow

Troubleshooting

If cocotb cannot be found when you run a test, make sure your virtual environment is activated:
source venv/bin/activate
Then verify cocotb is installed:
pip list | grep cocotb
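The same two checks can be run from inside Python, which helps when you are unsure which interpreter is actually in use. This is a small sketch using only the standard library. The function names are hypothetical.

```python
import importlib.util
import sys

def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix

def cocotb_available():
    """Return True when the cocotb package can be found by this interpreter."""
    return importlib.util.find_spec("cocotb") is not None

print("virtual environment active:", in_virtualenv())
print("cocotb importable:", cocotb_available())
```

If the first line prints False, activate the environment and run the check again before reinstalling anything.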
If a test fails, check the cocotb output for assertion errors. The test shows which expected values didn’t match the actual hardware output.
Common issues:
  • Timing problems (data arriving on wrong clock cycle)
  • Fixed-point conversion errors
  • Incorrect instruction sequencing
If values in GTKWave look wrong, make sure you’ve configured fixed-point viewing correctly:
  • Fixed Point Shift: 8
  • Data Format: Signed Decimal
  • Fixed Point Shift: ON
Raw values in GTKWave are in 16-bit fixed-point format — you need to apply the 8-bit shift to see the actual decimal values.
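When you only have a raw hexadecimal value from the waveform, the shift can also be applied by hand. The sketch below assumes the 16-bit words are signed two's complement. The helper name decode_q8_8 and the sample values are illustrative.

```python
def decode_q8_8(raw):
    """Interpret a raw 16-bit word as signed, then apply the 8-bit shift."""
    raw &= 0xFFFF                                        # keep the low 16 bits
    signed = raw - 0x10000 if raw & 0x8000 else raw      # two's complement
    return signed / 256                                  # 2**8 fractional steps

for raw in (0x00C0, 0xFF80, 0x0100):
    print(f"0x{raw:04X} -> {decode_q8_8(raw)}")
```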
If no waveform file is generated, check that the waveforms/ directory exists:
mkdir -p waveforms
The Makefile creates this automatically, but if you’ve run make clean, you may need to recreate it.
