This guide gets you from zero to a trained model with benchmark results. You’ll use synthetic data for fast iteration and see how the framework exposes hardware trade-offs.

Before you begin

You need Python 3.8+ and pip installed on your system.

Train your first model

1. Clone the repository

Clone the project and navigate to the source directory:
git clone <repository-url>
cd source
2. Install dependencies

Install the required packages:
pip install -r requirements.txt
This installs NumPy, pandas, matplotlib, psutil, and other core dependencies.
3. Verify your environment

Run the environment verification script to confirm everything is set up correctly:
python scripts/verify_environment.py
You should see output showing your Python version, platform, and package status:
Python: 3.11.0
Platform: Linux-5.15.0-x86_64

Required packages:
  - numpy: OK (1.26.4)
  - matplotlib: OK (3.8.4)
  - psutil: OK (5.9.8)
  - requests: OK (2.32.3)
  - tqdm: OK (4.66.4)
The dataset status may show files as unavailable—that’s expected for synthetic mode.
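For a sense of what a check like this involves, here is a minimal sketch of an environment-verification routine. The function name and the exact output format are illustrative; the real `scripts/verify_environment.py` may work differently.

```python
# Hypothetical sketch of an environment check like the one
# scripts/verify_environment.py performs (names are illustrative).
import platform
from importlib import metadata

REQUIRED = ["numpy", "matplotlib", "psutil", "requests", "tqdm"]

def check_environment(packages=REQUIRED):
    """Return a dict mapping each package name to its installed version, or None."""
    print(f"Python: {platform.python_version()}")
    print(f"Platform: {platform.platform()}")
    status = {}
    for name in packages:
        try:
            status[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = None
    return status

if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"  - {name}: {'OK (' + version + ')' if version else 'MISSING'}")
```

Using `importlib.metadata` avoids actually importing each package, so the check stays fast even for heavy dependencies.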
4. Run your first training workflow

Execute the full workflow with synthetic data:
python scripts/run_workflow.py --mode full --experiment baseline --stats-repeats 5
This command:
  • Trains a neural network with the baseline configuration
  • Runs benchmarks to measure latency and memory
  • Performs statistical analysis over 5 repeated runs
The baseline experiment uses synthetic data by default, so no dataset download is required.
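The `--stats-repeats 5` flag implies repeated timing runs aggregated into summary statistics. A minimal sketch of that idea, with a stand-in workload rather than the framework's real benchmark:

```python
# Illustrative aggregation over repeated runs, the kind of analysis
# --stats-repeats suggests. The timed function here is a stand-in.
import statistics
import time

def measure_latency(fn, repeats=5):
    """Time fn() `repeats` times and return (mean_ms, stdev_ms)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)

mean_ms, stdev_ms = measure_latency(lambda: sum(range(10_000)), repeats=5)
print(f"latency: {mean_ms:.3f} ms +/- {stdev_ms:.3f} ms")
```

Reporting a standard deviation alongside the mean is what makes run-to-run noise visible, which is the point of repeating the measurement.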
5. Review your results

After training completes, you’ll find:
  • Checkpoints: Saved model weights in experiments/logs/
  • Metrics: Training loss, validation accuracy, and convergence data
  • Benchmarks: Latency measurements, memory footprint, and throughput statistics
  • Reports: Statistical summaries and artifacts in artifacts/
The workflow prints summary statistics to the console, including:
  • Parameter count vs. memory footprint
  • Inference latency across different batch sizes
  • Accuracy under various precision constraints
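The first of those statistics, parameter count vs. memory footprint, can be estimated by hand. A back-of-envelope sketch for a fully connected network; the layer sizes below are illustrative, not the baseline configuration's actual values:

```python
# Back-of-envelope parameter-count vs. memory estimate for a dense
# network. Layer widths are an assumption, not the real baseline config.
def param_count(layer_sizes):
    """Weights plus biases for a dense network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

sizes = [784, 128, 10]   # e.g. flattened 28x28 input -> hidden -> 10 classes
n = param_count(sizes)   # 784*128 + 128 + 128*10 + 10 = 101,770 parameters
for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {n * nbytes / 1024:.1f} KiB")
```

Halving the precision halves the weight storage, which is why the precision column matters as much as the architecture.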

What you just built

You trained a feed-forward neural network that exposes system-level trade-offs:
  • Parameter count vs. memory: See how model size affects RAM usage
  • Precision vs. accuracy: Compare float32, float16, and int8 representations
  • Batch size vs. latency: Understand throughput vs. per-sample inference time
The baseline configuration uses a simple architecture designed for CPU execution and reproducible experiments.
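To see why lower precision costs accuracy, here is a hedged sketch of simulated precision constraints: cast weights to a coarser representation and measure the round-trip error. This mimics the idea, not the framework's actual implementation.

```python
# Simulated precision constraint: quantize weights and measure the
# round-trip error. Scheme and sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)

# float16: a plain cast down and back
err_fp16 = np.max(np.abs(w - w.astype(np.float16).astype(np.float32)))

# int8: symmetric linear quantization with a per-tensor scale
scale = np.max(np.abs(w)) / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
err_int8 = np.max(np.abs(w - w_int8.astype(np.float32) * scale))

print(f"max float16 error: {err_fp16:.5f}")
print(f"max int8 error:    {err_int8:.5f}")
```

The int8 error is substantially larger than the float16 error: 256 levels spread over the tensor's full range is far coarser than float16's mantissa near typical weight magnitudes.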

Next steps

  • Train with real data: Download Fashion-MNIST and train on production-like data
  • Run specific stages: Execute training, benchmarking, or analysis independently
  • Customize experiments: Modify layer sizes, precision, and hardware constraints
  • Understand the architecture: Learn how the framework is structured

Train with Fashion-MNIST

To train on real data instead of synthetic samples:
1. Download the dataset

python scripts/download_fashion_mnist.py --out-dir "Neural Network from Scratch/task/Data"
This downloads the Fashion-MNIST CSV files and verifies their integrity.
2. Run the workflow with real data

python scripts/run_workflow.py --mode full --experiment real_fashion_mnist --stats-repeats 5
The real_fashion_mnist experiment configuration uses the downloaded dataset instead of synthetic data.

Stage-specific runs

You can run individual stages of the workflow independently, for example:
python scripts/run_workflow.py --mode train --experiment baseline
python scripts/run_workflow.py --mode benchmark --experiment baseline
Benchmarking requires a trained model checkpoint. Run training first if no checkpoint exists.
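A guard for that checkpoint requirement might look like the sketch below. The `experiments/logs/` path comes from the results section above; the `*.npz` filename pattern is an assumption.

```python
# Illustrative checkpoint guard for the benchmark stage. The directory
# comes from the docs; the *.npz pattern is an assumed file format.
from pathlib import Path

def latest_checkpoint(log_dir="experiments/logs", pattern="*.npz"):
    """Return the most recently modified checkpoint file, or None."""
    candidates = sorted(Path(log_dir).glob(pattern),
                        key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None

ckpt = latest_checkpoint()
if ckpt is None:
    print("No checkpoint found - run the train stage first.")
else:
    print(f"Benchmarking against {ckpt}")
```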

Understanding the workflow modes

The run_workflow.py script supports three modes:
Mode       What it does                                        Use case
train      Trains the model and saves checkpoints              Iterate on model architecture
benchmark  Runs inference benchmarks and statistical analysis  Profile performance changes
full       Executes train → benchmark → analysis               End-to-end experiment

Troubleshooting

Import errors: Ensure you're in the correct directory and that Python can find the task modules:
cd source
python -c "import sys; print(sys.path)"
The scripts automatically add the task directory to your Python path.

Out-of-memory errors: Reduce the batch size in your experiment configuration or use a smaller model architecture. The framework will show memory allocation spikes during forward/backward passes.

Corrupt or missing dataset files: If you're using real data, re-download the Fashion-MNIST files:
python scripts/download_fashion_mnist.py --out-dir "Neural Network from Scratch/task/Data"
For synthetic mode, no dataset files are required.
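The download script's integrity verification presumably compares file hashes. A minimal sketch of that idea; the helper names are hypothetical and any expected hashes would come from the real script, not from here:

```python
# Sketch of a file-integrity check like the one the download script
# implies. Helper names are hypothetical; expected hashes are not shown.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large CSVs never load fully into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """True if the file's SHA-256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex
```

Streaming in chunks keeps memory flat regardless of file size, which matters for multi-hundred-megabyte CSVs.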

Key concepts

Before diving deeper, understand these core concepts:
  • Synthetic mode: Uses generated random data for fast, deterministic testing
  • Experiment configs: Pre-defined settings in config.py that control architecture and training
  • Reproducibility: All experiments use explicit seeds and fixed dependency versions
  • Hardware constraints: Simulated precision and memory limits without specialized hardware
The framework prioritizes transparency over raw performance—you can inspect every tensor operation and understand exactly what’s happening.
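The "explicit seeds" part of reproducibility can be sketched in a few lines. The seed value is arbitrary; the point is that seeded generators produce identical draws on every run:

```python
# Minimal demonstration of explicit seeding for reproducibility,
# as the Reproducibility bullet describes. Seed value is arbitrary.
import numpy as np

def seeded_samples(seed=42, n=5):
    """Draw n samples from a generator seeded explicitly."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n)

a = seeded_samples()
b = seeded_samples()
print(a)  # identical to b, and identical across runs
```

Creating a fresh `default_rng(seed)` per call, rather than reseeding a global state, keeps experiments independent of execution order.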
