This guide covers environment setup, dependency installation, and verification for reproducible experiments.

System requirements

Before you begin, ensure your system meets these requirements:
  • Python: 3.8 or higher (3.10+ recommended)
  • Operating system: Linux, macOS, or Windows with Python support
  • RAM: 4GB minimum (8GB+ recommended for larger experiments)
  • CPU: Any modern x86_64 or ARM processor (CPU-focused, no GPU required)
This framework is designed for CPU execution and reproducible research. It does not require specialized hardware like GPUs or TPUs.

Installation

1. Clone the repository

Clone the project to your local machine:
git clone <repository-url>
cd source
2. Create a virtual environment (recommended)

Isolate your dependencies using a virtual environment:
python -m venv venv
source venv/bin/activate
You should see (venv) in your terminal prompt after activation.
3. Install required dependencies

Install the core packages from requirements.txt:
pip install -r requirements.txt
This installs:
  • numpy (1.26.4): Core array operations and tensor math
  • pandas (2.2.2): Dataset loading and CSV handling
  • matplotlib (3.8.4): Visualization for training curves and benchmarks
  • psutil (5.9.8): Memory profiling and system resource monitoring
  • requests (2.32.3): Dataset download utilities
  • tqdm (4.66.4): Progress bars for training and data loading
4. Install optional dependencies

For framework comparison and ONNX export, install the optional packages:
pip install torch==2.3.1 onnx==1.16.1 onnxruntime==1.18.1
These dependencies are large (PyTorch is ~800MB). Only install them if you need framework comparison features or ONNX export.
The framework works fully without these packages; the optional imports are guarded by runtime checks.
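The runtime-check pattern mentioned above can be sketched like this. This is an illustrative example of the optional-import guard idiom, not the framework's actual code; `export_to_onnx` is a hypothetical function used only to show the pattern:

```python
# Sketch of an optional-import guard (illustrative, not the framework's
# actual code): torch is imported lazily, and features that need it
# raise a clear error when it is absent.
try:
    import torch  # optional: only needed for framework comparison / ONNX export
    HAS_TORCH = True
except ImportError:
    torch = None
    HAS_TORCH = False

def export_to_onnx(model, path):
    """Hypothetical export helper that fails gracefully without torch."""
    if not HAS_TORCH:
        raise RuntimeError(
            "ONNX export requires the optional dependencies: "
            "pip install torch onnx onnxruntime"
        )
    # ... actual export would go here ...
```

This way, importing the package never fails just because an optional dependency is missing; only the features that need it do.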
5. Install development dependencies (optional)

For testing and development, install the dev requirements:
pip install -r requirements-dev.txt
This includes pytest and other tools for running the test suite.

Verify your installation

After installing dependencies, verify your environment is configured correctly.

Run the verification script

Execute the environment verification script:
python scripts/verify_environment.py

Expected output

You should see output similar to this:
Python: 3.11.0
Platform: Linux-5.15.0-x86_64-with-glibc2.35

Required packages:
  - numpy: OK (1.26.4)
  - matplotlib: OK (3.8.4)
  - psutil: OK (5.9.8)
  - requests: OK (2.32.3)
  - tqdm: OK (4.66.4)

Optional packages:
  - torch: OK (2.3.1)
  - pytest: OK (7.4.0)

Dataset status:
  - fashion-mnist: train_exists=False size=0, test_exists=False size=0

Interpreting the output

  • Required packages: All must show OK with version numbers
  • Optional packages: Can show MISSING if you didn’t install them
  • Dataset status: Shows False until you download Fashion-MNIST (see below)
If any required package shows MISSING, re-run pip install -r requirements.txt and check for installation errors.

Understanding the verification script

The verification script checks your environment systematically:
scripts/verify_environment.py
import importlib

REQUIRED = ["numpy", "matplotlib", "psutil", "requests", "tqdm"]
OPTIONAL = ["torch", "pytest"]

def check_module(name: str) -> str:
    try:
        mod = importlib.import_module(name)
        version = getattr(mod, "__version__", "unknown")
        return f"OK ({version})"
    except Exception as exc:
        return f"MISSING ({exc})"
It verifies:
  1. Python version and platform information
  2. All required packages are importable with correct versions
  3. Optional packages (if installed)
  4. Dataset files (if downloaded)
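The dataset check (item 4) can be approximated as follows. This is a sketch, not the script's actual implementation; the data directory and filenames are taken from the download step in this guide and may differ in your layout:

```python
from pathlib import Path

# Illustrative sketch of the dataset check: report existence and size
# for each expected CSV file, matching the "Dataset status" output format.
DATA_DIR = Path("Neural Network from Scratch/task/Data")

def dataset_status(name, train_file, test_file):
    train = DATA_DIR / train_file
    test = DATA_DIR / test_file
    train_size = train.stat().st_size if train.exists() else 0
    test_size = test.stat().st_size if test.exists() else 0
    return (f"{name}: train_exists={train.exists()} size={train_size}, "
            f"test_exists={test.exists()} size={test_size}")

print(dataset_status("fashion-mnist",
                     "fashion-mnist_train.csv",
                     "fashion-mnist_test.csv"))
```

Before the download step below, this reports `train_exists=False size=0` for both files.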

Dataset preparation

The framework supports two data modes:

Synthetic mode (default)

No setup required. The framework generates random data for fast iteration:
python scripts/run_workflow.py --mode train --experiment baseline
Synthetic mode uses deterministic random generation with fixed seeds for reproducibility.
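Seeded generation can be sketched as follows. This is a minimal illustration of the idea (a seeded NumPy generator yields identical data on every run), not the framework's actual data pipeline; the shapes and class count are chosen to mirror Fashion-MNIST:

```python
import numpy as np

# Sketch of deterministic synthetic data: the same seed always
# produces the same arrays, so experiments are reproducible.
def make_synthetic(n_samples, n_features, seed=42):
    rng = np.random.default_rng(seed)           # independent, seeded generator
    X = rng.standard_normal((n_samples, n_features)).astype(np.float32)
    y = rng.integers(0, 10, size=n_samples)     # 10 classes, like Fashion-MNIST
    return X, y

X1, y1 = make_synthetic(100, 784, seed=42)
X2, y2 = make_synthetic(100, 784, seed=42)
assert (X1 == X2).all() and (y1 == y2).all()    # identical across runs
```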

Real data mode (Fashion-MNIST)

For production-like experiments, download the Fashion-MNIST dataset:
1. Download Fashion-MNIST

Run the download script:
python scripts/download_fashion_mnist.py --out-dir "Neural Network from Scratch/task/Data"
This downloads two CSV files:
  • fashion-mnist_train.csv (~120MB): 60,000 training samples
  • fashion-mnist_test.csv (~20MB): 10,000 test samples
2. Verify the download

Re-run the verification script to confirm the dataset is ready:
python scripts/verify_environment.py
You should see:
Dataset status:
  - fashion-mnist: train_exists=True size=123456789, test_exists=True size=12345678
3. Use the dataset in experiments

Run experiments with the real_fashion_mnist configuration:
python scripts/run_workflow.py --mode full --experiment real_fashion_mnist
The download script validates file integrity using SHA256 checksums and includes retry logic for network failures.
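The checksum-and-retry approach can be sketched like this. It is illustrative only: the URL, destination, and expected digest used by `download_fashion_mnist.py` are not shown here, and the helpers below are hypothetical:

```python
import hashlib
import time
import urllib.request

# Illustrative sketch of SHA256 validation with retry logic; the real
# script's URLs, filenames, and digests are not reproduced here.
def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large CSVs don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def download_with_retry(url, dest, expected_sha256, retries=3, backoff=2.0):
    for attempt in range(1, retries + 1):
        try:
            urllib.request.urlretrieve(url, dest)
            if sha256_of(dest) == expected_sha256:
                return                      # verified, done
            raise IOError("checksum mismatch")
        except Exception:
            if attempt == retries:
                raise                       # out of retries, surface the error
            time.sleep(backoff * attempt)   # simple linear backoff
```

Chunked hashing matters here: the training CSV is ~120MB, so reading it whole just to hash it would be wasteful.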

Dependency versions and reproducibility

The framework uses pinned dependency versions for reproducible experiments:
requirements.txt
numpy==1.26.4
pandas==2.2.2
matplotlib==3.8.4
psutil==5.9.8
requests==2.32.3
tqdm==4.66.4
Using different versions may produce different numerical results due to changes in random number generation, floating-point operations, or algorithm implementations.

Why pinned versions matter

For reproducible research:
  1. Numerical stability: NumPy versions can differ in floating-point precision
  2. API compatibility: Avoid breaking changes in dependencies
  3. Deterministic results: Same code + same versions + same seed = same output
  4. Experiment comparison: Compare results across time and machines reliably
If you need to upgrade dependencies, update requirements.txt and re-run all baseline experiments to establish new reference results.
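A quick way to confirm your environment matches the pins is to compare installed versions against the list above. This sketch mirrors the pins from requirements.txt but is not part of the framework's tooling:

```python
from importlib import metadata

# Pins copied from requirements.txt in this guide.
PINS = {
    "numpy": "1.26.4",
    "pandas": "2.2.2",
    "matplotlib": "3.8.4",
    "psutil": "5.9.8",
    "requests": "2.32.3",
    "tqdm": "4.66.4",
}

def check_pins(pins):
    """Return mismatch messages; an empty list means every pin matches."""
    problems = []
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            problems.append(f"{pkg}: not installed (want {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{pkg}: installed {installed}, want {wanted}")
    return problems

for line in check_pins(PINS):
    print(line)
```

An empty output means your environment exactly matches the pinned versions.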

Project structure after installation

After installation, your directory structure looks like this:
source/
├── Neural Network from Scratch/
│   └── task/
│       ├── train.py           # Training entrypoint
│       ├── benchmark.py       # Performance benchmarking
│       ├── student.py         # Neural network implementation
│       ├── config.py          # Experiment configurations
│       ├── dataset_config.py  # Dataset specifications
│       └── Data/              # Dataset directory (after download)
├── scripts/
│   ├── verify_environment.py  # Environment checker
│   ├── run_workflow.py        # Workflow orchestration
│   └── download_fashion_mnist.py  # Dataset downloader
├── experiments/               # Generated logs and checkpoints
├── artifacts/                 # Generated reports
├── requirements.txt
└── requirements-dev.txt

Troubleshooting

Dependency installation fails

Upgrade pip and try again:
pip install --upgrade pip
pip install -r requirements.txt
If conflicts persist, create a fresh virtual environment.

Errors on Apple Silicon

Ensure you’re using an ARM-native Python installation:
python -c "import platform; print(platform.machine())"
Should output arm64. If it shows x86_64, reinstall Python for Apple Silicon.

Packages installed but imports still fail

Check whether you activated your virtual environment:
which python  # Should point to venv/bin/python
If not, activate it:
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate      # Windows

Fashion-MNIST download fails

Manually download the files from the mirrors and place them in the correct directory:
mkdir -p "Neural Network from Scratch/task/Data"
cd "Neural Network from Scratch/task/Data"
# Download fashion-mnist_train.csv and fashion-mnist_test.csv manually
Then verify the files are recognized:
python scripts/verify_environment.py

Python version is too old

You need Python 3.8 or higher. Check your version:
python --version
If it’s too old, install Python 3.8 or newer before continuing.

Next steps

Now that your environment is set up:
  • Run your first experiment: train a model and see benchmark results in under 5 minutes
  • Understand the architecture: learn how the framework is structured
  • Configure experiments: customize layer sizes, precision, and constraints
  • Explore the API: dive into the module-level documentation

Best practices

For the best experience:
1. Use virtual environments

Always isolate project dependencies to avoid conflicts:
python -m venv venv
source venv/bin/activate
2. Pin dependency versions

Never use pip install package without version pins in production experiments. Always use requirements.txt.
3. Verify after installation

Run scripts/verify_environment.py after any environment changes to catch issues early.
4. Use fixed seeds

All experiments accept a --seed parameter. Use it for reproducible results:
python scripts/run_workflow.py --seed 42 --experiment baseline
Keep a copy of your requirements.txt with each experiment log for full reproducibility.
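One way to keep that snapshot automatically is to freeze the environment into the experiment's log directory. This is an illustrative sketch; the directory layout is hypothetical and this helper is not part of the framework:

```python
import subprocess
import sys
from pathlib import Path

# Sketch (illustrative paths): snapshot the exact installed versions
# next to an experiment's logs so the run can be reproduced later.
def snapshot_environment(experiment_dir):
    exp = Path(experiment_dir)
    exp.mkdir(parents=True, exist_ok=True)
    frozen = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout
    lockfile = exp / "requirements.lock.txt"
    lockfile.write_text(frozen)
    return lockfile
```

Calling `snapshot_environment("experiments/baseline_seed42")` at the start of a run leaves a `requirements.lock.txt` beside the logs, so the exact versions behind any result are always recoverable.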
