
NeuralNetworkModel

The base neural network model class that supports flexible architecture configuration, multiple precision modes, and regularization techniques.

Constructor

NeuralNetworkModel(layer_sizes, activations, l2_lambda=0.0, dropout_rate=0.0, precision_config=None)
layer_sizes (list[int], required): List of layer sizes including input and output dimensions. Must have at least 2 elements. Example: [784, 128, 64, 10] for input size 784, two hidden layers (128 and 64 units), and output size 10.
activations (list[str], required): List of activation function names for each layer. Length must equal len(layer_sizes) - 1. Supported values: "sigmoid", "relu", "softmax".
l2_lambda (float, default 0.0): L2 regularization strength. Higher values increase the regularization penalty.
dropout_rate (float, default 0.0): Dropout probability during training. Must be between 0 and 1.
precision_config (object, default None): Configuration object for training and inference precision. Uses DEFAULT_CONFIG if not provided.
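For context on the l2_lambda parameter: an L2 penalty conventionally adds lambda/2 times the sum of squared weights to the loss. The exact scaling used by this class is an assumption, and l2_penalty below is a hypothetical helper, not part of the API:

```python
import numpy as np

def l2_penalty(weights, l2_lambda):
    """L2 term: (lambda / 2) * sum of squared weights (biases usually excluded)."""
    return 0.5 * l2_lambda * sum(np.sum(W ** 2) for W in weights)

# Two weight matrices of ones: squared sums are 4 and 2
W = [np.ones((2, 2)), np.ones((2, 1))]
print(l2_penalty(W, 0.01))  # 0.5 * 0.01 * (4 + 2) = 0.03
```

A larger l2_lambda makes this term dominate the loss, pulling weights toward zero.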

Methods

forward

Performs forward propagation through the network.
forward(x, training=False, precision=None)
x (ndarray, required): Input data of shape (batch_size, n_features).
training (bool, default False): Whether to apply dropout during the forward pass.
precision (str, default None): Precision mode for inference: "float32", "float16", or "int8". Uses infer_precision from the config if not specified.
Returns: ndarray - Output predictions of shape (batch_size, n_classes).
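The shape of a forward pass like this can be sketched in plain NumPy. This is a minimal illustration under assumptions (ReLU hidden layers, softmax output, inverted dropout), not the class's actual implementation; forward_sketch is a hypothetical name:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # subtract row max for stability
    return e / e.sum(axis=1, keepdims=True)

def forward_sketch(x, weights, biases, dropout_rate=0.0, training=False):
    """Minimal forward pass: ReLU hidden layers, softmax output layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        last = i == len(weights) - 1
        a = softmax(z) if last else relu(z)
        if training and not last and dropout_rate > 0:
            mask = rng.random(a.shape) >= dropout_rate
            a = a * mask / (1 - dropout_rate)  # inverted dropout: rescale at train time
    return a

# Tiny 4 -> 3 -> 2 network on a batch of 5 inputs
W = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]
b = [np.zeros(3), np.zeros(2)]
out = forward_sketch(rng.normal(size=(5, 4)), W, b)
print(out.shape)        # (5, 2)
print(out.sum(axis=1))  # each row sums to 1 (softmax probabilities)
```

Note that dropout is applied only when training=True, matching the training flag above.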

fit

Trains the model using mini-batch gradient descent.
fit(X, y, epochs=10, alpha=0.1, batch_size=32, save_path=None, shuffle=True, seed=42,
    X_val=None, y_val=None, patience=None, min_delta=0.0, restore_best=True)
X (ndarray, required): Training data of shape (n_samples, n_features).
y (ndarray, required): Training labels; class indices or one-hot encoded.
epochs (int, default 10): Number of training epochs.
alpha (float, default 0.1): Learning rate for gradient descent.
batch_size (int, default 32): Mini-batch size for training.
save_path (str, default None): Path to save model weights after training.
shuffle (bool, default True): Whether to shuffle the training data each epoch.
seed (int, default 42): Random seed for reproducibility.
X_val (ndarray, default None): Validation data for early stopping.
y_val (ndarray, default None): Validation labels for early stopping.
patience (int, default None): Number of epochs with no improvement before early stopping.
min_delta (float, default 0.0): Minimum decrease in validation loss that counts as an improvement.
restore_best (bool, default True): Whether to restore the best weights after early stopping.
Returns: dict - Training history with keys "loss", "accuracy", and "val_loss".
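The patience/min_delta interaction is the usual source of confusion with early stopping. The sketch below shows the standard logic (an illustration under common conventions; early_stop_sketch is a hypothetical helper, not this class's code):

```python
def early_stop_sketch(val_losses, patience, min_delta=0.0):
    """Return the epoch index at which training would stop, or None if it runs out."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # must improve by more than min_delta
            best = loss
            wait = 0
        else:
            wait += 1                # "no improvement" epoch
            if wait >= patience:
                return epoch
    return None

# 0.80 -> 0.79 is below min_delta, so epochs 2 and 3 count as no improvement
print(early_stop_sketch([1.0, 0.8, 0.79, 0.79, 0.79], patience=2, min_delta=0.05))  # 3
```

With restore_best=True, the weights from the epoch that set `best` would then be reloaded.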

predict

Generates class predictions for input data.
predict(x, precision=None)
x (ndarray, required): Input data of shape (batch_size, n_features).
precision (str, default None): Precision mode: "float32", "float16", or "int8".
Returns: ndarray - Predicted class indices of shape (batch_size,).
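If predict is a thin wrapper over forward (an assumption here), it reduces the softmax probability matrix to class indices with an argmax over the class axis:

```python
import numpy as np

# Softmax-style output for a batch of 2 samples, 3 classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])
pred = probs.argmax(axis=1)  # highest-probability class per row
print(pred)  # [1 0]
```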

save_weights

Saves model weights to a file.
save_weights(path="two_layer_weights.npz")
path (str, default "two_layer_weights.npz"): Path to save the weights file.

load_weights

Loads model weights from a file.
load_weights(path="two_layer_weights.npz")
path (str, default "two_layer_weights.npz"): Path to the weights file.
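The .npz default suggests weights are stored as a NumPy archive. Assuming keyed arrays (the key names below are hypothetical, not necessarily what the class uses), a save/load round trip looks like this:

```python
import os
import tempfile
import numpy as np

weights = [np.ones((4, 3)), np.ones((3, 2))]

path = os.path.join(tempfile.mkdtemp(), "weights.npz")
# Save each matrix under a distinct key, then reload by the same keys
np.savez(path, **{f"W{i}": w for i, w in enumerate(weights)})
data = np.load(path)
restored = [data[f"W{i}"] for i in range(len(weights))]
print(all(np.array_equal(w, r) for w, r in zip(weights, restored)))  # True
```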

backprop

Performs backpropagation and updates weights.
backprop(x, y, alpha)
x (ndarray, required): Input batch of shape (batch_size, n_features).
y (ndarray, required): Target labels (one-hot encoded).
alpha (float, required): Learning rate.

gradient_check

Verifies gradient computation using numerical approximation.
gradient_check(X, y, epsilon=1e-5, num_checks=10)
X (ndarray, required): Input data for gradient checking.
y (ndarray, required): Target labels.
epsilon (float, default 1e-5): Step size for the numerical gradient approximation.
num_checks (int, default 10): Number of randomly sampled parameters to check.
Returns: dict - Maximum gradient errors for each parameter.
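Gradient checking of this kind normally relies on the central-difference formula (f(theta + eps) - f(theta - eps)) / (2 * eps), one parameter at a time. A self-contained sketch of the idea on a toy loss (not this class's implementation):

```python
import numpy as np

def numerical_grad(f, theta, epsilon=1e-5):
    """Central-difference approximation of df/dtheta, one entry at a time."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d.flat[i] = epsilon
        grad.flat[i] = (f(theta + d) - f(theta - d)) / (2 * epsilon)
    return grad

theta = np.array([1.0, 2.0, 3.0])
f = lambda t: (t ** 2).sum()  # analytic gradient is 2 * t
num = numerical_grad(f, theta)
print(np.max(np.abs(num - 2 * theta)))  # tiny: numerical matches analytic
```

The returned "maximum gradient errors" are this kind of analytic-vs-numerical discrepancy per parameter; values far above ~1e-5 usually indicate a backprop bug.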

Properties

weights (list[ndarray]): List of weight matrices for all layers.
biases (list[ndarray]): List of bias vectors for all layers.

Usage Example

import numpy as np
from model import NeuralNetworkModel

# Create a 3-layer network: 784 -> 128 -> 64 -> 10
model = NeuralNetworkModel(
    layer_sizes=[784, 128, 64, 10],
    activations=['relu', 'relu', 'softmax'],
    l2_lambda=0.01,
    dropout_rate=0.2
)

# Train the model
history = model.fit(
    X_train, y_train,
    epochs=50,
    alpha=0.001,
    batch_size=64,
    X_val=X_val,
    y_val=y_val,
    patience=10
)

# Make predictions
predictions = model.predict(X_test)

# Save trained weights
model.save_weights('model_weights.npz')

NeuralNetwork

A convenience subclass of NeuralNetworkModel with identical functionality, defined in student.py.

Constructor

NeuralNetwork(layer_sizes, activations, l2_lambda=0.0, dropout_rate=0.0, precision_config=DEFAULT_CONFIG)
Parameters and methods are identical to NeuralNetworkModel.

Usage Example

from student import NeuralNetwork

# Create and train a network
nn = NeuralNetwork(
    layer_sizes=[784, 128, 10],
    activations=['relu', 'softmax']
)

history = nn.fit(X_train, y_train, epochs=20, alpha=0.01)
