
NeuralNetworkModel

The base neural network model class that supports flexible architecture configuration, multiple precision modes, and regularization techniques.

Constructor

NeuralNetworkModel(layer_sizes, activations, l2_lambda=0.0, dropout_rate=0.0, precision_config=None)
layer_sizes (list[int], required): List of layer sizes including input and output dimensions. Must have at least 2 elements. Example: [784, 128, 64, 10] for input size 784, two hidden layers (128 and 64 units), and output size 10.
activations (list[str], required): List of activation function names for each layer. Length must equal len(layer_sizes) - 1. Supported values: "sigmoid", "relu", "softmax".
l2_lambda (float, default 0.0): L2 regularization strength. Higher values increase the regularization penalty.
dropout_rate (float, default 0.0): Dropout probability during training. Must be between 0 and 1.
precision_config (object, default None): Configuration object for training and inference precision. Uses DEFAULT_CONFIG if not provided.
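For context on the l2_lambda parameter: an L2 penalty conventionally adds lambda/2 times the sum of squared weights to the loss. The exact scaling used by this class is an assumption, and l2_penalty below is a hypothetical helper, not part of the API:

```python
import numpy as np

def l2_penalty(weights, l2_lambda):
    """L2 term: (lambda / 2) * sum of squared weights (biases usually excluded)."""
    return 0.5 * l2_lambda * sum(np.sum(W ** 2) for W in weights)

# Two weight matrices of ones: squared sums are 4 and 2
W = [np.ones((2, 2)), np.ones((2, 1))]
print(l2_penalty(W, 0.01))  # 0.5 * 0.01 * (4 + 2) = 0.03
```

A larger l2_lambda makes this term dominate the loss, pulling weights toward zero.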

Methods

forward

Performs forward propagation through the network.
forward(x, training=False, precision=None)
x (ndarray, required): Input data of shape (batch_size, n_features).
training (bool, default False): Whether to apply dropout during the forward pass.
precision (str, default None): Precision mode for inference: "float32", "float16", or "int8". Uses infer_precision from the config if not specified.
Returns: ndarray - Output predictions of shape (batch_size, n_classes).
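The shape of a forward pass like this can be sketched in plain NumPy. This is a minimal illustration under assumptions (ReLU hidden layers, softmax output, inverted dropout), not the class's actual implementation; forward_sketch is a hypothetical name:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # subtract row max for stability
    return e / e.sum(axis=1, keepdims=True)

def forward_sketch(x, weights, biases, dropout_rate=0.0, training=False):
    """Minimal forward pass: ReLU hidden layers, softmax output layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        last = i == len(weights) - 1
        a = softmax(z) if last else relu(z)
        if training and not last and dropout_rate > 0:
            mask = rng.random(a.shape) >= dropout_rate
            a = a * mask / (1 - dropout_rate)  # inverted dropout: rescale at train time
    return a

# Tiny 4 -> 3 -> 2 network on a batch of 5 inputs
W = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]
b = [np.zeros(3), np.zeros(2)]
out = forward_sketch(rng.normal(size=(5, 4)), W, b)
print(out.shape)        # (5, 2)
print(out.sum(axis=1))  # each row sums to 1 (softmax probabilities)
```

Note that dropout is applied only when training=True, matching the training flag above.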

fit

Trains the model using mini-batch gradient descent.
fit(X, y, epochs=10, alpha=0.1, batch_size=32, save_path=None, shuffle=True, seed=42,
    X_val=None, y_val=None, patience=None, min_delta=0.0, restore_best=True)
X (ndarray, required): Training data of shape (n_samples, n_features).
y (ndarray, required): Training labels; class indices or one-hot encoded.
epochs (int, default 10): Number of training epochs.
alpha (float, default 0.1): Learning rate for gradient descent.
batch_size (int, default 32): Mini-batch size for training.
save_path (str, default None): Path to save model weights after training.
shuffle (bool, default True): Whether to shuffle the training data each epoch.
seed (int, default 42): Random seed for reproducibility.
X_val (ndarray, default None): Validation data for early stopping.
y_val (ndarray, default None): Validation labels for early stopping.
patience (int, default None): Number of epochs with no improvement before early stopping.
min_delta (float, default 0.0): Minimum decrease in validation loss that counts as an improvement.
restore_best (bool, default True): Whether to restore the best weights after early stopping.
Returns: dict - Training history with keys "loss", "accuracy", and "val_loss".
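The patience/min_delta interaction is the usual source of confusion with early stopping. The sketch below shows the standard logic (an illustration under common conventions; early_stop_sketch is a hypothetical helper, not this class's code):

```python
def early_stop_sketch(val_losses, patience, min_delta=0.0):
    """Return the epoch index at which training would stop, or None if it runs out."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # must improve by more than min_delta
            best = loss
            wait = 0
        else:
            wait += 1                # "no improvement" epoch
            if wait >= patience:
                return epoch
    return None

# 0.80 -> 0.79 is below min_delta, so epochs 2 and 3 count as no improvement
print(early_stop_sketch([1.0, 0.8, 0.79, 0.79, 0.79], patience=2, min_delta=0.05))  # 3
```

With restore_best=True, the weights from the epoch that set `best` would then be reloaded.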

predict

Generates class predictions for input data.
predict(x, precision=None)
x (ndarray, required): Input data of shape (batch_size, n_features).
precision (str, default None): Precision mode: "float32", "float16", or "int8".
Returns: ndarray - Predicted class indices of shape (batch_size,).
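If predict is a thin wrapper over forward (an assumption here), it reduces the softmax probability matrix to class indices with an argmax over the class axis:

```python
import numpy as np

# Softmax-style output for a batch of 2 samples, 3 classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])
pred = probs.argmax(axis=1)  # highest-probability class per row
print(pred)  # [1 0]
```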

save_weights

Saves model weights to a file.
save_weights(path="two_layer_weights.npz")
path (str, default "two_layer_weights.npz"): Path to save the weights file.

load_weights

Loads model weights from a file.
load_weights(path="two_layer_weights.npz")
path (str, default "two_layer_weights.npz"): Path to the weights file.
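The .npz default suggests weights are stored as a NumPy archive. Assuming keyed arrays (the key names below are hypothetical, not necessarily what the class uses), a save/load round trip looks like this:

```python
import os
import tempfile
import numpy as np

weights = [np.ones((4, 3)), np.ones((3, 2))]

path = os.path.join(tempfile.mkdtemp(), "weights.npz")
# Save each matrix under a distinct key, then reload by the same keys
np.savez(path, **{f"W{i}": w for i, w in enumerate(weights)})
data = np.load(path)
restored = [data[f"W{i}"] for i in range(len(weights))]
print(all(np.array_equal(w, r) for w, r in zip(weights, restored)))  # True
```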

backprop

Performs backpropagation and updates weights.
backprop(x, y, alpha)
x (ndarray, required): Input batch of shape (batch_size, n_features).
y (ndarray, required): Target labels (one-hot encoded).
alpha (float, required): Learning rate.

gradient_check

Verifies gradient computation using numerical approximation.
gradient_check(X, y, epsilon=1e-5, num_checks=10)
X (ndarray, required): Input data for gradient checking.
y (ndarray, required): Target labels.
epsilon (float, default 1e-5): Step size for the numerical gradient approximation.
num_checks (int, default 10): Number of randomly sampled parameters to check.
Returns: dict - Maximum gradient errors for each parameter.
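Gradient checking of this kind normally relies on the central-difference formula (f(theta + eps) - f(theta - eps)) / (2 * eps), one parameter at a time. A self-contained sketch of the idea on a toy loss (not this class's implementation):

```python
import numpy as np

def numerical_grad(f, theta, epsilon=1e-5):
    """Central-difference approximation of df/dtheta, one entry at a time."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d.flat[i] = epsilon
        grad.flat[i] = (f(theta + d) - f(theta - d)) / (2 * epsilon)
    return grad

theta = np.array([1.0, 2.0, 3.0])
f = lambda t: (t ** 2).sum()  # analytic gradient is 2 * t
num = numerical_grad(f, theta)
print(np.max(np.abs(num - 2 * theta)))  # tiny: numerical matches analytic
```

The returned "maximum gradient errors" are this kind of analytic-vs-numerical discrepancy per parameter; values far above ~1e-5 usually indicate a backprop bug.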

Properties

weights (list[ndarray]): List of weight matrices for all layers.
biases (list[ndarray]): List of bias vectors for all layers.

Usage Example

import numpy as np
from model import NeuralNetworkModel

# Create a 3-layer network: 784 -> 128 -> 64 -> 10
model = NeuralNetworkModel(
    layer_sizes=[784, 128, 64, 10],
    activations=['relu', 'relu', 'softmax'],
    l2_lambda=0.01,
    dropout_rate=0.2
)

# Train the model
history = model.fit(
    X_train, y_train,
    epochs=50,
    alpha=0.001,
    batch_size=64,
    X_val=X_val,
    y_val=y_val,
    patience=10
)

# Make predictions
predictions = model.predict(X_test)

# Save trained weights
model.save_weights('model_weights.npz')

NeuralNetwork

A convenience subclass of NeuralNetworkModel with identical functionality, defined in student.py.

Constructor

NeuralNetwork(layer_sizes, activations, l2_lambda=0.0, dropout_rate=0.0, precision_config=DEFAULT_CONFIG)
Parameters and methods are identical to NeuralNetworkModel.

Usage Example

from student import NeuralNetwork

# Create and train a network
nn = NeuralNetwork(
    layer_sizes=[784, 128, 10],
    activations=['relu', 'softmax']
)

history = nn.fit(X_train, y_train, epochs=20, alpha=0.01)
