MeanSquaredError
Mean Squared Error (MSE) loss function with optional L2 regularization. Commonly used for regression; it can also be applied to classification when targets are one-hot encoded.
Methods
forward
Computes the mean squared error loss with optional L2 regularization.
MeanSquaredError.forward(y_pred, y_true, weights=None, l2_lambda=0.0)
y_pred
ndarray
Predicted outputs of shape (batch_size, n_outputs).
y_true
ndarray
Ground truth labels of shape (batch_size, n_outputs). Should be one-hot encoded for classification.
weights
list[ndarray]
default:"None"
List of weight matrices from all layers. Required when l2_lambda > 0 for regularization.
l2_lambda
float
default:"0.0"
L2 regularization strength. When > 0, adds the penalty term 0.5 * λ * Σ(w²) to the loss.
Returns: float - Scalar loss value.
Formula:
Loss = 0.5 * mean(Σ((y_pred - y_true)²)) + 0.5 * λ * Σ(w²)
       └───────────────┬───────────────┘   └──────┬──────┘
                   data loss               regularization
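Putting the two terms together, the forward pass can be sketched in plain NumPy. This is an illustrative sketch of the documented formula, not the library's actual source:

```python
import numpy as np

def mse_forward(y_pred, y_true, weights=None, l2_lambda=0.0):
    """Sketch of the documented formula: data loss plus optional L2 penalty."""
    # Data loss: 0.5 * mean over the batch of per-sample squared-error sums
    data_loss = 0.5 * np.mean(np.sum((y_pred - y_true) ** 2, axis=1))
    # Regularization: 0.5 * λ * Σ(w²) over all layers, only when requested
    reg_loss = 0.0
    if l2_lambda > 0.0 and weights is not None:
        reg_loss = 0.5 * l2_lambda * sum(np.sum(w ** 2) for w in weights)
    return float(data_loss + reg_loss)

y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(f"{mse_forward(y_pred, y_true):.4f}")  # 0.0500
```

Passing `weights` with `l2_lambda=0.0` (or vice versa) leaves the loss unchanged, matching the "required when l2_lambda > 0" contract above.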
backward
Computes the gradient of the loss with respect to predictions.
MeanSquaredError.backward(y_pred, y_true)
y_pred
ndarray
Predicted outputs of shape (batch_size, n_outputs).
y_true
ndarray
Ground truth labels of shape (batch_size, n_outputs).
Returns: ndarray - Gradient of loss with respect to predictions, shape (batch_size, n_outputs).
Formula: ∂L/∂y_pred = y_pred - y_true
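The backward pass reduces to an elementwise difference, as this minimal sketch (illustrative, not the library source) shows:

```python
import numpy as np

def mse_backward(y_pred, y_true):
    """Sketch of ∂L/∂y_pred = y_pred - y_true (no 1/N factor applied here)."""
    return y_pred - y_true

y_pred = np.array([[0.7, 0.2, 0.1]])
y_true = np.array([[1.0, 0.0, 0.0]])
grad = mse_backward(y_pred, y_true)
print(grad.shape)  # (1, 3)
```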
Usage Examples
Basic Loss Calculation
import numpy as np
from loss import MeanSquaredError
# Predictions and ground truth (one-hot encoded)
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1]
])
y_true = np.array([
    [1.0, 0.0, 0.0],  # True class: 0
    [0.0, 1.0, 0.0]   # True class: 1
])
# Compute loss
loss = MeanSquaredError.forward(y_pred, y_true)
print(f"Loss: {loss:.4f}")  # Loss: 0.0500
# Compute gradient
grad = MeanSquaredError.backward(y_pred, y_true)
print(grad)
# [[-0.3  0.2  0.1]
#  [ 0.1 -0.2  0.1]]
Loss with L2 Regularization
import numpy as np
from loss import MeanSquaredError
# Model predictions
y_pred = np.array([[0.7, 0.2, 0.1]])
y_true = np.array([[1.0, 0.0, 0.0]])
# Model weights from two layers
weights = [
    np.random.randn(10, 5),  # Layer 1 weights
    np.random.randn(5, 3)    # Layer 2 weights
]
# Loss without regularization
loss_no_reg = MeanSquaredError.forward(y_pred, y_true, weights=None, l2_lambda=0.0)
print(f"Loss (no reg): {loss_no_reg:.4f}")
# Loss with L2 regularization
loss_with_reg = MeanSquaredError.forward(y_pred, y_true, weights=weights, l2_lambda=0.01)
print(f"Loss (with reg): {loss_with_reg:.4f}")
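The gap between the two printed losses is exactly the penalty term 0.5 * λ * Σ(w²), which can be computed directly from the weights. A standalone sketch (the seeded generator here is an assumption, used only for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded for reproducibility (assumption)
weights = [rng.standard_normal((10, 5)), rng.standard_normal((5, 3))]
l2_lambda = 0.01

# Penalty term: 0.5 * λ * Σ(w²), summed over every layer's weight matrix
penalty = 0.5 * l2_lambda * sum(np.sum(w ** 2) for w in weights)
print(f"L2 penalty: {penalty:.4f}")
# loss_with_reg - loss_no_reg should equal this penalty
```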
Integration with Neural Network
from model import NeuralNetworkModel
import numpy as np
# Create model with L2 regularization
model = NeuralNetworkModel(
    layer_sizes=[784, 128, 10],
    activations=['relu', 'softmax'],
    l2_lambda=0.01  # L2 regularization coefficient
)
# Training data
X_train = np.random.randn(100, 784).astype(np.float32)
y_train = np.random.randint(0, 10, size=100)
# The model internally uses MeanSquaredError for loss calculation
history = model.fit(X_train, y_train, epochs=10, alpha=0.01)
# Access training loss (includes regularization)
print(f"Final training loss: {history['loss'][-1]:.4f}")
Custom Training Loop
import numpy as np
from model import NeuralNetworkModel
from loss import MeanSquaredError
model = NeuralNetworkModel(
    layer_sizes=[784, 64, 10],
    activations=['relu', 'softmax'],
    l2_lambda=0.001
)
# One training step
X_batch = np.random.randn(32, 784).astype(np.float32)
y_batch_onehot = np.eye(10)[np.random.randint(0, 10, 32)] # One-hot encoded
# Forward pass
y_pred = model.forward(X_batch, training=True)
# Compute loss
loss = MeanSquaredError.forward(
    y_pred,
    y_batch_onehot,
    weights=model.weights,
    l2_lambda=0.001
)
print(f"Batch loss: {loss:.4f}")
# Compute gradients
grad = MeanSquaredError.backward(y_pred, y_batch_onehot)
print(f"Gradient shape: {grad.shape}") # (32, 10)
Mathematical Details
MSE Loss
For a batch of size N with C output dimensions:
Loss = (1/2N) * Σᵢ Σⱼ (yᵢⱼ_pred - yᵢⱼ_true)²
The factor of 1/2 simplifies the gradient to y_pred - y_true.
L2 Regularization
Penalizes large weights to prevent overfitting:
Reg_loss = (λ/2) * Σₗ Σᵢⱼ wₗᵢⱼ²
Where λ is l2_lambda and w are the weights from all layers.
Gradient
The gradient with respect to predictions is:
∂L/∂y_pred = (y_pred - y_true) / N
Note: The implementation returns y_pred - y_true without the /N factor, as the averaging is handled during weight updates in the model’s backward pass.
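This convention can be sanity-checked with central finite differences against the un-averaged per-sample loss 0.5 * Σ(y_pred - y_true)², whose exact gradient is y_pred - y_true. A standalone sketch, independent of the library:

```python
import numpy as np

def unaveraged_loss(y_pred, y_true):
    # 0.5 * Σ (y_pred - y_true)², with no 1/N batch average
    return 0.5 * np.sum((y_pred - y_true) ** 2)

y_pred = np.array([[0.7, 0.2, 0.1]])
y_true = np.array([[1.0, 0.0, 0.0]])
analytic = y_pred - y_true  # the documented gradient

# Central finite differences over every element of y_pred
eps = 1e-6
numeric = np.zeros_like(y_pred)
for idx in np.ndindex(*y_pred.shape):
    plus, minus = y_pred.copy(), y_pred.copy()
    plus[idx] += eps
    minus[idx] -= eps
    numeric[idx] = (unaveraged_loss(plus, y_true) - unaveraged_loss(minus, y_true)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)) < 1e-8)  # True
```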