Overview

Activation functions introduce non-linearity into neural networks. This module provides three common activation functions with their derivatives.

Activation Functions

sigmoid

Computes the sigmoid activation function element-wise.
sigmoid(x)
x
ndarray
required
Input array. Values are clipped to [-500, 500] to prevent overflow.
Returns: ndarray - Sigmoid activation, values in range (0, 1). Formula: σ(x) = 1 / (1 + e^(-x))
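Based on the description above (element-wise σ with inputs clipped to [-500, 500]), a minimal sketch of the behavior; the module's actual implementation may differ:

```python
import numpy as np

def sigmoid_sketch(x):
    # Clip so np.exp(-x) cannot overflow float64 for large-magnitude inputs.
    x = np.clip(x, -500, 500)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid_sketch(np.array([[1.0, -2.0, 3.0]])))  # ≈ [[0.731 0.119 0.953]]
```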

relu

Computes the Rectified Linear Unit (ReLU) activation function element-wise.
relu(x)
x
ndarray
required
Input array.
Returns: ndarray - ReLU activation, element-wise. Formula: ReLU(x) = max(0, x)
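The element-wise max described above can be sketched in one line; this is a plausible implementation, not necessarily the module's own:

```python
import numpy as np

def relu_sketch(x):
    # Element-wise max(0, x): negative entries become 0, positives pass through.
    return np.maximum(0, x)

print(relu_sketch(np.array([[-1.0, 0.0, 2.0]])))  # [[0. 0. 2.]]
```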

softmax

Computes softmax activation for multi-class classification.
softmax(x)
x
ndarray
required
Input array of shape (batch_size, n_classes). The implementation is numerically stable: the row-wise maximum is subtracted before exponentiation.
Returns: ndarray - Softmax probabilities; each row sums to 1. Formula: softmax(x_i) = e^(x_i) / Σ_j e^(x_j)
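The max-subtraction trick can be sketched as follows. Adding EPS to the denominator is an assumption based on the Constants section; the value used here is hypothetical:

```python
import numpy as np

EPS = 1e-12  # hypothetical value; the module defines its own EPS constant

def softmax_sketch(x):
    # Subtracting the row-wise max before exponentiating prevents overflow
    # in np.exp while leaving the result mathematically unchanged.
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / (e.sum(axis=1, keepdims=True) + EPS)

probs = softmax_sketch(np.array([[2.0, 1.0, 0.1]]))
print(probs)        # ≈ [[0.659 0.242 0.099]]
print(probs.sum())  # ≈ 1.0
```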

Activation Derivatives

sigmoid_derivative_from_activation

Computes sigmoid derivative from the activated output.
sigmoid_derivative_from_activation(activated)
activated
ndarray
required
Already-activated sigmoid output σ(x).
Returns: ndarray - Derivative σ(x) * (1 - σ(x)).
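Because σ'(x) = σ(x)(1 − σ(x)), the derivative can be computed from the already-activated output alone, which is why this function takes `activated` rather than `x`. A sketch, checked against a finite difference:

```python
import numpy as np

def sigmoid_derivative_sketch(activated):
    # sigma'(x) = sigma(x) * (1 - sigma(x)), computed from sigma(x) directly.
    return activated * (1.0 - activated)

# Finite-difference check at x = 1.0
sig = lambda x: 1.0 / (1.0 + np.exp(-x))
h = 1e-6
numeric = (sig(1.0 + h) - sig(1.0 - h)) / (2 * h)
analytic = sigmoid_derivative_sketch(sig(1.0))
print(abs(numeric - analytic) < 1e-8)  # True
```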

relu_derivative_from_pre_activation

Computes ReLU derivative from pre-activation values.
relu_derivative_from_pre_activation(z)
z
ndarray
required
Pre-activation values.
Returns: ndarray - Derivative: 1 where z > 0, else 0.
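A sketch of the subgradient described above; mapping z = 0 to 0 is a common convention and matches the "1 where z > 0, else 0" rule:

```python
import numpy as np

def relu_derivative_sketch(z):
    # 1 where z > 0, else 0 (z == 0 maps to 0 under this convention).
    return (z > 0).astype(z.dtype)

print(relu_derivative_sketch(np.array([1.0, -1.0, 0.0])))  # [1. 0. 0.]
```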

High-Level Interface

activation_forward

Applies activation function by name.
activation_forward(x, name)
x
ndarray
required
Input array.
name
str
required
Activation function name: "sigmoid", "relu", or "softmax".
Returns: ndarray - Activated output.

activation_backward

Computes activation derivative by name.
activation_backward(z, a, name)
z
ndarray
required
Pre-activation values.
a
ndarray
required
Activated output.
name
str
required
Activation function name: "sigmoid", "relu", or "softmax".
Returns: ndarray - Activation gradient.
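Both functions presumably dispatch through the ACTIVATIONS table documented in the next section. A self-contained sketch of that dispatch; raising ValueError on an unknown name is an assumption, not confirmed behavior:

```python
import numpy as np

# Minimal self-contained versions of the activations documented above.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

ACTIVATIONS = {
    "sigmoid": {"forward": sigmoid,
                "backward": lambda z, a: a * (1 - a)},
    "relu": {"forward": relu,
             "backward": lambda z, a: (z > 0).astype(z.dtype)},
    "softmax": {"forward": softmax,
                "backward": lambda z, a: np.ones_like(a)},
}

def activation_forward(x, name):
    if name not in ACTIVATIONS:
        raise ValueError(f"unknown activation: {name}")
    return ACTIVATIONS[name]["forward"](x)

def activation_backward(z, a, name):
    if name not in ACTIVATIONS:
        raise ValueError(f"unknown activation: {name}")
    return ACTIVATIONS[name]["backward"](z, a)
```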

ACTIVATIONS Dictionary

Internal dictionary mapping activation names to forward/backward functions:
ACTIVATIONS = {
    "sigmoid": {
        "forward": sigmoid,
        "backward": lambda z, a: sigmoid_derivative_from_activation(a),
    },
    "relu": {
        "forward": relu,
        "backward": lambda z, a: relu_derivative_from_pre_activation(z),
    },
    "softmax": {
        "forward": softmax,
        "backward": lambda z, a: np.ones_like(a),
    },
}
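The `np.ones_like(a)` backward for softmax looks surprising but reflects a common convention: when softmax feeds a cross-entropy loss, the combined gradient with respect to the logits simplifies to p − y, which the loss layer computes directly, so the activation itself contributes only a factor of 1. A quick numerical check of that identity (independent of this module):

```python
import numpy as np

z = np.array([[2.0, 1.0, 0.1]])
y = np.array([[1.0, 0.0, 0.0]])  # one-hot target

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_of_logits(logits):
    return -(y * np.log(softmax(logits))).sum()

analytic = softmax(z) - y  # claimed combined gradient dL/dz

# Central finite differences on each logit
numeric = np.zeros_like(z)
h = 1e-6
for j in range(z.shape[1]):
    zp, zm = z.copy(), z.copy()
    zp[0, j] += h
    zm[0, j] -= h
    numeric[0, j] = (cross_entropy_of_logits(zp) - cross_entropy_of_logits(zm)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-5))  # True
```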

Usage Examples

Direct Usage

import numpy as np
from activations import sigmoid, relu, softmax

# Sigmoid activation
x = np.array([[1.0, -2.0, 3.0]])
activated = sigmoid(x)
print(activated)  # [[0.731, 0.119, 0.953]]

# ReLU activation
x = np.array([[-1.0, 0.0, 2.0]])
activated = relu(x)
print(activated)  # [[0.0, 0.0, 2.0]]

# Softmax for classification
logits = np.array([[2.0, 1.0, 0.1]])
probs = softmax(logits)
print(probs)  # [[0.659, 0.242, 0.099]]
print(probs.sum())  # 1.0

Using High-Level Interface

from activations import activation_forward, activation_backward
import numpy as np

# Forward pass
z = np.array([[1.0, -1.0, 2.0]])
a = activation_forward(z, "relu")
print(a)  # [[1.0, 0.0, 2.0]]

# Backward pass
grad = activation_backward(z, a, "relu")
print(grad)  # [[1.0, 0.0, 1.0]]
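In a full backward pass, the factor returned by activation_backward is typically multiplied element-wise into the upstream gradient (chain rule). A sketch with hypothetical `upstream` values:

```python
import numpy as np

z = np.array([[1.0, -1.0, 2.0]])
upstream = np.array([[0.3, 0.7, -0.2]])  # hypothetical dL/da from the next layer

# activation_backward(z, a, "relu") reduces to this factor:
relu_factor = (z > 0).astype(z.dtype)

# Chain rule: dL/dz = dL/da * f'(z), element-wise.
# The unit with z <= 0 blocks its gradient entirely.
grad_z = upstream * relu_factor
print(grad_z)
```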

In Neural Network Context

from model import NeuralNetworkModel
import numpy as np

# Build network with different activations
model = NeuralNetworkModel(
    layer_sizes=[784, 128, 64, 10],
    activations=['relu', 'relu', 'softmax']
)

X = np.random.randn(32, 784).astype(np.float32)
output = model.forward(X)  # Shape: (32, 10)
print(output[0].sum())  # ~1.0 (softmax output sums to 1)

Constants

EPS
float
Small epsilon value used in softmax to prevent division by zero.
