Overview
Activation functions introduce non-linearity into neural networks. This module provides three common activation functions with their derivatives.

Activation Functions
sigmoid
Computes the sigmoid activation function element-wise.
Parameters: Input array. Values are clipped to [-500, 500] to prevent overflow.
Returns: ndarray - Sigmoid activation σ(x) = 1 / (1 + e^(-x)), values in range (0, 1).
Formula: σ(x) = 1 / (1 + e^(-x))
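A minimal NumPy sketch of this function, following the clipping behavior described above (the actual implementation may differ in details):

```python
import numpy as np

def sigmoid(x):
    # Clip to [-500, 500] so np.exp never overflows float64.
    x = np.clip(x, -500, 500)
    return 1.0 / (1.0 + np.exp(-x))
```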
relu
Rectified Linear Unit activation function.
Parameters: Input array.
Returns: ndarray - ReLU activation max(0, x).
Formula: ReLU(x) = max(0, x)
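A one-line NumPy sketch of the formula above:

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negative entries become 0, the rest pass through.
    return np.maximum(0, x)
```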
softmax
Computes softmax activation for multi-class classification. Numerically stable implementation using max subtraction.
Parameters: Input array of shape (batch_size, n_classes).
Returns: ndarray - Softmax probabilities, each row sums to 1.
Formula: softmax(x_i) = e^(x_i) / Σ(e^(x_j))
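A sketch of the numerically stable form: subtracting the row-wise maximum leaves the result unchanged (softmax is shift-invariant) while keeping the exponentials from overflowing. The small `EPSILON` in the denominator follows the constant described at the end of this document; its exact value here is an assumption.

```python
import numpy as np

EPSILON = 1e-15  # assumed value; guards against division by zero

def softmax(x):
    # Shift each row by its max so np.exp stays in a safe range.
    shifted = x - np.max(x, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / (np.sum(exp, axis=1, keepdims=True) + EPSILON)
```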
Activation Derivatives
sigmoid_derivative_from_activation
Computes the sigmoid derivative from the activated output.
Parameters: Already-activated sigmoid output σ(x).
Returns: ndarray - Derivative σ(x) * (1 - σ(x)).
relu_derivative_from_pre_activation
Computes the ReLU derivative from pre-activation values.
Parameters: Pre-activation values.
Returns: ndarray - Derivative: 1 where z > 0, else 0.
High-Level Interface
activation_forward
Applies an activation function by name.
Parameters: Input array; activation function name ("sigmoid", "relu", or "softmax").
Returns: ndarray - Activated output.
activation_backward
Computes the activation derivative by name.
Parameters: Pre-activation values; activated output; activation function name ("sigmoid", "relu", or "softmax").
Returns: ndarray - Activation gradient.
ACTIVATIONS Dictionary
Internal dictionary mapping activation names to forward/backward functions.

Usage Examples
Direct Usage
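A possible direct-usage example. The three functions are defined inline here to keep the snippet self-contained; in practice they would be imported from this module:

```python
import numpy as np

# Inline stand-ins for the module's functions (illustrative sketch).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    e = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

x = np.array([[-1.0, 0.0, 1.0]])
print(sigmoid(x))   # values in (0, 1)
print(relu(x))      # [[0. 0. 1.]]
print(softmax(x))   # each row sums to 1
```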
Using High-Level Interface
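A sketch of the name-based dispatch that activation_forward and activation_backward describe above. The bodies shown here are assumptions about the implementation; the softmax backward case is omitted because its gradient is usually fused with the cross-entropy loss:

```python
import numpy as np

def activation_forward(x, name):
    # Dispatch on the activation name, as in the high-level interface.
    if name == "sigmoid":
        return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))
    if name == "relu":
        return np.maximum(0, x)
    if name == "softmax":
        e = np.exp(x - np.max(x, axis=1, keepdims=True))
        return e / np.sum(e, axis=1, keepdims=True)
    raise ValueError(f"unknown activation: {name}")

def activation_backward(z, a, name):
    # Sigmoid uses the activated output a; ReLU uses the pre-activation z.
    if name == "sigmoid":
        return a * (1 - a)
    if name == "relu":
        return (z > 0).astype(z.dtype)
    raise ValueError(f"unknown activation: {name}")

z = np.array([[-2.0, 0.5]])
a = activation_forward(z, "sigmoid")
grad = activation_backward(z, a, "sigmoid")
```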
In Neural Network Context
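One way these functions fit into a single dense layer's forward and backward pass. The layer sizes, weight initialization, and upstream gradient below are illustrative assumptions:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def relu_derivative_from_pre_activation(z):
    return (z > 0).astype(z.dtype)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # weights: 3 inputs -> 4 units (illustrative)
b = np.zeros(4)
x = rng.normal(size=(2, 3))   # batch of 2 samples

# Forward pass: linear step, then activation.
z = x @ W + b
a = relu(z)

# Backward pass: upstream gradient times the activation derivative,
# evaluated at the saved pre-activation z.
upstream = np.ones_like(a)
dz = upstream * relu_derivative_from_pre_activation(z)
```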
Constants
Small epsilon value used in softmax to prevent division by zero.