This guide builds a neural network classifier using a single perceptron with sigmoid activation. You’ll learn to separate data into classes, implement log loss, and train models on linearly separable datasets.
## Prerequisites
Import required packages:
```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors
from sklearn.datasets import make_blobs
%matplotlib inline

np.random.seed(3)
```
## Simple Classification Problem
Classification assigns observations to categories. Binary classification has exactly two categories.
### Example: Sentiment Classification
Classify sentences as “happy” or “angry” based on word counts:
- Count occurrences of “aack” ($x_1$) and “beep” ($x_2$)
- Rule: if $x_2 > x_1$ (more “beep”), classify as angry; otherwise, happy
- This creates a linear decision boundary
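The counting rule above can be sketched directly in code. This is a toy illustration only; `classify_sentence` is a hypothetical helper, not part of the model built below:

```python
# Toy sketch of the word-count rule: more "beep" than "aack" means angry.
def classify_sentence(sentence):
    x1 = sentence.lower().count("aack")  # x1: "aack" occurrences
    x2 = sentence.lower().count("beep")  # x2: "beep" occurrences
    return "angry" if x2 > x1 else "happy"

print(classify_sentence("Beep!"))      # angry: one "beep", no "aack"
print(classify_sentence("Beep aack"))  # happy: a tie goes to happy
```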
### Visualizing Linearly Separable Classes
Consider 4 sentences:
- “Beep!” → (0, 1) → Angry
- “Aack?” → (1, 0) → Happy
- “Beep aack…” → (1, 1) → Happy
- ”!?” → (0, 0) → Happy
```python
fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.4
x_line = np.arange(xmin, xmax, 0.1)

# Data points (observations) from two classes
ax.scatter(0, 0, color="b")  # Happy
ax.scatter(0, 1, color="r")  # Angry
ax.scatter(1, 0, color="b")  # Happy
ax.scatter(1, 1, color="b")  # Happy

ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
ax.set_xlabel('$x_1$ (aack count)')
ax.set_ylabel('$x_2$ (beep count)')

# Decision boundary: x2 = x1 + 0.5
ax.plot(x_line, x_line + 0.5, color="black")
plt.show()
```
**Linearly Separable**: Classes that can be separated by a straight line (or a hyperplane in higher dimensions). This is the simplest classification scenario.
### Finding the Decision Boundary
The line $x_1 - x_2 + 0.5 = 0$ separates the classes:
- Above the line: $x_1 - x_2 + 0.5 < 0$ → red class (angry)
- Below the line: $x_1 - x_2 + 0.5 > 0$ → blue class (happy)

**Goal**: Find parameters $w_1$, $w_2$, and $b$ in the equation $w_1 x_1 + w_2 x_2 + b = 0$ that define this boundary.

For this simple example, we can read the parameters off directly: $w_1 = 1$, $w_2 = -1$, $b = 0.5$. But for complex problems, we need a neural network to learn them!
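As a sanity check, the hand-derived parameters can be plugged into the four example points. This is a quick standalone sketch, separate from the training code that follows:

```python
import numpy as np

# Hand-derived boundary: w1*x1 + w2*x2 + b = 0 with w1=1, w2=-1, b=0.5.
w1, w2, b = 1.0, -1.0, 0.5
points = np.array([[0, 1, 1, 0],   # x1 (aack counts) for the four sentences
                   [1, 0, 1, 0]])  # x2 (beep counts)
labels = np.array([1, 0, 0, 0])    # 1: angry (red), 0: happy (blue)

z = w1 * points[0] + w2 * points[1] + b
predictions = (z < 0).astype(int)  # the negative side of the line is the red class
print(predictions)  # should reproduce the labels
```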
## Single Perceptron with Activation Function
### Neural Network Structure
The perceptron performs two operations:
- Linear combination: $z^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b = Wx^{(i)} + b$
- Activation function: $a^{(i)} = \sigma(z^{(i)})$
The activation function converts continuous values into class probabilities, enabling classification.
### Sigmoid Activation Function
The sigmoid function maps any real number to the range (0, 1):
$$a = \sigma(z) = \frac{1}{1 + e^{-z}}$$
Properties:
- $\sigma(0) = 0.5$
- $\sigma(z) \to 1$ as $z \to \infty$
- $\sigma(z) \to 0$ as $z \to -\infty$
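These properties are easy to verify numerically. A standalone sketch (the implementation section below defines its own `sigmoid` the same way):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))                # exactly 0.5
print(sigmoid(50))               # very close to 1 for large positive z
print(sigmoid(-50))              # very close to 0 for large negative z
print(sigmoid(2) + sigmoid(-2))  # symmetry: sigma(z) + sigma(-z) = 1
```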
### Classification Rule
Use threshold 0.5:
$$\hat{y} = \begin{cases} 1 & \text{if } a > 0.5 \\ 0 & \text{otherwise} \end{cases}$$
### Mathematical Model
For a single training example:
$$\begin{align}
z^{(i)} &= Wx^{(i)} + b \\
a^{(i)} &= \sigma(z^{(i)})
\end{align}$$
For $m$ training examples in matrix $X$ (size $2 \times m$):
$$\begin{align}
Z &= WX + b \\
A &= \sigma(Z)
\end{align}$$
where $b$ is broadcast to size $1 \times m$.
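The shapes can be checked with small made-up arrays (illustrative values only, not the training data generated below):

```python
import numpy as np

# Vectorized forward pass shape check: W is (1, 2), X is (2, m), b is (1, 1).
m = 4
W = np.array([[1.0, -1.0]])           # one output unit, two inputs
X = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0]])  # m examples in columns
b = np.array([[0.5]])                 # broadcast across all m columns

Z = np.matmul(W, X) + b               # (1, m)
A = 1 / (1 + np.exp(-Z))              # (1, m), element-wise sigmoid
print(Z.shape, A.shape)               # (1, 4) (1, 4)
```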
### Log Loss Cost Function
For classification, use **log loss** (cross-entropy):
$$\mathcal{L}(W, b) = \frac{1}{m}\sum_{i=1}^{m} \left[-y^{(i)}\log(a^{(i)}) - (1-y^{(i)})\log(1-a^{(i)})\right]$$
where:
- $y^{(i)} \in \{0, 1\}$ are true labels
- $a^{(i)}$ are predicted probabilities
<Accordion title="Why log loss instead of sum of squares?">
Log loss penalizes confident wrong predictions heavily. For classification, this encourages the model to output probabilities close to 0 or 1, making decision boundaries sharper. Sum of squares doesn't have this property.
</Accordion>
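This claim is easy to see numerically: as a wrong prediction becomes more confident, the squared-error penalty stays bounded by 1, while log loss grows without bound. A sketch with toy values, assuming the true label is $y = 1$:

```python
import numpy as np

# Penalties for an increasingly confident wrong prediction (true label y = 1).
a = np.array([0.5, 0.1, 0.01, 0.001])  # predicted probability of class 1
squared = (1 - a) ** 2                  # sum-of-squares penalty, bounded by 1
logloss = -np.log(a)                    # log loss, unbounded
for ai, sq, ll in zip(a, squared, logloss):
    print(f"a = {ai:5.3f}   squared = {sq:.3f}   log loss = {ll:.3f}")
```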
### Backward Propagation
Compute gradients using the chain rule:
$$\begin{align}
\frac{\partial \mathcal{L}}{\partial w_1} &= \frac{1}{m}\sum_{i=1}^{m} (a^{(i)} - y^{(i)})x_1^{(i)} \\
\frac{\partial \mathcal{L}}{\partial w_2} &= \frac{1}{m}\sum_{i=1}^{m} (a^{(i)} - y^{(i)})x_2^{(i)} \\
\frac{\partial \mathcal{L}}{\partial b} &= \frac{1}{m}\sum_{i=1}^{m} (a^{(i)} - y^{(i)})
\end{align}$$
In matrix form:
$$\begin{align}
\frac{\partial \mathcal{L}}{\partial W} &= \frac{1}{m}(A - Y)X^T \\
\frac{\partial \mathcal{L}}{\partial b} &= \frac{1}{m}(A - Y)\mathbf{1}
\end{align}$$
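One way to gain confidence in the matrix formulas is a finite-difference check on a tiny random problem. This verification sketch uses made-up data and is separate from the implementation below:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(W, b, X, Y):
    """Log loss for the single-perceptron model."""
    A = sigmoid(np.matmul(W, X) + b)
    return np.mean(-Y * np.log(A) - (1 - Y) * np.log(1 - A))

rng = np.random.default_rng(0)
X = rng.random((2, 5))
Y = (rng.random((1, 5)) > 0.5).astype(float)
W = rng.standard_normal((1, 2))
b = np.zeros((1, 1))

# Analytic gradient from the matrix formula: dW = (1/m) (A - Y) X^T
A = sigmoid(np.matmul(W, X) + b)
dW = (1 / 5) * np.matmul(A - Y, X.T)

# Central finite-difference estimate for dW[0, 0]
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (cost(W_plus, b, X, Y) - cost(W_minus, b, X, Y)) / (2 * eps)
print(dW[0, 0], numeric)  # the two estimates should agree closely
```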
<Info>
**Key Insight**: These gradient expressions are identical to linear regression! The difference is in forward propagation (sigmoid activation) and the cost function (log loss).
</Info>
### Parameter Updates
$$\begin{align}
W &= W - \alpha \frac{\partial \mathcal{L}}{\partial W} \\
b &= b - \alpha \frac{\partial \mathcal{L}}{\partial b}
\end{align}$$
## Implementation
### Generate Dataset
Create 30 data points with binary labels:
```python
m = 30
X = np.random.randint(0, 2, (2, m))
Y = np.logical_and(X[0] == 0, X[1] == 1).astype(int).reshape((1, m))
print('Training dataset X containing (x1, x2) coordinates in columns:')
print(X)
print('Training dataset Y containing labels (0: blue, 1: red)')
print(Y)
print('The shape of X is:', X.shape)
print('The shape of Y is:', Y.shape)
print('I have m = %d training examples!' % (X.shape[1]))
```
### Define Sigmoid Function
```python
def sigmoid(z):
return 1 / (1 + np.exp(-z))
print("sigmoid(-2) =", sigmoid(-2))
print("sigmoid(0) =", sigmoid(0))
print("sigmoid(3.5) =", sigmoid(3.5))
```
Sigmoid works element-wise on arrays:
```python
print(sigmoid(np.array([-2, 0, 3.5])))
# Output: [0.11920292 0.5 0.97068777]
```
### Step 1: Define Layer Sizes
```python
def layer_sizes(X, Y):
"""
Arguments:
X -- input dataset of shape (input size, number of examples)
Y -- labels of shape (output size, number of examples)
Returns:
n_x -- the size of the input layer
n_y -- the size of the output layer
"""
n_x = X.shape[0]
n_y = Y.shape[0]
return (n_x, n_y)
(n_x, n_y) = layer_sizes(X, Y)
print("The size of the input layer is: n_x =", n_x)
print("The size of the output layer is: n_y =", n_y)
```
### Step 2: Initialize Parameters
```python
def initialize_parameters(n_x, n_y):
"""
Returns:
params -- python dictionary containing your parameters:
W -- weight matrix of shape (n_y, n_x)
b -- bias value set as a vector of shape (n_y, 1)
"""
W = np.random.randn(n_y, n_x) * 0.01
b = np.zeros((n_y, 1))
parameters = {"W": W, "b": b}
return parameters
parameters = initialize_parameters(n_x, n_y)
print("W =", parameters["W"])
print("b =", parameters["b"])
```
### Step 3: Forward Propagation
```python
def forward_propagation(X, parameters):
"""
Argument:
X -- input data of size (n_x, m)
parameters -- python dictionary containing your parameters
Returns:
A -- The sigmoid output
"""
W = parameters["W"]
b = parameters["b"]
Z = np.matmul(W, X) + b
A = sigmoid(Z) # Apply sigmoid activation
return A
A = forward_propagation(X, parameters)
print("Output vector A:", A)
```
<Note>
The only difference from regression: applying the sigmoid activation function to Z.
</Note>
### Step 4: Compute Cost (Log Loss)
```python
def compute_cost(A, Y):
"""
Computes the log loss cost function
Arguments:
A -- The output of the neural network of shape (n_y, number of examples)
Y -- "true" labels vector of shape (n_y, number of examples)
Returns:
cost -- log loss
"""
m = Y.shape[1]
logprobs = -np.multiply(np.log(A), Y) - np.multiply(np.log(1 - A), 1 - Y)
cost = (1/m) * np.sum(logprobs)
return cost
print("cost =", compute_cost(A, Y))
```
### Step 5: Backward Propagation
```python
def backward_propagation(A, X, Y):
"""
Implements the backward propagation, calculating gradients
Arguments:
A -- the output of the neural network of shape (n_y, number of examples)
X -- input data of shape (n_x, number of examples)
Y -- "true" labels vector of shape (n_y, number of examples)
Returns:
grads -- python dictionary containing gradients
"""
m = X.shape[1]
dZ = A - Y
dW = (1/m) * np.dot(dZ, X.T)
db = (1/m) * np.sum(dZ, axis=1, keepdims=True)
grads = {"dW": dW, "db": db}
return grads
grads = backward_propagation(A, X, Y)
print("dW =", grads["dW"])
print("db =", grads["db"])
```
### Step 6: Update Parameters
```python
def update_parameters(parameters, grads, learning_rate=1.2):
"""
Updates parameters using the gradient descent update rule
Arguments:
parameters -- python dictionary containing parameters
grads -- python dictionary containing gradients
learning_rate -- learning rate parameter for gradient descent
Returns:
parameters -- python dictionary containing updated parameters
"""
W = parameters["W"]
b = parameters["b"]
dW = grads["dW"]
db = grads["db"]
W = W - learning_rate * dW
b = b - learning_rate * db
parameters = {"W": W, "b": b}
return parameters
parameters_updated = update_parameters(parameters, grads)
print("W updated =", parameters_updated["W"])
print("b updated =", parameters_updated["b"])
```
### Step 7: Build Complete Model
```python
def nn_model(X, Y, num_iterations=10, learning_rate=1.2, print_cost=False):
"""
Arguments:
X -- dataset of shape (n_x, number of examples)
Y -- labels of shape (n_y, number of examples)
num_iterations -- number of iterations in the loop
learning_rate -- learning rate parameter for gradient descent
print_cost -- if True, print the cost every iteration
Returns:
parameters -- parameters learnt by the model
"""
n_x = layer_sizes(X, Y)[0]
n_y = layer_sizes(X, Y)[1]
parameters = initialize_parameters(n_x, n_y)
for i in range(0, num_iterations):
# Forward propagation
A = forward_propagation(X, parameters)
# Cost function
cost = compute_cost(A, Y)
# Backpropagation
grads = backward_propagation(A, X, Y)
# Update parameters
parameters = update_parameters(parameters, grads, learning_rate)
if print_cost:
print("Cost after iteration %i: %f" % (i, cost))
return parameters
```
### Train the Model
```python
parameters = nn_model(X, Y, num_iterations=50, learning_rate=1.2, print_cost=True)
print("W =", parameters["W"])
print("b =", parameters["b"])
```
<Info>
After about 40 iterations, the cost decreases very slowly, indicating convergence. This is a good point to stop training.
</Info>
### Visualize Decision Boundary
```python
def plot_decision_boundary(X, Y, parameters):
W = parameters["W"]
b = parameters["b"]
fig, ax = plt.subplots()
plt.scatter(X[0, :], X[1, :], c=Y,
cmap=colors.ListedColormap(['blue', 'red']))
x_line = np.arange(np.min(X[0,:]), np.max(X[0,:])*1.1, 0.1)
# Decision boundary: W[0,0]*x1 + W[0,1]*x2 + b = 0
# Solve for x2: x2 = -(W[0,0]/W[0,1])*x1 - b/W[0,1]
    ax.plot(x_line, -(W[0,0] / W[0,1]) * x_line - b[0,0] / W[0,1],
            color="black", label="Decision Boundary")
plt.legend()
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.show()
plot_decision_boundary(X, Y, parameters)
```
<Tip>
The decision boundary is the line where the perceptron output equals 0.5 (or equivalently, where $z = 0$).
</Tip>
### Make Predictions
```python
def predict(X, parameters):
"""
Using the learned parameters, predicts a class for each example in X
Arguments:
parameters -- python dictionary containing your parameters
X -- input data of size (n_x, m)
Returns:
predictions -- vector of predictions (False: blue / True: red)
"""
A = forward_propagation(X, parameters)
predictions = A > 0.5
return predictions
X_pred = np.array([[1, 1, 0, 0],
[0, 1, 0, 1]])
Y_pred = predict(X_pred, parameters)
print(f"Coordinates (in columns):\n{X_pred}")
print(f"Predictions:\n{Y_pred}")
```
## Performance on Larger Dataset
Test the model on a more realistic dataset:
```python
n_samples = 1000
samples, labels = make_blobs(
n_samples=n_samples,
centers=([2.5, 3], [6.7, 7.9]),
cluster_std=1.4,
random_state=0
)
X_larger = np.transpose(samples)
Y_larger = labels.reshape((1, n_samples))
plt.scatter(X_larger[0, :], X_larger[1, :], c=Y_larger,
cmap=colors.ListedColormap(['blue', 'red']))
plt.show()
```
Train for 100 iterations:
```python
parameters_larger = nn_model(X_larger, Y_larger,
num_iterations=100,
learning_rate=1.2,
print_cost=False)
print("W =", parameters_larger["W"])
print("b =", parameters_larger["b"])
```
Visualize results:
```python
plot_decision_boundary(X_larger, Y_larger, parameters_larger)
```
<Note>
The decision boundary successfully separates the two clusters! Try changing `num_iterations` and `learning_rate` to see how they affect the results.
</Note>
## Key Concepts
<CardGroup cols={2}>
<Card title="Sigmoid Activation" icon="wave-sine">
Converts linear output to probabilities between 0 and 1
</Card>
<Card title="Log Loss" icon="function">
Penalizes confident wrong predictions for better classification
</Card>
<Card title="Decision Boundary" icon="divide">
Line (or hyperplane) that separates classes in feature space
</Card>
<Card title="Binary Classification" icon="code-branch">
Assigns observations to one of two categories
</Card>
</CardGroup>
<Accordion title="Why use sigmoid instead of step function?">
Step functions have zero gradient everywhere (except at the discontinuity where it's undefined). This breaks gradient descent. Sigmoid is smooth and differentiable everywhere, enabling gradient-based learning.
</Accordion>
<Accordion title="What if classes aren't linearly separable?">
A single perceptron can only learn linear decision boundaries. For non-linear problems, you need:
- Multiple layers (deep neural networks)
- Non-linear activation functions in hidden layers
- More complex architectures
</Accordion>
<Accordion title="How do I choose the classification threshold?">
0.5 is standard, but you can adjust it based on application needs:
- Lower threshold (e.g., 0.3): More false positives, fewer false negatives
- Higher threshold (e.g., 0.7): Fewer false positives, more false negatives
Choose based on the relative costs of different error types.
</Accordion>
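The trade-off can be made concrete with a handful of made-up predicted probabilities and labels (illustrative numbers only):

```python
import numpy as np

probs = np.array([0.2, 0.4, 0.45, 0.55, 0.6, 0.9])  # model outputs (made up)
truth = np.array([0, 0, 1, 0, 1, 1])                 # true labels (made up)

for threshold in (0.3, 0.5, 0.7):
    pred = (probs > threshold).astype(int)
    fp = int(np.sum((pred == 1) & (truth == 0)))  # false positives
    fn = int(np.sum((pred == 0) & (truth == 1)))  # false negatives
    print(f"threshold = {threshold}: false positives = {fp}, false negatives = {fn}")
```

Lowering the threshold trades false negatives for false positives, and raising it does the reverse.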
## Comparison: Regression vs Classification
| Aspect | Regression | Classification |
|--------|-----------|----------------|
| **Output** | Continuous value | Class label (0 or 1) |
| **Activation** | Identity ($a = z$) | Sigmoid ($a = \sigma(z)$) |
| **Cost Function** | Sum of squares | Log loss |
| **Interpretation** | Predicted value | Probability of class 1 |
| **Decision Rule** | None | Threshold at 0.5 |
<Info>
**Implementation Similarity**: The backward propagation formulas are identical! The key differences are forward propagation (activation function) and the cost function.
</Info>
## Summary
You've built a binary classifier using:
✓ Single perceptron architecture
✓ Sigmoid activation function
✓ Log loss cost function
✓ Gradient descent optimization
✓ Decision boundary visualization
✓ Predictions on new data
This foundation extends to more complex classification problems with multiple layers and classes!
## Next Steps
Explore advanced topics:
- Build multi-class classifiers (softmax activation)
- Add hidden layers for non-linear decision boundaries
- Implement regularization to prevent overfitting
- Try different optimization algorithms (Adam, RMSprop)