Overview
The MLP class implements a simple multi-layer perceptron (feedforward neural network) used to classify TikTok videos based on their extracted embeddings. The model consists of two hidden layers with ReLU activations and dropout regularization.
Defined in train.py:31-45 and predict.py:27-41.
Class Definition
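The source is not reproduced on this page; the following is a minimal sketch consistent with the description here. The parameter names input_dim, num_classes, and hidden_dim come from this page, but the exact code in train.py:31-45 may differ in details.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Two hidden layers with ReLU and dropout, as described above.

    Sketch only: the real definition lives in train.py:31-45 and
    predict.py:27-41 and may differ in details.
    """

    def __init__(self, input_dim: int, num_classes: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),         # input_dim -> hidden_dim
            nn.ReLU(),
            nn.Dropout(0.3),                          # 30% dropout
            nn.Linear(hidden_dim, hidden_dim // 2),   # hidden_dim -> hidden_dim // 2
            nn.ReLU(),
            nn.Dropout(0.2),                          # 20% dropout
            nn.Linear(hidden_dim // 2, num_classes),  # logits per class
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```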
Constructor Parameters
- input_dim: The dimensionality of input features. For this project, this is typically 1024 (512-d CLIP visual + 512-d CLIP text embeddings concatenated).
- num_classes: The number of output classes (TikTok folders/categories to predict). This corresponds to the number of folder categories in your labeled dataset.
- hidden_dim: The size of the first hidden layer. The second hidden layer is hidden_dim // 2 (integer division). Default: 256 (so the second layer becomes 128).

Architecture
The MLP uses a sequential architecture with the following layers:

- Linear (input_dim → hidden_dim): first linear transformation layer.
- ReLU: activation function introducing non-linearity.
- Dropout (p=0.3): regularization with 30% probability during training.
- Linear (hidden_dim → hidden_dim // 2): second linear transformation layer (e.g., 256 → 128 with the default hidden_dim).
- ReLU: second activation function.
- Dropout (p=0.2): regularization with 20% probability during training.
- Linear (hidden_dim // 2 → num_classes): final output layer producing logits for each class.

Forward Pass
The forward pass takes an input tensor of shape (batch_size, input_dim) and returns output logits of shape (batch_size, num_classes). No softmax is applied inside the model, since CrossEntropyLoss expects raw logits during training.
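As a quick shape check, a stand-in network with the default sizes (the class count of 7 is arbitrary here):

```python
import torch
import torch.nn as nn

# Stand-in with the documented default hidden_dim=256; 7 classes is arbitrary.
net = nn.Sequential(
    nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(128, 7),
)
x = torch.randn(16, 1024)  # input tensor: (batch_size, input_dim)
logits = net(x)            # output logits: (batch_size, num_classes)
```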
Usage Example
Training
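A hedged sketch of a single training step on stand-in data, using the hyperparameters from the Training Details section below. The model is re-created inline so the snippet is self-contained; the real class lives in train.py:31-45, and 10 classes is illustrative.

```python
import torch
import torch.nn as nn

# Inline stand-in mirroring the MLP architecture described above.
hidden_dim = 256
model = nn.Sequential(
    nn.Linear(1024, hidden_dim), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(hidden_dim, hidden_dim // 2), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(hidden_dim // 2, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()  # class weights omitted in this sketch

features = torch.randn(32, 1024)      # one batch of concatenated CLIP embeddings
labels = torch.randint(0, 10, (32,))  # stand-in labels

model.train()                         # enable dropout
optimizer.zero_grad()
loss = criterion(model(features), labels)
loss.backward()
optimizer.step()
```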
Inference
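A sketch of inference on a single embedding; again self-contained with a stand-in model, while predict.py:27-41 holds the real code.

```python
import torch
import torch.nn as nn

# Stand-in for a trained MLP; 10 classes is illustrative.
model = nn.Sequential(
    nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(128, 10),
)
model.eval()                              # disable dropout for inference
embedding = torch.randn(1, 1024)          # one 1024-d video embedding
with torch.no_grad():
    logits = model(embedding)
    probs = torch.softmax(logits, dim=1)  # per-class probabilities
    predicted = int(probs.argmax(dim=1))  # index of the most likely category
```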
Training Details
When training the MLP (see train.py:48-96):
- Optimizer: Adam with learning rate 1e-3 and weight decay 1e-4
- Loss function: CrossEntropyLoss with class weights to handle imbalanced datasets
- Batch size: 32
- Early stopping: Patience of 15 epochs based on validation accuracy
- Device: Automatically uses CUDA if available, otherwise CPU
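The early-stopping bookkeeping can be sketched as follows. The patience is shortened to 3 so the example actually triggers (train.py uses 15), and the validation accuracies are stand-in values.

```python
# Patience-based early stopping on validation accuracy (sketch).
patience = 3  # train.py uses 15; shortened here for illustration
val_accuracies = [0.52, 0.60, 0.61, 0.61, 0.60, 0.59, 0.58]  # stand-in values

best_acc = 0.0
epochs_without_improvement = 0
stopped_at = None
for epoch, acc in enumerate(val_accuracies):
    if acc > best_acc:
        best_acc = acc
        epochs_without_improvement = 0  # improvement: reset (checkpoint here)
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            stopped_at = epoch  # no improvement for `patience` epochs
            break
```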
Model Persistence
The trained model is saved using PyTorch's state dict, alongside a model_config.json describing the model configuration (see embeddings documentation).
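A hedged save/load sketch: the weights file name and the config keys below are assumptions, and only the model_config.json name comes from this page. A single Linear layer stands in for the trained MLP.

```python
import json
import os
import tempfile

import torch
import torch.nn as nn

tmp = tempfile.mkdtemp()
weights_path = os.path.join(tmp, "mlp.pt")            # file name is an assumption
config_path = os.path.join(tmp, "model_config.json")  # name from this page

# Stand-in for the trained MLP.
model = nn.Linear(1024, 10)

# Save: weights as a state dict, constructor arguments as JSON.
torch.save(model.state_dict(), weights_path)
with open(config_path, "w") as f:
    json.dump({"input_dim": 1024, "num_classes": 10, "hidden_dim": 256}, f)

# Load: rebuild the architecture from the config, then restore the weights.
with open(config_path) as f:
    cfg = json.load(f)
restored = nn.Linear(cfg["input_dim"], cfg["num_classes"])
restored.load_state_dict(torch.load(weights_path))
```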