Overview
The project includes two primary notebooks for model training:
- emotion_prediction.ipynb - Trains the basic emotion classifier
- labels_clasifier__textbased_emotion_prediction.ipynb - Trains the multi-label toxicity classifier
Prerequisites
Before training models, ensure you have:
- Python 3.x installed
- Virtual environment set up
- All dependencies installed from requirements.txt
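
The first two prerequisites can be checked from Python before opening the notebooks. This is a minimal sketch: the helper names `in_virtualenv` and `check_prerequisites` are hypothetical and not part of the project, and the check does not verify that the packages in requirements.txt are installed.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


def check_prerequisites() -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info.major != 3:
        problems.append("Python 3.x is required")
    if not in_virtualenv():
        problems.append("no virtual environment appears to be active")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Prerequisite issue:", problem)
```
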
Open and run emotion_prediction.ipynb to train the basic emotion classification model. This notebook creates the emotion_classifier.model file.

Open and run labels_clasifier__textbased_emotion_prediction.ipynb to train the neural network model that classifies text into 6 toxicity categories.

Model Artifacts
After training, the following files are generated in the models/ directory:
| File | Description |
|---|---|
| emotion_classifier.model | Pickle file containing the trained emotion classifier |
| model_26_87.12.pth | PyTorch weights for the multi-label toxicity classifier (87.12% accuracy) |
| vectorizer2.pickle | Text vectorizer for feature extraction |
| word_dict.json | Word-to-index mapping for embeddings |
| train_tensor.pt | Training data tensors |
| train_labels.pt | Training labels |
| test_tensor.pt | Test data tensors |
| test_labels.pt | Test labels |
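
A quick way to confirm that both notebooks ran to completion is to check that every file in the table exists. The sketch below uses only the standard library; the function name `missing_artifacts` is hypothetical, and the default `models` path assumes the script is run from the project root.

```python
from pathlib import Path

# File names taken from the artifact table above.
EXPECTED_ARTIFACTS = [
    "emotion_classifier.model",
    "model_26_87.12.pth",
    "vectorizer2.pickle",
    "word_dict.json",
    "train_tensor.pt",
    "train_labels.pt",
    "test_tensor.pt",
    "test_labels.pt",
]


def missing_artifacts(models_dir="models"):
    """Return the expected artifact names that are not present in models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All expected artifacts are present.")
```

Note that the weights file name embeds a checkpoint number and accuracy, so a retrained model may be saved under a different name than the one listed here.
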
Model Architecture
The multi-label classifier uses a neural network with the following structure:
- 10-dimensional word embeddings
- 4 fully connected layers (2048 → 1024 → 512 → 6)
- ReLU activation
- Sigmoid output for multi-label classification
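
The structure above can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the notebook's actual code: the class name `ToxicityClassifier`, the `vocab_size` and `max_len` parameters, and the choice to flatten the embedded sequence before the first layer are all assumptions, and the layer sizes are read as the output widths of the four fully connected layers.

```python
import torch
from torch import nn


class ToxicityClassifier(nn.Module):
    """Hypothetical sketch: embeddings -> 4 fully connected layers -> sigmoid."""

    def __init__(self, vocab_size, max_len, embed_dim=10, num_labels=6):
        super().__init__()
        # 10-dimensional word embeddings (vocab_size is an assumed parameter).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.layers = nn.Sequential(
            # Assumption: the embedded sequence is flattened to one vector.
            nn.Flatten(),
            nn.Linear(max_len * embed_dim, 2048),
            nn.ReLU(),
            nn.Linear(2048, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, num_labels),
            # Sigmoid gives an independent probability per label.
            nn.Sigmoid(),
        )

    def forward(self, token_ids):
        # token_ids: integer tensor of shape (batch, max_len)
        return self.layers(self.embedding(token_ids))


# Example with made-up sizes: 4 sequences of 50 token indices each.
model = ToxicityClassifier(vocab_size=1000, max_len=50)
batch = torch.randint(0, 1000, (4, 50))
probs = model(batch)  # one probability per label for each input row
```

Because each output passes through its own sigmoid rather than a shared softmax, the six scores do not sum to 1, so a text can be assigned several categories at once.
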