Overview
The emotion prediction system is built as a Flask microservice that combines multiple machine learning models to analyze text for toxicity classification and emotion detection. The architecture consists of three main components:
- Flask Application Layer - HTTP API server
- PyTorch Neural Network - Deep learning model for toxicity classification
- Scikit-learn Classifier - Traditional ML model for emotion prediction
Application Stack
The microservice runs on the following stack:
Core Dependencies
- Flask 2.2.2 - Web framework for HTTP endpoints
- PyTorch 1.13.0 - Deep learning framework
- scikit-learn 1.0.1 - Machine learning library
- NLTK 3.7 - Natural language processing toolkit
The application is configured with `JSON_AS_ASCII = False` to properly handle Unicode characters in text responses.
Model Components
1. PyTorch Deep Neural Network
The `base_line` model is a multi-layer perceptron that classifies text into 6 toxicity categories:
- Toxic
- Severe Toxic
- Obscene
- Threat
- Insult
- Identity Hate
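A minimal sketch of what such a multi-label perceptron could look like in PyTorch. The class name `BaselineMLP`, the hidden size, the vocabulary size, the padding index, and the mean-pooling of word vectors are illustrative assumptions. Only the six output categories and the 10-dimensional embedding come from this document.

```python
import torch
import torch.nn as nn

# The six toxicity categories listed above.
LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity Hate"]


class BaselineMLP(nn.Module):
    """Hypothetical stand-in for the document's base_line model."""

    def __init__(self, vocab_size: int, embed_dim: int = 10, hidden_dim: int = 32):
        super().__init__()
        # Index 0 is reserved for padding (an assumption of this sketch).
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, len(LABELS)),
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, sequence_length) integer word indices.
        # Average the word vectors over all positions (padding included).
        pooled = self.embedding(token_ids).mean(dim=1)
        # One independent sigmoid per category (multi-label, not softmax).
        return torch.sigmoid(self.classifier(pooled))


model = BaselineMLP(vocab_size=1000)
model.eval()
with torch.no_grad():
    scores = model(torch.tensor([[5, 42, 7, 0]]))  # shape: (1, 6)
```

Using a sigmoid per output rather than a softmax lets a single text be flagged for several categories at once, which matches the description of six separate classifications.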
2. Word Embeddings
A 10-dimensional embedding layer converts words to numerical vectors.
3. Emotion Classifier
A pre-trained scikit-learn model is loaded from a pickle file.
4. Text Vectorizer
The vectorizer converts preprocessed text into feature vectors for the emotion classifier.
Processing Pipeline
The system processes incoming text through the following stages:
Request Flow
- Input: GET request to `/textbased_emotion?text=<input_text>`
- Preprocessing: Text cleaning and lemmatization
- Parallel Processing:
  - PyTorch model analyzes toxicity (6 binary classifications)
  - Scikit-learn model predicts emotion
- Output: HTML template with all predictions
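The request flow above can be sketched as a single Flask route. This is only an illustration: the helpers `preprocess`, `predict_toxicity`, and `predict_emotion` are hypothetical stubs standing in for the real cleaning step and models, the inline template is a placeholder for the application's actual HTML template, and the two predictions run one after the other here rather than in parallel.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)
app.config["JSON_AS_ASCII"] = False  # setting named in this document

LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity Hate"]

# Placeholder template; the real application renders its own HTML file.
TEMPLATE = (
    "<p>Emotion: {{ emotion }}</p>"
    "<ul>{% for label, score in toxicity.items() %}"
    "<li>{{ label }}: {{ score }}</li>{% endfor %}</ul>"
)


def preprocess(text: str) -> str:
    """Hypothetical stub for text cleaning and lemmatization."""
    return text.strip().lower()


def predict_toxicity(text: str) -> dict:
    """Hypothetical stub for the PyTorch model: one score per category."""
    return {label: 0.0 for label in LABELS}


def predict_emotion(text: str) -> str:
    """Hypothetical stub for the scikit-learn emotion classifier."""
    return "neutral"


@app.route("/textbased_emotion", methods=["GET"])
def textbased_emotion():
    # Input: the text arrives as a query-string parameter.
    cleaned = preprocess(request.args.get("text", ""))
    # Output: an HTML page containing all predictions.
    return render_template_string(
        TEMPLATE,
        emotion=predict_emotion(cleaned),
        toxicity=predict_toxicity(cleaned),
    )
```

With the development server running, a request such as `GET /textbased_emotion?text=hello` would return the rendered page.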
The threshold for toxicity classification is set at 0.29. Scores above this value indicate the presence of that toxicity type.
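As a small illustration of the thresholding rule, the sketch below turns six raw scores into yes/no flags. The function name `flag_toxicity` and the assumption that scores arrive in the same order as the category list are illustrative. The 0.29 cut-off and the strict "above" comparison come from the text.

```python
TOXICITY_THRESHOLD = 0.29  # value stated in this document

LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity Hate"]


def flag_toxicity(scores, threshold=TOXICITY_THRESHOLD):
    """Map each category to True when its score is strictly above the threshold."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return {label: score > threshold for label, score in zip(LABELS, scores)}


flags = flag_toxicity([0.90, 0.10, 0.30, 0.29, 0.00, 0.50])
# A score of exactly 0.29 is not flagged, because only values above the threshold count.
```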
Model Files
The application requires the following model artifacts:
| File | Purpose | Format |
|---|---|---|
| `model_26_87.12.pth` | Neural network weights | PyTorch state dict |
| `emotion_classifier.model` | Emotion classifier | Pickle |
| `vectorizer2.pickle` | Text vectorizer | Pickle |
| `word_dict.json` | Word-to-index mapping | JSON |
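To show how the word-to-index mapping might feed the embedding layer, here is a hedged sketch. It assumes `word_dict.json` is a flat JSON object of the form `{"word": index}` and that unknown words map to index 0. Neither detail is confirmed by this document, and the helper names are hypothetical.

```python
import json


def load_word_dict(path):
    """Read a word-to-index mapping from a JSON file (assumed flat {word: int})."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def encode(tokens, word_dict, unknown_index=0):
    """Convert tokens to integer indices; unseen words fall back to unknown_index."""
    return [word_dict.get(token, unknown_index) for token in tokens]


# In-memory stand-in for the real file, which would be loaded with
# load_word_dict("word_dict.json").
sample_dict = json.loads('{"hello": 1, "world": 2}')
indices = encode(["hello", "there", "world"], sample_dict)
```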
The model filename `model_26_87.12.pth` suggests this is epoch 26 with 87.12% accuracy.
Additional Features
Beyond emotion prediction, the system extracts:
- Dates - Various date formats using regex patterns
- Countries - Using pycountry library
- People Names - Using NLTK POS tagging (NNP tags)
- Time Ranges - Hour ranges in format “HH:MM - HH:MM”
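As an example of the regex-based extraction, the sketch below pulls out time ranges written as "HH:MM - HH:MM". The pattern and the function name `extract_time_ranges` are assumptions made for illustration; the application's actual expressions are not shown in this document.

```python
import re

# Hours 0-23 (leading zero optional), minutes 00-59, with optional spaces
# around the dash. This pattern is an assumption of the sketch.
TIME_RANGE_RE = re.compile(
    r"\b([01]?\d|2[0-3]):([0-5]\d)\s*-\s*([01]?\d|2[0-3]):([0-5]\d)\b"
)


def extract_time_ranges(text):
    """Return (start, end) pairs, with hours zero-padded to two digits."""
    return [
        (f"{int(h1):02d}:{m1}", f"{int(h2):02d}:{m2}")
        for h1, m1, h2, m2 in TIME_RANGE_RE.findall(text)
    ]


ranges = extract_time_ranges("Open 09:00 - 17:30, lunch 8:15-9:45.")
```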