
Overview

The emotion prediction API scores text across six distinct toxicity categories to help you identify and moderate inappropriate content. A category is flagged when the model's confidence for that category exceeds a threshold of 0.29.

Toxicity Categories

The API classifies text across six categories:

Toxic

General toxic language or rude comments

Severe Toxic

Extremely toxic content with strong offensive language

Obscene

Profanity, vulgar, or sexually explicit content

Threat

Threatening language or intimidation

Insult

Personal insults or attacks

Identity Hate

Hate speech targeting identity groups

How It Works

The API analyzes text and returns a Yes/No result for each category. A category is flagged when the model's confidence for it exceeds the 0.29 threshold (see microservice.py:206-211).
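A sketch of that thresholding step (the `flag_scores` helper and the plain category keys are illustrative, not the service's actual code):

```python
THRESHOLD = 0.29  # a category is flagged when confidence exceeds this value

def flag_scores(scores):
    """Map per-category confidence scores to the API's Yes/No result fields."""
    return {f"{name}_result": "Yes" if score > THRESHOLD else "No"
            for name, score in scores.items()}

flags = flag_scores({"toxic": 0.81, "severe_toxic": 0.05, "obscene": 0.42,
                     "threat": 0.01, "insult": 0.35, "identity_hate": 0.02})
# flags["toxic_result"] == "Yes"; flags["severe_toxic_result"] == "No"
```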

Inappropriate Content Detection

Content is marked as inappropriate when ALL six toxicity categories are flagged simultaneously (microservice.py:219):
inappropriate = its_toxic and its_severe_toxic and its_obscene and its_threat and its_insult and its_identity_hate
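The same all-six AND check, reproduced on illustrative boolean flags (the dict below is a made-up example, not the service's internal state):

```python
# All six categories must be flagged for content to be marked inappropriate;
# here only three are, so the overall flag stays False.
flags = {"toxic": True, "severe_toxic": False, "obscene": True,
         "threat": False, "insult": True, "identity_hate": False}
inappropriate = all(flags.values())
# inappropriate == False
```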

API Usage

Basic Request

curl "http://127.0.0.1:3200/textbased_emotion?text=Your%20text%20here"
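When calling from code, URL-encode the query string rather than interpolating raw text. A standard-library sketch, assuming the service is running locally on port 3200:

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import urlopen

base = "http://127.0.0.1:3200/textbased_emotion"
# quote_via=quote encodes spaces as %20, matching the curl example above
url = f"{base}?{urlencode({'text': 'Your text here'}, quote_via=quote)}"
# url == "http://127.0.0.1:3200/textbased_emotion?text=Your%20text%20here"

# results = json.loads(urlopen(url).read())  # uncomment with the service running
```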

Example Response

The API returns classification results for each category:
{
  "toxic_result": "Yes",
  "severe_toxic_result": "No",
  "obscene_result": "Yes",
  "threat_result": "No",
  "insult_result": "Yes",
  "identity_hate_result": "No",
  "inappropriate": false
}
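Note that `inappropriate` is `false` here even though three categories came back `"Yes"`: that flag requires all six. Parsing the response above (the JSON literal is copied from the example):

```python
import json

response_body = """{
  "toxic_result": "Yes",
  "severe_toxic_result": "No",
  "obscene_result": "Yes",
  "threat_result": "No",
  "insult_result": "Yes",
  "identity_hate_result": "No",
  "inappropriate": false
}"""

data = json.loads(response_body)
flagged = [k for k, v in data.items() if v == "Yes"]
# flagged == ['toxic_result', 'obscene_result', 'insult_result']
```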

Use Case Scenarios

Social Media Platform

Filter toxic comments in real-time before they appear publicly. Flag content for human review when multiple categories are triggered.

Community Forums

Automatically moderate forum posts and threads. Create different moderation levels based on which toxicity categories are detected.

Customer Support

Monitor customer messages to identify abusive language directed at support staff. Escalate severe cases automatically.

Content Publishing

Screen user submissions before publication. Maintain brand safety by preventing inappropriate content from going live.

Implementation Example

Automated Moderation System

import requests

def moderate_content(user_text):
    # Call the emotion prediction API; passing params= lets requests
    # URL-encode the text instead of interpolating it raw into the URL
    url = "http://127.0.0.1:3200/textbased_emotion"
    response = requests.get(url, params={"text": user_text})
    
    # Parse results
    data = response.json()
    
    # Define moderation actions based on toxicity
    if data['inappropriate']:
        return "BLOCK", "Content violates all toxicity guidelines"
    
    # Check individual categories for specific actions
    high_risk_categories = [
        data['severe_toxic_result'],
        data['threat_result'],
        data['identity_hate_result']
    ]
    
    if 'Yes' in high_risk_categories:
        return "REVIEW", "High-risk content requires human review"
    
    medium_risk_categories = [
        data['toxic_result'],
        data['obscene_result'],
        data['insult_result']
    ]
    
    if 'Yes' in medium_risk_categories:
        return "FLAG", "Content flagged for potential issues"
    
    return "APPROVE", "Content is clean"

# Example usage
action, reason = moderate_content("This is a normal friendly message")
print(f"Action: {action}, Reason: {reason}")
# Output: Action: APPROVE, Reason: Content is clean

Multi-tier Moderation

def get_moderation_tier(results):
    """
    Assign moderation tier based on toxicity categories
    """
    # Tier 1: Immediate block (all categories flagged)
    if results['inappropriate']:
        return 1, "Immediate block - all categories flagged"
    
    # Tier 2: Severe content (3+ categories)
    flagged_count = sum([
        results['toxic_result'] == 'Yes',
        results['severe_toxic_result'] == 'Yes',
        results['obscene_result'] == 'Yes',
        results['threat_result'] == 'Yes',
        results['insult_result'] == 'Yes',
        results['identity_hate_result'] == 'Yes'
    ])
    
    if flagged_count >= 3:
        return 2, "High risk - multiple categories flagged"
    
    # Tier 3: Moderate content (1-2 categories)
    if flagged_count >= 1:
        return 3, "Moderate risk - requires review"
    
    # Tier 4: Clean content
    return 4, "Clean content"
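As a quick sanity check on the scheme above, counting flagged categories for the example response shown earlier (the dict mirrors that response):

```python
# Three of six categories are flagged, which the tier scheme above
# maps to tier 2 ("High risk - multiple categories flagged").
results = {"toxic_result": "Yes", "severe_toxic_result": "No",
           "obscene_result": "Yes", "threat_result": "No",
           "insult_result": "Yes", "identity_hate_result": "No",
           "inappropriate": False}

flagged_count = sum(v == "Yes" for k, v in results.items()
                    if k.endswith("_result"))
# flagged_count == 3
```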

Best Practices

Moderation Guidelines

  • Combine with human review: Use the API for initial filtering, but have humans review edge cases
  • Set appropriate thresholds: The default 0.29 threshold works well, but adjust based on your use case
  • Monitor false positives: Track incorrectly flagged content to improve your moderation workflow
  • Provide user feedback: Let users know why content was flagged and give them a chance to edit
  • Log all decisions: Keep records of moderation actions for transparency and appeals
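A minimal sketch of the logging practice from the last bullet (the JSONL path and record schema are illustrative, not part of the API):

```python
import json
import time

def log_decision(text, action, reason, path="moderation_log.jsonl"):
    """Append one moderation decision per line for audits and appeals."""
    record = {"timestamp": time.time(), "text": text,
              "action": action, "reason": reason}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```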

Next Steps

  • Learn how to use Sentiment Analysis for positive/negative emotion detection
  • Explore the API’s entity extraction features for dates, countries, and people names
