## Overview
Content moderation helps you:

- Filter inappropriate or harmful content
- Comply with content policies
- Protect users from offensive material
- Maintain platform guidelines
LibreChat's moderation tooling covers four areas:

- **OpenAI Moderation**: Use OpenAI's built-in moderation API
- **Custom Moderation**: Implement your own moderation rules
- **User Reports**: Handle user-reported content
- **Ban Management**: Manage banned users and content
## OpenAI Moderation
LibreChat can use OpenAI's Moderation API to automatically flag problematic content.

### Configuration
Enable moderation in your `.env` file. The available settings are:

- Enable OpenAI content moderation
- API key for moderation (uses `OPENAI_API_KEY` if not set)
- Optional reverse proxy URL for the moderation API
### Example Configuration
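A minimal sketch of the `.env` settings described above. The variable names shown here are the ones documented for LibreChat's OpenAI moderation support, but verify them against the `.env.example` shipped with your LibreChat version:

```bash
# .env

# Enable OpenAI content moderation
OPENAI_MODERATION=true

# API key for moderation (uses OPENAI_API_KEY if not set)
OPENAI_MODERATION_API_KEY=sk-your-key-here

# Optional reverse proxy URL for the moderation API
# OPENAI_MODERATION_REVERSE_PROXY=https://your-proxy.example.com/v1/moderations
```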
### How It Works

#### Flag Detection

The API checks each message for:
- Sexual content
- Hate speech
- Harassment
- Self-harm
- Violence
- Illegal activities
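A response from OpenAI's `POST /v1/moderations` endpoint reports a boolean per category plus a confidence score per category. The sketch below shows how such a response can be interpreted; the response object here is a hand-written sample with illustrative values, not live API output:

```javascript
// Sample shape of an OpenAI /v1/moderations response (illustrative values)
const result = {
  flagged: true,
  categories: {
    sexual: false,
    hate: false,
    harassment: true,
    'self-harm': false,
    violence: false,
  },
  category_scores: {
    sexual: 0.0001,
    hate: 0.02,
    harassment: 0.91,
    'self-harm': 0.0002,
    violence: 0.013,
  },
};

// Collect the names of all categories the API flagged
function flaggedCategories(result) {
  return Object.entries(result.categories)
    .filter(([, flagged]) => flagged)
    .map(([name]) => name);
}

console.log(flaggedCategories(result)); // [ 'harassment' ]
```

When `flagged` is `true`, the per-category booleans tell you which policy area triggered, and `category_scores` can be compared against your own thresholds for stricter or looser handling.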
### Moderation Categories

OpenAI's moderation API checks for these categories:

#### Sexual Content
Content meant to arouse sexual excitement, including:
- Explicit sexual descriptions
- Sexual acts
- Adult content
#### Hate Speech
Content that expresses, incites, or promotes hate based on:
- Race
- Gender
- Ethnicity
- Religion
- Nationality
- Sexual orientation
- Disability
#### Harassment
Content that promotes harassment or bullying of individuals or groups.
#### Self-Harm
Content that promotes, encourages, or depicts acts of self-harm, including:
- Suicide
- Cutting
- Eating disorders
#### Violence
Content that depicts or glorifies violence or celebrates suffering/humiliation.
#### Violence/Graphic
Violent content in graphic detail, including gore and death.
## Custom Moderation

Implement your own moderation logic by extending LibreChat's moderation middleware.

### Custom Moderation Rules
Create custom rules in `api/server/middleware/moderateContent.js`:
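The sketch below shows only the general Express-middleware shape such a file takes; the pattern list and rejection message are placeholders, not LibreChat's actual implementation:

```javascript
// Illustrative sketch of an Express-style moderation middleware.
// The blocked patterns and error message are hypothetical placeholders.
const blockedPatterns = [/\bspam\b/i, /\bscam\b/i];

function moderateContent(req, res, next) {
  const text = req.body?.text ?? '';
  const violation = blockedPatterns.find((pattern) => pattern.test(text));
  if (violation) {
    // Reject the request before it reaches the model
    return res.status(403).json({ message: 'Content violates moderation policy' });
  }
  // No rule matched: pass the request along the middleware chain
  return next();
}

module.exports = moderateContent;
```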
### Implementing Custom Checks
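One way to structure individual checks is as small functions that each return a violation descriptor or `null`, so new rules can be added without touching the runner. The rule names and limits below are invented for illustration:

```javascript
// Each check returns a violation descriptor or null.
// Rule names and thresholds are illustrative, not LibreChat defaults.
const checks = [
  (text) =>
    text.length > 10000
      ? { rule: 'max-length', detail: 'Message too long' }
      : null,
  (text) =>
    /(.)\1{20,}/.test(text)
      ? { rule: 'char-flood', detail: 'Repeated-character spam' }
      : null,
  (text) =>
    (text.match(/https?:\/\//g) || []).length > 5
      ? { rule: 'link-spam', detail: 'Too many links' }
      : null,
];

// Run every check and keep only the violations
function runChecks(text) {
  return checks.map((check) => check(text)).filter(Boolean);
}
```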
## User Reports

Allow users to report inappropriate content for review.

### Report Types
- Message Reports
- User Reports
- Agent Reports
Users can report individual messages that violate policies:

1. Click the report icon on the message
2. Select a violation category
3. Add an optional description
4. Submit the report
### Review Queue
Moderators can review reports in the admin dashboard:

1. **View Reports**: See all pending reports
2. **Review Content**: Examine flagged content and context
3. **Take Action**:
   - Dismiss the report
   - Delete the content
   - Warn the user
   - Ban the user
4. **Document Decision**: Add notes about the action taken
## Ban Management

Manage users who violate moderation policies.

### Temporary Bans

Set a ban with an expiration:

Temporary bans are useful for first-time or minor violations. Users are automatically unbanned after the duration expires.
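If your LibreChat version ships a `ban-user` config script, a temporary ban might look like the following. The script name, argument order, and duration format here are assumptions — check the scripts in your installation's `config/` directory before relying on them:

```bash
# Hypothetical usage of a ban-user config script; verify against the
# scripts shipped in your LibreChat installation's config/ directory.

# Temporary ban with an expiration (duration format may differ by version)
npm run ban-user user@example.com 7d
```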
### Permanent Bans

Permanently ban a user:

### View Banned Users

List all currently banned users:

### Unban Users

Remove a ban:

## Moderation Logging

All moderation actions are logged for audit purposes.

### Log Location

Moderation logs are stored in:

- `logs/moderation.log`: All moderation events
- `logs/moderation-errors.log`: Moderation system errors
### Log Format
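The exact schema depends on your LibreChat version and logger configuration; a structured log entry might look like this (all fields below are illustrative, not guaranteed):

```json
{
  "timestamp": "2024-05-01T12:34:56.789Z",
  "level": "warn",
  "event": "content_flagged",
  "userId": "6617f2...",
  "categories": ["harassment"],
  "action": "message_blocked"
}
```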
## Best Practices

**Layer Your Defenses**: Use both automated moderation (the OpenAI API) and custom rules for comprehensive coverage.

**Start Conservative**: Begin with strict moderation and relax rules based on your community's needs and maturity.
## Moderation Workflow
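The layered approach recommended above can be sketched as a pipeline that runs cheap custom rules first and falls back to the slower API check only when they pass. Everything below is an illustrative sketch, not LibreChat's actual implementation; `checkWithApi` stands in for a real call to OpenAI's `/v1/moderations` endpoint:

```javascript
// Illustrative layered moderation pipeline: local rules first, API second.
// `localRules` and `checkWithApi` are injected so the layers stay swappable.
async function moderate(text, { localRules, checkWithApi }) {
  // Layer 1: custom rules (cheap, synchronous)
  for (const rule of localRules) {
    if (rule.test(text)) {
      return { allowed: false, layer: 'custom-rule' };
    }
  }
  // Layer 2: automated moderation API (slower, broader coverage)
  const { flagged } = await checkWithApi(text);
  return flagged ? { allowed: false, layer: 'api' } : { allowed: true, layer: null };
}
```

Ordering the layers this way means most violations are caught without spending an API call, which also helps with the rate-limit and latency issues covered in Troubleshooting below.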
## Troubleshooting

### Moderation API rate limits exceeded
- Use a dedicated moderation API key
- Implement request queuing
- Cache moderation results for similar content
- Consider using a reverse proxy
### Too many false positives
- Adjust moderation thresholds
- Add whitelisted terms or patterns
- Review and update custom rules
- Implement appeal process for users
### Inappropriate content getting through
- Enable stricter moderation levels
- Add custom rules for specific issues
- Implement multi-layer moderation
- Review and update blocked terms list
### Moderation slowing down responses
- Use async moderation for non-critical content
- Cache moderation results
- Optimize custom rule checking
- Consider post-moderation for trusted users
## Related Documentation
- User Management - Ban and manage users
- Configuration - Moderation settings
- Monitoring - Track moderation metrics