AI auto-moderation runs every incoming message through Claude and scores it across three categories: toxicity, spam, and harassment. When a score crosses a configured threshold, the bot takes an action — from silently flagging the message for review all the way up to an immediate ban.
AI auto-moderation is disabled by default (enabled: false). Enable it only after reviewing your threshold and action settings.

How it works

1. Message received

Every non-bot message in the server is passed to the AI auto-mod pipeline. Messages shorter than 3 characters and messages from exempt roles are skipped.

2. Claude analyzes the message

The message is sent to Claude (claude-haiku-4-5 by default) with a structured prompt that asks for a score between 0.0 and 1.0 for each category (toxicity, spam, and harassment), plus a short explanation.

3. Thresholds checked

Each score is compared to its configured threshold. Categories whose scores meet or exceed the threshold are marked as triggered.

4. Action executed

The highest-priority action across all triggered categories is executed. Priority order (highest to lowest): ban > kick > timeout > warn / delete > flag.

5. Flag embed posted

If a flagChannelId is configured, a review embed is posted to that channel showing the scores, reason, and a jump link to the original message, regardless of what action was taken.
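
Step 2 can be pictured as parsing a small structured reply. The sketch below is illustrative only: it assumes the model replies with a JSON object holding the three scores and a `reason` string, which this page does not specify, and `parse_scores` is a hypothetical helper name rather than part of the bot.

```python
import json

CATEGORIES = ("toxicity", "spam", "harassment")


def parse_scores(reply_text):
    """Parse a model reply into per-category scores plus a reason.

    Assumption: the reply is a JSON object such as
    {"toxicity": 0.1, "spam": 0.0, "harassment": 0.9, "reason": "..."}.
    """
    data = json.loads(reply_text)
    scores = {}
    for category in CATEGORIES:
        value = float(data[category])
        # Clamp into [0.0, 1.0] so an out-of-range value cannot skew threshold checks.
        scores[category] = min(1.0, max(0.0, value))
    return {"scores": scores, "reason": str(data.get("reason", ""))}
```

A missing category raises `KeyError` here, which lets the caller skip moderation for that message instead of acting on partial scores.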

Categories and thresholds

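Thresholds accept a float between 0.0 and 1.0. Because a category triggers when its score meets or exceeds the threshold, 0.0 flags every analyzed message and 1.0 flags only a score of exactly 1.0 (effectively nothing). The defaults are intentionally conservative: a category triggers only on a high score.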
| Category | Default threshold | What it targets |
| --- | --- | --- |
| toxicity | 0.7 | Hate speech, slurs, extreme negativity targeting groups or individuals |
| spam | 0.8 | Repetitive content, ads, scam links, flooding |
| harassment | 0.7 | Targeted attacks, threats, bullying, doxxing |
"aiAutoMod": {
  "thresholds": {
    "toxicity": 0.7,
    "spam": 0.8,
    "harassment": 0.7
  }
}
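
As a rough illustration of the "meet or exceed" rule, the following sketch compares a set of scores against the default thresholds. `triggered_categories` is a hypothetical name, and treating a missing score as 0.0 is an assumption made for the example.

```python
DEFAULT_THRESHOLDS = {"toxicity": 0.7, "spam": 0.8, "harassment": 0.7}


def triggered_categories(scores, thresholds=DEFAULT_THRESHOLDS):
    """Return the categories whose score meets or exceeds its threshold."""
    # Assumption: a category with no score is treated as 0.0.
    return [
        category
        for category, threshold in thresholds.items()
        if scores.get(category, 0.0) >= threshold
    ]
```

With the defaults, a toxicity score of exactly 0.7 triggers, while a spam score of 0.79 does not.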

Actions

Configure an action for each category. When multiple categories are triggered, the highest-priority action wins.
| Action | Effect |
| --- | --- |
| flag | Post a review embed to flagChannelId only; no action against the user |
| delete | Delete the message |
| warn | Issue a warning and create a mod case |
| timeout | Temporarily mute the user for timeoutDurationMs milliseconds |
| kick | Remove the user from the server |
| ban | Permanently ban the user |
"aiAutoMod": {
  "actions": {
    "toxicity": "flag",
    "spam": "delete",
    "harassment": "warn"
  }
}
When autoDelete is true (the default), the offending message is always deleted before any other action is taken — even for flag-only categories.
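
A minimal sketch of how the winning action might be chosen. It assumes that warn and delete share one priority tier, with ties going to whichever triggered category comes first, and that when autoDelete is off only the delete action removes the message. The page does not state either point, and `resolve_action` is a hypothetical name.

```python
# Higher number = higher priority. warn and delete share a tier (assumption).
ACTION_PRIORITY = {"flag": 0, "warn": 1, "delete": 1, "timeout": 2, "kick": 3, "ban": 4}
DEFAULT_ACTIONS = {"toxicity": "flag", "spam": "delete", "harassment": "warn"}


def resolve_action(triggered, actions=DEFAULT_ACTIONS, auto_delete=True):
    """Pick the highest-priority action among the triggered categories."""
    if not triggered:
        return None
    candidates = [actions[category] for category in triggered]
    # max() keeps the first of several equally ranked candidates.
    chosen = max(candidates, key=lambda action: ACTION_PRIORITY[action])
    return {"action": chosen, "delete_message": auto_delete or chosen == "delete"}
```

For example, if toxicity and harassment both trigger under the default actions, warn outranks flag and is the action taken.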

Exempt roles

Users who hold any role listed in exemptRoleIds are skipped entirely. Use this to keep moderators and other trusted community roles out of auto-mod's reach. Bot messages are already skipped, so bots do not need an exempt role.
"aiAutoMod": {
  "exemptRoleIds": ["1234567890", "0987654321"]
}
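
The skip rules can be sketched as a single predicate. This is an illustration under assumptions: the 3-character check is applied to the raw message length, role IDs are compared as strings, and `should_skip` is a hypothetical helper name.

```python
def should_skip(content, author_is_bot, author_role_ids, exempt_role_ids):
    """Return True when a message should bypass AI auto-moderation."""
    if author_is_bot:
        return True
    # Assumption: length is measured on the raw content, without trimming.
    if len(content) < 3:
        return True
    # Holding any single exempt role is enough to be skipped.
    return not set(author_role_ids).isdisjoint(exempt_role_ids)
```

Because membership is checked by set intersection, the order of IDs in exemptRoleIds does not matter.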

Flag channel

Set flagChannelId to the ID of a private moderation channel. Every flagged message posts an embed there with:
  • Author mention and tag
  • Source channel link
  • Triggered categories
  • Full message content (up to 1024 characters)
  • Visual score bars for all three categories
  • Claude’s reasoning
  • A jump link to the original message
"aiAutoMod": {
  "flagChannelId": "1234567890"
}
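
Two of the embed fields lend themselves to a short sketch: the content cap and the score bars. The bar characters, bar width, and trailing ellipsis are assumptions chosen for illustration, and both function names are hypothetical. The 1024-character cap matches Discord's limit for an embed field value.

```python
def truncate_content(content, limit=1024):
    """Shorten content to at most `limit` characters."""
    if len(content) <= limit:
        return content
    # Reserve one character for the ellipsis so the result stays within the limit.
    return content[: limit - 1] + "…"


def score_bar(score, width=10):
    """Render a score in [0.0, 1.0] as a fixed-width text bar."""
    clamped = min(1.0, max(0.0, score))
    filled = round(clamped * width)
    return "█" * filled + "░" * (width - filled) + f" {score:.2f}"
```

For instance, `score_bar(0.7)` produces seven filled cells, three empty cells, and the label `0.70`.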

Full config reference

"aiAutoMod": {
  "enabled": false,
  "model": "claude-haiku-4-5",
  "thresholds": {
    "toxicity": 0.7,
    "spam": 0.8,
    "harassment": 0.7
  },
  "actions": {
    "toxicity": "flag",
    "spam": "delete",
    "harassment": "warn"
  },
  "timeoutDurationMs": 300000,
  "flagChannelId": null,
  "autoDelete": true,
  "exemptRoleIds": []
}
| Config key | Default | Description |
| --- | --- | --- |
| enabled | false | Enable or disable AI auto-moderation |
| model | claude-haiku-4-5 | Claude model used for analysis |
| thresholds.toxicity | 0.7 | Score threshold for toxicity |
| thresholds.spam | 0.8 | Score threshold for spam |
| thresholds.harassment | 0.7 | Score threshold for harassment |
| actions.toxicity | flag | Action when toxicity threshold is met |
| actions.spam | delete | Action when spam threshold is met |
| actions.harassment | warn | Action when harassment threshold is met |
| timeoutDurationMs | 300000 | Timeout duration in ms (default: 5 minutes) |
| flagChannelId | null | Channel ID to post review embeds |
| autoDelete | true | Always delete the flagged message |
| exemptRoleIds | [] | Role IDs exempt from auto-mod checks |
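
To show how the defaults and a partial override could fit together, here is a hedged sketch that layers user settings over the documented defaults and rejects out-of-range thresholds or unknown actions. Whether the bot merges partial configs this way is an assumption, and `load_ai_auto_mod` is a hypothetical name.

```python
import copy

DEFAULTS = {
    "enabled": False,
    "model": "claude-haiku-4-5",
    "thresholds": {"toxicity": 0.7, "spam": 0.8, "harassment": 0.7},
    "actions": {"toxicity": "flag", "spam": "delete", "harassment": "warn"},
    "timeoutDurationMs": 300000,
    "flagChannelId": None,
    "autoDelete": True,
    "exemptRoleIds": [],
}
VALID_ACTIONS = {"flag", "delete", "warn", "timeout", "kick", "ban"}


def load_ai_auto_mod(overrides):
    """Merge overrides onto the defaults and validate the result."""
    config = copy.deepcopy(DEFAULTS)  # never mutate the shared defaults
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(config.get(key), dict):
            config[key].update(value)  # merge nested sections key by key
        else:
            config[key] = value
    for category, threshold in config["thresholds"].items():
        if not 0.0 <= threshold <= 1.0:
            raise ValueError(f"threshold for {category} must be within 0.0-1.0")
    for category, action in config["actions"].items():
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action for {category}: {action}")
    return config
```

Overriding only `thresholds.spam` leaves the other two thresholds at their defaults, and the default `timeoutDurationMs` of 300000 works out to 300000 / 60000 = 5 minutes.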
