AI auto-moderation

AI auto-moderation runs every incoming message through Claude and scores it across three categories: toxicity, spam, and harassment. When a score crosses a configured threshold, the bot takes an action — from silently flagging the message for review all the way up to an immediate ban.

AI auto-moderation is disabled by default (enabled: false). Enable it only after reviewing your threshold and action settings.

How it works

Message received

Every non-bot message in the server is passed to the AI auto-mod pipeline. Messages shorter than 3 characters and messages from exempt roles are skipped.

Claude analyzes the message

The message is sent to Claude (claude-haiku-4-5 by default) with a structured prompt that asks for a score between 0.0 and 1.0 for each category — toxicity, spam, and harassment — plus a short explanation.

Thresholds checked

Each score is compared to its configured threshold. Categories whose scores meet or exceed the threshold are marked as triggered.

Action executed

The highest-priority action across all triggered categories is executed. Priority order (highest to lowest): ban → kick → timeout → warn / delete → flag.

Flag embed posted

If a flagChannelId is configured, a review embed is posted to that channel showing the scores, reason, and a jump link to the original message — regardless of what action was taken.

Categories and thresholds

Thresholds accept a float between 0.0 (flag everything) and 1.0 (flag nothing). The defaults are intentionally conservative.

Category	Default threshold	What it targets
`toxicity`	`0.7`	Hate speech, slurs, extreme negativity targeting groups or individuals
`spam`	`0.8`	Repetitive content, ads, scam links, flooding
`harassment`	`0.7`	Targeted attacks, threats, bullying, doxxing

"aiAutoMod": {
  "thresholds": {
    "toxicity": 0.7,
    "spam": 0.8,
    "harassment": 0.7
  }
}

Actions

Configure an action for each category. When multiple categories are triggered, the highest-priority action wins.

Action	Effect
`flag`	Post a review embed to `flagChannelId` only — no action against the user
`delete`	Delete the message
`warn`	Issue a warning and create a mod case
`timeout`	Temporarily mute the user for `timeoutDurationMs` milliseconds
`kick`	Remove the user from the server
`ban`	Permanently ban the user

"aiAutoMod": {
  "actions": {
    "toxicity": "flag",
    "spam": "delete",
    "harassment": "warn"
  }
}

When autoDelete is true (the default), the offending message is always deleted before any other action is taken — even for flag-only categories.

Exempt roles

Users who hold any role listed in exemptRoleIds are skipped entirely. Use this to protect moderators, bots, or trusted community roles from auto-mod interference.

"aiAutoMod": {
  "exemptRoleIds": ["1234567890", "0987654321"]
}

Flag channel

Set flagChannelId to the ID of a private moderation channel. Every flagged message posts an embed there with:

Author mention and tag
Source channel link
Triggered categories
Full message content (up to 1024 characters)
Visual score bars for all three categories
Claude’s reasoning
A jump link to the original message

"aiAutoMod": {
  "flagChannelId": "1234567890"
}

Full config reference

"aiAutoMod": {
  "enabled": false,
  "model": "claude-haiku-4-5",
  "thresholds": {
    "toxicity": 0.7,
    "spam": 0.8,
    "harassment": 0.7
  },
  "actions": {
    "toxicity": "flag",
    "spam": "delete",
    "harassment": "warn"
  },
  "timeoutDurationMs": 300000,
  "flagChannelId": null,
  "autoDelete": true,
  "exemptRoleIds": []
}

Config key	Default	Description
`enabled`	`false`	Enable or disable AI auto-moderation
`model`	`claude-haiku-4-5`	Claude model used for analysis
`thresholds.toxicity`	`0.7`	Score threshold for toxicity
`thresholds.spam`	`0.8`	Score threshold for spam
`thresholds.harassment`	`0.7`	Score threshold for harassment
`actions.toxicity`	`flag`	Action when toxicity threshold is met
`actions.spam`	`delete`	Action when spam threshold is met
`actions.harassment`	`warn`	Action when harassment threshold is met
`timeoutDurationMs`	`300000`	Timeout duration in ms (default: 5 minutes)
`flagChannelId`	`null`	Channel ID to post review embeds
`autoDelete`	`true`	Always delete the flagged message
`exemptRoleIds`	`[]`	Role IDs exempt from auto-mod checks

Get Started

Configuration

Features

Dashboard

Commands

Self-Hosting

How it works

Categories and thresholds

Actions

Exempt roles

Flag channel

Full config reference

Build docs developers (and LLMs) love

Get Started

Configuration

Features

Dashboard

Commands

Self-Hosting

​How it works

​Categories and thresholds

​Actions

​Exempt roles

​Flag channel

​Full config reference

Build docs developers (and LLMs) love

How it works

Categories and thresholds

Actions

Exempt roles

Flag channel

Full config reference