GemAI uses Google’s safety settings to filter harmful content in AI responses. You can customize these settings to balance safety and flexibility based on your app’s requirements.

Default safety settings

By default, GemAI configures minimal content filtering:
ModelBuilder.kt
```kotlin
private var safetySettings = listOf(
    SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.NONE)
)
```
The default setting uses BlockThreshold.NONE for sexually explicit content, which disables filtering for that category; categories not listed fall back to the API's defaults. Configure safety settings appropriate to your use case before shipping.

Configuring safety settings

Use the setSafetySettings() method to customize content filtering:
ModelBuilder.kt
```kotlin
fun setSafetySettings(settings: List<SafetySetting>) = apply {
    safetySettings = settings
}
```

Example: Comprehensive safety configuration

```kotlin
import com.google.ai.client.generativeai.type.HarmCategory
import com.google.ai.client.generativeai.type.BlockThreshold
import com.google.ai.client.generativeai.type.SafetySetting

val safeModel = ModelBuilder.Builder()
    .setApiKey(apiKey)
    .setModel(AIModel.GEMINI_1_5_FLASH.modelName)
    .setSafetySettings(listOf(
        SafetySetting(HarmCategory.HARASSMENT, BlockThreshold.MEDIUM_AND_ABOVE),
        SafetySetting(HarmCategory.HATE_SPEECH, BlockThreshold.MEDIUM_AND_ABOVE),
        SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.MEDIUM_AND_ABOVE),
        SafetySetting(HarmCategory.DANGEROUS_CONTENT, BlockThreshold.MEDIUM_AND_ABOVE)
    ))
    .build()
```

Harm categories

The Gemini API provides four harm categories for content classification:
  • HarmCategory.HARASSMENT: Content that is rude, disrespectful, or profane. Includes bullying, intimidation, and targeted harassment.
  • HarmCategory.HATE_SPEECH: Content that promotes or incites hatred against individuals or groups based on protected attributes (race, religion, gender, etc.).
  • HarmCategory.SEXUALLY_EXPLICIT: Content that contains explicit sexual references or descriptions. Includes pornographic content.
  • HarmCategory.DANGEROUS_CONTENT: Content that promotes, facilitates, or encourages harmful acts. Includes violence, self-harm, and illegal activities.

Block thresholds

Each harm category can be assigned a blocking threshold:
  • BlockThreshold.NONE: Allow all content regardless of harm probability. No filtering applied.
  • BlockThreshold.LOW_AND_ABOVE: Block content with a low, medium, or high probability of harm. The most restrictive setting.
  • BlockThreshold.MEDIUM_AND_ABOVE: Block content with a medium or high probability of harm. Recommended for most applications.
  • BlockThreshold.ONLY_HIGH: Block only content with a high probability of harm. More permissive; suitable for mature audiences.

Safety setting examples

```kotlin
val familyFriendlySettings = listOf(
    SafetySetting(HarmCategory.HARASSMENT, BlockThreshold.LOW_AND_ABOVE),
    SafetySetting(HarmCategory.HATE_SPEECH, BlockThreshold.LOW_AND_ABOVE),
    SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.LOW_AND_ABOVE),
    SafetySetting(HarmCategory.DANGEROUS_CONTENT, BlockThreshold.LOW_AND_ABOVE)
)

val model = ModelBuilder.Builder()
    .setApiKey(apiKey)
    .setModel(AIModel.GEMINI_1_5_FLASH.modelName)
    .setSafetySettings(familyFriendlySettings)
    .build()
```

How safety settings work

Safety settings are passed to the GenerativeModel during initialization:
ModelBuilder.kt
```kotlin
fun build(): GenerativeModel {
    require(apiKey.isNotBlank()) { "API key must not be blank" }
    require(modelName.isNotBlank()) { "Model name must not be blank" }
    return GenerativeModel(
        apiKey = apiKey,
        modelName = modelName,
        generationConfig = generationConfig,
        safetySettings = safetySettings,
        systemInstruction = getSystemInstruction(),
    )
}
```
The Gemini API evaluates each response against your configured safety settings. If content exceeds the threshold for any harm category, the API blocks the response and returns a safety rating instead.

Handling blocked content

When content is blocked, the Gemini API returns:
  • A safety rating indicating which category triggered the block
  • The probability level (low, medium, high) for each harm category
  • No generated content
You should implement error handling to gracefully inform users when content is blocked, without revealing the specific harm category.
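One way to do this is to fall back to a generic message whenever a response is blocked. This is a sketch; the exception types come from the Google AI Kotlin SDK (com.google.ai.client.generativeai) and may differ across SDK versions:

```kotlin
import com.google.ai.client.generativeai.GenerativeModel
import com.google.ai.client.generativeai.type.PromptBlockedException
import com.google.ai.client.generativeai.type.ResponseStoppedException

// Generic user-facing message; deliberately does not reveal the harm category.
const val BLOCKED_MESSAGE = "Sorry, we can't generate a response for that request."

suspend fun generateSafely(model: GenerativeModel, prompt: String): String =
    try {
        // text can be null when the candidate was blocked without an exception.
        model.generateContent(prompt).text ?: BLOCKED_MESSAGE
    } catch (e: PromptBlockedException) {
        // The prompt itself triggered a safety block.
        BLOCKED_MESSAGE
    } catch (e: ResponseStoppedException) {
        // Generation stopped mid-response, e.g. for a SAFETY finish reason.
        BLOCKED_MESSAGE
    }
```

Log the blocked category server-side if you need it for monitoring, but keep it out of the user-facing message.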

Choosing the right settings

Use this guide to select appropriate safety settings:
| Application type | Recommended threshold | Rationale |
| --- | --- | --- |
| Kids’ apps | LOW_AND_ABOVE | Maximum protection for young users |
| General audience | MEDIUM_AND_ABOVE | Balanced safety for most users |
| Educational tools | MEDIUM_AND_ABOVE | Protect students while allowing academic discussions |
| Professional tools | ONLY_HIGH | Allow technical discussions while blocking clearly harmful content |
| Research/analysis | NONE (with warnings) | Minimal filtering for analyzing sensitive topics |
Using BlockThreshold.NONE means no content filtering. Only use this setting if your app has appropriate user warnings and is intended for mature, consenting audiences.
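The guidance above can be expressed as a small helper. This is a sketch; the AudienceType enum and safetySettingsFor function are hypothetical, not part of GemAI:

```kotlin
import com.google.ai.client.generativeai.type.BlockThreshold
import com.google.ai.client.generativeai.type.HarmCategory
import com.google.ai.client.generativeai.type.SafetySetting

// Hypothetical audience categories mirroring the table above.
enum class AudienceType { KIDS, GENERAL, EDUCATIONAL, PROFESSIONAL, RESEARCH }

fun safetySettingsFor(audience: AudienceType): List<SafetySetting> {
    val threshold = when (audience) {
        AudienceType.KIDS -> BlockThreshold.LOW_AND_ABOVE
        AudienceType.GENERAL, AudienceType.EDUCATIONAL -> BlockThreshold.MEDIUM_AND_ABOVE
        AudienceType.PROFESSIONAL -> BlockThreshold.ONLY_HIGH
        AudienceType.RESEARCH -> BlockThreshold.NONE // pair with explicit user warnings
    }
    return listOf(
        HarmCategory.HARASSMENT,
        HarmCategory.HATE_SPEECH,
        HarmCategory.SEXUALLY_EXPLICIT,
        HarmCategory.DANGEROUS_CONTENT,
    ).map { category -> SafetySetting(category, threshold) }
}
```

Centralizing the mapping in one place makes it easier to audit and adjust as your app's audience changes.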

Best practices

1. Start conservative. Begin with stricter settings (MEDIUM_AND_ABOVE) and adjust based on user feedback and your app’s needs.
2. Test thoroughly. Test with various prompts, including edge cases, to understand how your settings affect responses.
3. Provide user controls. Consider allowing users to adjust safety settings (within acceptable ranges) based on their preferences.
4. Monitor and iterate. Track blocked content and user reports to refine your safety configuration over time.

Compliance considerations

Safety settings help protect users but don’t guarantee complete filtering. Consider these additional measures:
  • Age verification for apps targeting specific age groups
  • Terms of service that outline acceptable use
  • User reporting mechanisms for problematic content
  • Human moderation for high-risk applications
  • Legal compliance with regulations like COPPA, GDPR, etc.

Troubleshooting

Too much content is blocked

If legitimate responses are being blocked:
  • Lower the threshold to MEDIUM_AND_ABOVE or ONLY_HIGH
  • Review your system prompts; they might be eliciting flagged content
  • Test with different models (Pro models may handle edge cases better)
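For example, relaxing every category to ONLY_HIGH (you can also relax only the category that is over-triggering):

```kotlin
val permissiveSettings = listOf(
    SafetySetting(HarmCategory.HARASSMENT, BlockThreshold.ONLY_HIGH),
    SafetySetting(HarmCategory.HATE_SPEECH, BlockThreshold.ONLY_HIGH),
    SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.ONLY_HIGH),
    SafetySetting(HarmCategory.DANGEROUS_CONTENT, BlockThreshold.ONLY_HIGH)
)
```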

Harmful content is getting through

If inappropriate content appears:
  • Increase the threshold to LOW_AND_ABOVE
  • Add explicit safety instructions to your system prompts
  • Implement additional client-side filtering
  • Report issues to Google’s safety team
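A minimal client-side post-filter might look like the sketch below. The blocklist here is illustrative only; real moderation needs a vetted word list or a dedicated moderation service:

```kotlin
// Illustrative only: a naive substring blocklist as a second line of defense
// layered after the API's safety settings. Not a substitute for real moderation.
val blocklist = setOf("example-banned-term")

fun passesClientFilter(text: String): Boolean =
    blocklist.none { term -> text.contains(term, ignoreCase = true) }
```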
