Moderation

ChatBotKit comes with advanced content moderation features that are essential for maintaining the integrity and safety of bot-user interactions. By utilizing these features, developers can ensure that the content generated by and for their bots remains respectful, safe, and free from harmful language.

Features

Content Scanning: Once content moderation is enabled, all incoming and outgoing content is scanned.
Language Detection: The system can recognize harmful, hateful, and other types of inappropriate language.
Automatic Refusal: If flagged content is detected, the bot will automatically refuse to respond, ensuring that harmful content doesn't get propagated.

Enabling Content Moderation

To enable content moderation for your bots and integrations:

Go to the Bot Advanced Settings.
Toggle the Moderation switch to ON.

Remember: Once enabled, all content, both incoming and outgoing, is subject to content moderation.

How it Works

When a user sends a message to the bot, ChatBotKit will scan the content before processing.
If inappropriate language or content is detected, the message will be flagged.
The bot will not process flagged content and will instead send a default refusal message. This ensures that inappropriate prompts don't result in any undesired responses.
You can view flagged content in your conversations.
You will receive an email notification if a message is flagged for moderation.
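
The flow above can be sketched in a few lines of Python. This is only an illustration of the scan, flag, and refuse sequence, not ChatBotKit's implementation: the ModeratedBot class, the is_flagged function, the BLOCKED_TERMS list, and the refusal text are all hypothetical names made up for this example, and the keyword check is a stand-in for a real moderation classifier. Email notification is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical refusal text; the actual default message is not stated here.
REFUSAL_MESSAGE = "Sorry, I can't help with that."

# Hypothetical stand-in for a real moderation classifier.
BLOCKED_TERMS = {"badword"}


def is_flagged(text: str) -> bool:
    """Return True if any word in the text matches a blocked term."""
    words = text.lower().split()
    return any(word.strip(".,!?") in BLOCKED_TERMS for word in words)


@dataclass
class ModeratedBot:
    # Function that produces the bot's answer for an accepted message.
    generate: Callable[[str], str]
    # Flagged content is kept so it can be reviewed later.
    flagged: List[str] = field(default_factory=list)

    def reply(self, message: str) -> str:
        # Scan incoming content before it is processed.
        if is_flagged(message):
            self.flagged.append(message)
            return REFUSAL_MESSAGE

        response = self.generate(message)

        # Scan outgoing content before it reaches the user.
        if is_flagged(response):
            self.flagged.append(response)
            return REFUSAL_MESSAGE

        return response


bot = ModeratedBot(generate=lambda message: f"You said: {message}")
print(bot.reply("hello"))
print(bot.reply("this is a badword!"))
print(bot.flagged)
```

Checking the message before calling generate means a flagged prompt never reaches the model, which matches the behavior described above. The second check covers outgoing content in the same way.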
