Proactive Slack Incident Responder
A dual-agent architecture where a Monitor Agent detects incidents and proactively initiates Slack DMs with on-call engineers, while a Response Agent handles the ongoing conversation, provides context, and coordinates resolution.
The Proactive Slack Incident Responder demonstrates how AI agents can autonomously detect problems and reach out to the right people at the right time. Instead of waiting for engineers to notice alerts or check dashboards, this system brings critical information directly to them via Slack DM—and then helps them through the resolution process.
The Monitor Agent is the watchful eye. It can be triggered by webhooks from monitoring systems, scheduled to check metrics, or invoked when anomalies are detected. When it identifies an incident that needs human attention, it uses the slack/conversation/start ability to initiate a DM with the appropriate on-call engineer. Crucially, it passes rich context about the incident—what's happening, what's affected, relevant metrics, and suggested first steps.
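As a rough illustration of the "rich context" the Monitor passes along, the sketch below renders an incident into a plain-text summary that could serve as both the DM body and the handoff context. The `Incident` class, its field names, and the sample values are hypothetical and are not part of the blueprint or of any ChatBotKit API.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    title: str
    affected: list[str]
    metrics: dict[str, str]
    first_steps: list[str] = field(default_factory=list)


def build_context(incident: Incident) -> str:
    """Render the incident as plain text: what is happening, what is affected,
    relevant metrics, and suggested first steps."""
    lines = [f"Incident: {incident.title}"]
    lines.append("Affected: " + ", ".join(incident.affected))
    lines.append("Metrics:")
    lines.extend(f"  {name}: {value}" for name, value in incident.metrics.items())
    if incident.first_steps:
        lines.append("Suggested first steps:")
        lines.extend(f"  {i}. {step}" for i, step in enumerate(incident.first_steps, 1))
    return "\n".join(lines)


# Sample data, invented for the example.
incident = Incident(
    title="Elevated 5xx rate on checkout-api",
    affected=["checkout-api", "payments-worker"],
    metrics={"error_rate": "7.2%", "p95_latency": "2.4s"},
    first_steps=["Check the latest deploy", "Inspect upstream database health"],
)
context = build_context(incident)
```

Keeping the summary as a single string means the same text can be shown to the engineer and stored for the Response Agent without a second formatting step.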
The Response Agent is the helpful partner. It's connected directly to the Slack integration and handles all replies from the engineer. When someone responds to an incident DM, this agent takes over. It has access to the context provided by the Monitor (stored as an activity in the conversation) and can help the engineer investigate, provide additional information, run diagnostics, or coordinate with other systems.
Traditional alerting is one-way: a system fires an alert, and humans must figure out what to do. This architecture creates two-way conversations that start at the moment of detection.
- It ensures the right person knows immediately. The Monitor can look up on-call schedules, understand incident severity, and route to the appropriate engineer—no alert fatigue from broadcast channels.
- It provides context upfront. Instead of an engineer getting a cryptic alert and then spending 10 minutes gathering context, the DM arrives with everything they need to start investigating immediately.
- It offers ongoing assistance. The Response Agent can answer questions, run additional checks, fetch logs, or even execute runbook steps—all within the same Slack thread where the incident was reported.
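The routing idea from the first point above (look up the on-call schedule, weigh severity, pick one person) might look something like the following. The schedule dictionary, the Slack-style IDs, and the severity policy are all assumptions made for the example; a real deployment would read them from an on-call scheduler.

```python
from typing import Optional

# Hypothetical on-call schedule: service name -> ordered list of engineer IDs
# (primary first). In practice this would come from an on-call scheduler.
ON_CALL = {
    "checkout-api": ["U_PRIMARY", "U_BACKUP"],
    "search": ["U_SEARCH"],
}

# Severities that justify interrupting a person with a DM (assumed policy).
PAGE_SEVERITIES = {"critical", "high"}


def pick_recipient(service: str, severity: str) -> Optional[str]:
    """Return the primary on-call ID for a service, or None when no DM should be sent."""
    if severity not in PAGE_SEVERITIES:
        return None  # low-severity incidents do not page anyone
    rota = ON_CALL.get(service, [])
    return rota[0] if rota else None
```

Returning `None` for low-severity or unowned incidents lets the caller decide on a fallback instead of broadcasting to a channel by default.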
This blueprint showcases the slack/conversation/start ability, which enables agents to initiate conversations rather than just respond to them. The pattern works like this:
- The Monitor Agent detects an incident, determines the right person to notify, and crafts a contextual message.
- It calls slack/conversation/start with the engineer's channel ID.
- The DM arrives in Slack, starting a new conversation.
- When the engineer replies, the Response Agent takes over and the conversation continues until resolution.
The context parameter transfers knowledge from the Monitor to the Response Agent, ensuring continuity across the handoff.
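The handoff can be modelled in memory to see how the context travels with the conversation. This is only a simulation: `FakeSlack`, `start_conversation`, and `response_agent_reply` are hypothetical names standing in for the Slack integration and the two agents, and the real slack/conversation/start ability may take different parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    channel_id: str
    context: str  # knowledge handed over by the Monitor
    messages: list[tuple[str, str]] = field(default_factory=list)


class FakeSlack:
    """In-memory stand-in for the Slack integration; not a real client."""

    def __init__(self) -> None:
        self.conversations: dict[str, Conversation] = {}

    def start_conversation(self, channel_id: str, message: str, context: str) -> Conversation:
        # Plays the role of slack/conversation/start: open a DM and attach context.
        convo = Conversation(channel_id=channel_id, context=context)
        convo.messages.append(("monitor", message))
        self.conversations[channel_id] = convo
        return convo


def response_agent_reply(slack: FakeSlack, channel_id: str, text: str) -> str:
    """Handle an engineer's reply using the context stored on the conversation."""
    convo = slack.conversations[channel_id]
    convo.messages.append(("engineer", text))
    answer = f"Context on file: {convo.context}. You asked: {text}"
    convo.messages.append(("response", answer))
    return answer


# Monitor opens the DM; the Response Agent handles the engineer's reply.
slack = FakeSlack()
slack.start_conversation("D123", "checkout-api is returning 5xx errors", context="error_rate=7.2%")
reply = response_agent_reply(slack, "D123", "what changed recently?")
```

The point of the sketch is that the Response Agent never talks to the Monitor directly; everything it knows about the incident arrives through the context attached to the conversation.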
Use cases for this pattern include:
- Incident Response: Detect outages, performance issues, or errors and immediately reach out to on-call engineers with full context.
- Anomaly Investigation: When metrics deviate from normal, proactively engage the relevant team member to investigate before it becomes critical.
- Deployment Monitoring: After deployments, watch for issues and notify the deploying engineer if problems arise—they have the most context.
- Security Alerts: When suspicious activity is detected, immediately engage the security team with relevant details and investigation tools.
- SLA Management: When approaching SLA thresholds, proactively alert the account owner and offer assistance in resolving the issue.
To extend this blueprint, add abilities to the Response Agent for running diagnostics, fetching logs, checking related services, or executing runbook steps. Integrate with your incident management system to automatically create tickets and track resolution. Connect to your on-call scheduler to always route to the current on-call engineer. You can also add escalation logic—if the first engineer doesn't respond within a timeout, the Monitor can initiate a new conversation with their backup.
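The escalation idea can be sketched as a loop over an ordered chain of engineers. The function below is a minimal illustration under assumed interfaces: `notify` and `wait_for_ack` are hypothetical callbacks (for example, starting a new DM and polling for a reply), and the 300-second default timeout is an arbitrary placeholder.

```python
from typing import Callable, Optional, Sequence


def escalate(
    chain: Sequence[str],
    notify: Callable[[str], None],
    wait_for_ack: Callable[[str, float], bool],
    timeout_s: float = 300.0,
) -> Optional[str]:
    """Notify each engineer in order; return the first who acknowledges, else None."""
    for engineer in chain:
        notify(engineer)  # e.g. initiate a new conversation with this engineer
        if wait_for_ack(engineer, timeout_s):  # True if they replied within the timeout
            return engineer
    return None


# Usage with stubbed callbacks: the primary never answers, the backup does.
notified: list[str] = []
owner = escalate(
    chain=["U_PRIMARY", "U_BACKUP"],
    notify=notified.append,
    wait_for_ack=lambda engineer, timeout: engineer == "U_BACKUP",
)
```

Passing the notification and acknowledgement steps in as callbacks keeps the escalation policy separate from how messages are actually delivered, which also makes it easy to test without Slack.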
Backstory
Common information about the bot's experience, skills and personality. For more information, see the Backstory documentation.
Skillset
This example uses a dedicated Skillset. Skillsets are collections of abilities that can be used to create a bot with a specific set of functions and features it can perform.
- Start Slack DM: Initiates a new Slack DM conversation with an engineer
- Search for Solutions: Search the web for known issues and solutions
- Fetch Documentation: Fetch runbook or documentation content
- Search Solutions: Search for known issues and solutions
- Fetch Documentation: Fetch runbook or documentation content
- Run Command: Execute shell commands for diagnostics and investigation
- Read/Write Files: Read or write files in the diagnostics workspace
Terraform Code
This blueprint can be deployed using Terraform, enabling infrastructure-as-code management of your ChatBotKit resources. Use the code below to recreate this example in your own environment.