Incident Management

Learn how to effectively communicate service disruptions to your customers through incidents.

Creating an Incident

When a service disruption occurs, navigate to Incidents and click Report Incident. You'll need to provide:

  • Title - A brief description of the issue (e.g., "API response times elevated")
  • Status - The current state of your investigation
  • Impact - How severely the issue affects users
  • Message - Detailed information about what's happening and what you're doing
  • Affected Components - Which services are impacted

When you create an incident, subscribers are automatically notified based on their notification preferences.
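
As a rough mental model, the fields above can be pictured as a small record. The sketch below is illustrative only: the class name, field names, and lowercase values are assumptions, not the product's actual API or data format.

```python
from dataclasses import dataclass, field

# Allowed values, taken from the statuses and impact levels described in this guide.
STATUSES = ("investigating", "identified", "monitoring", "resolved")
IMPACTS = ("none", "minor", "major", "critical")


@dataclass
class IncidentReport:
    """Hypothetical record mirroring the Report Incident form."""
    title: str
    status: str
    impact: str
    message: str
    affected_components: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject empty titles and unknown status or impact values.
        if not self.title.strip():
            raise ValueError("title is required")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.impact not in IMPACTS:
            raise ValueError(f"unknown impact: {self.impact}")


report = IncidentReport(
    title="API response times elevated",
    status="investigating",
    impact="minor",
    message="We are looking into elevated API latency.",
    affected_components=["API"],
)
```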

Incident Statuses

Incidents follow a structured workflow with four statuses:

Investigating

You're aware of an issue and actively looking into it. This is the initial status for most incidents.

Identified

You've found the root cause and are working on a fix. This status tells customers that you understand the problem.

Monitoring

A fix has been implemented and you're watching to ensure the issue doesn't recur.

Resolved

The incident is fully resolved. Affected components return to their normal status.

Status Transitions

Statuses can only move forward in the workflow (Investigating → Identified → Monitoring → Resolved). The only exception is reopening a resolved incident, which returns it to Investigating.
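
The transition rule can be sketched as a small check. This is a minimal illustration, not the product's implementation: the function name and lowercase status strings are hypothetical, and it assumes that skipping ahead (for example, Investigating straight to Monitoring) counts as moving forward, which this guide does not state explicitly.

```python
# Statuses in workflow order.
WORKFLOW = ["investigating", "identified", "monitoring", "resolved"]


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` follows the workflow rule."""
    current_pos = WORKFLOW.index(current)
    new_pos = WORKFLOW.index(new)
    # Reopening: a resolved incident may only return to investigating.
    if current == "resolved":
        return new == "investigating"
    # Otherwise the status may only move forward (assumption: skipping is allowed).
    return new_pos > current_pos
```

Under these assumptions, `can_transition("monitoring", "identified")` is rejected, while `can_transition("resolved", "investigating")` is accepted as a reopen.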

Impact Levels

Impact levels help customers understand the severity of an incident:

Level     Description                            Example
None      Informational only, no user impact     Planned infrastructure changes
Minor     Some users may experience issues       Elevated latency, slow responses
Major     Significant functionality unavailable  API errors, feature outages
Critical  Complete service unavailability        Full outage, data unavailable
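
Because the four levels form an ordered scale, they map naturally onto an ordered enumeration. The sketch below is an assumption-laden illustration: the `Impact` enum, its numeric values, and the `worst_impact` helper are hypothetical names, not part of the product.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Impact levels ordered from least to most severe (numeric values are arbitrary)."""
    NONE = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def worst_impact(impacts):
    """Return the most severe level in `impacts`, or Impact.NONE if it is empty."""
    return max(impacts, default=Impact.NONE)
```

A helper like this could be used, for instance, to summarize several open incidents by their most severe level.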

Adding Timeline Updates

As you work on resolving an incident, add updates to keep customers informed. Each update includes:

  • New Status - Optionally change the incident status
  • Message - What's changed since the last update

Updates are displayed chronologically on your public status page, creating a timeline of the incident. Subscribers receive notifications for each update.
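
One way to picture the timeline is as a list of updates sorted by time, where the incident's current status is the most recent status change. The sketch below is hypothetical: the names are invented for illustration, and it assumes "chronologically" means oldest first.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TimelineUpdate:
    """Hypothetical timeline entry: a message plus an optional status change."""
    message: str
    posted_at: datetime
    new_status: Optional[str] = None  # None means the status is unchanged


def build_timeline(updates):
    """Return updates in chronological order (assumption: oldest first)."""
    return sorted(updates, key=lambda update: update.posted_at)


def current_status(initial_status, updates):
    """Walk the timeline and return the status set by the latest status change."""
    status = initial_status
    for update in build_timeline(updates):
        if update.new_status is not None:
            status = update.new_status
    return status
```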

💡 Best Practice

Post updates every 30-60 minutes during active incidents, even if there's no new information. A simple "We're still investigating" reassures customers you're actively working on the issue.
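
If you want a reminder when an update is due, the cadence above reduces to a simple time comparison. This is a sketch only: the function name and the 60-minute default (the upper end of the suggested range) are assumptions, and nothing here reflects a built-in product feature.

```python
from datetime import datetime, timedelta, timezone

# Upper end of the suggested 30-60 minute cadence.
MAX_SILENCE = timedelta(minutes=60)


def update_overdue(last_update_at, now, max_silence=MAX_SILENCE):
    """Return True if more than `max_silence` has passed since the last update."""
    return now - last_update_at > max_silence
```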

Incident Templates

For common incident types, you can create templates to speed up incident creation. Templates pre-fill the title, impact level, message, and affected components.

When creating a new incident, select a template from the dropdown to apply it. You can still modify any fields before submitting.
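
Conceptually, applying a template means pre-filling a draft and then letting your edits take precedence. The following sketch illustrates that idea with plain dictionaries. The field names and the `apply_template` helper are hypothetical and are not the product's API.

```python
import copy

# The fields a template pre-fills, per this guide.
TEMPLATE_FIELDS = ("title", "impact", "message", "affected_components")


def apply_template(template, overrides=None):
    """Build an incident draft from a template, then apply any manual edits."""
    # Copy values so that editing the draft never changes the saved template.
    draft = {
        name: copy.deepcopy(template[name])
        for name in TEMPLATE_FIELDS
        if name in template
    }
    draft.update(overrides or {})
    return draft
```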

Affected Components

Link incidents to the components they affect. This helps customers quickly see which services are impacted and allows subscribers to receive notifications only for components they care about.
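
Component-based notification can be thought of as a set intersection between each subscriber's chosen components and the incident's affected components. The sketch below is illustrative: the data shapes and function name are assumptions, and it does not model subscribers who follow every component.

```python
def subscribers_to_notify(subscriptions: dict[str, set[str]], affected: set[str]) -> list[str]:
    """Return subscribers who follow at least one affected component.

    `subscriptions` maps a subscriber (here, an email address) to the set of
    component names they chose to follow.
    """
    return sorted(
        subscriber
        for subscriber, components in subscriptions.items()
        if components & affected  # non-empty intersection
    )
```

For example, a subscriber following only "Dashboard" would not be notified about an incident that affects only "API".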

You can add or remove affected components at any time during an incident by editing the incident details.