Scott Klein’s Incident Communications Template

Get started with example updates to both internal and external audiences during a technical incident

By Scott Klein

Principles

  1. Ensure you have a dedicated communications person during an incident.
  2. Be truthful and straightforward about what's going on. Bad news is probably going to be better received than no information at all.
  3. Encourage your communications person to err on the side of more frequent updates, even if there's no new information to be had.
  4. Don't overpromise. Give an accurate, directionally conservative view of what's going on.

Internal update

General guidance

Synthesize what is happening, who is working on it, and when the team can expect the next update. You likely won’t know all the details, especially at first. That’s okay. Acknowledge that you don’t know everything but that you are focused on understanding and problem-solving.

Example

We have detected an issue with [THIS SYSTEM/FEATURE]. The impact is [THIS EFFECT] to [THIS MANY OF OUR USERS]. [THIS NUMBER OF] participants across [THIS NUMBER OF] time zones are working on this issue. We don’t know everything, but we do know the following: [INFORMATION]. We also know what we don’t know: [THIS INFORMATION]. The next update on progress will be in [THIS AMOUNT OF TIME].
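As a minimal sketch, the example above could be filled in programmatically so that no bracketed field is left blank by accident. The field names (`system`, `effect`, and so on) and the function `render_internal_update` are hypothetical stand-ins for the bracketed placeholders, not part of the original template.

```python
from string import Template

# Hypothetical field names standing in for the bracketed placeholders.
INTERNAL_UPDATE = Template(
    "We have detected an issue with $system. "
    "The impact is $effect to $affected_users. "
    "$participants participants across $time_zones time zones "
    "are working on this issue. "
    "We don't know everything, but we do know the following: $known. "
    "We also know what we don't know: $unknown. "
    "The next update on progress will be in $next_update."
)


def render_internal_update(**fields: str) -> str:
    """Fill in the internal update.

    Template.substitute raises KeyError when a field is missing, so an
    update cannot be rendered with a placeholder left unfilled.
    """
    return INTERNAL_UPDATE.substitute(**fields)
```

Calling `render_internal_update(system="checkout", ...)` with all eight fields returns the finished message. Omitting any field raises `KeyError`, which is a reasonable guard when updates are written under pressure.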

External update

General guidance: Acknowledge awareness of the issue, describe at a high level what is happening, and say when customers can expect the next update. To the extent possible, reduce customer concerns. You likely won’t know all the details, especially at first, and that’s okay.

Status: Investigating

  • We are investigating an issue with our [THIS SYSTEM/FEATURE]. The impact is [THIS EFFECT] to [THIS MANY OF OUR USERS]. The team is on top of things and will update you on progress in [THIS AMOUNT OF TIME].
  • We are continuing to look into this issue and will update you in [THIS AMOUNT OF TIME].

Status: Identified

  • A fix is in progress. Next update in [THIS AMOUNT OF TIME].

Status: Mitigated

  • A fix has been implemented, which we are continuing to monitor. Next update in [THIS AMOUNT OF TIME].

Status: Resolved

  • The issue has been resolved. We apologize for the inconvenience and are working to understand the underlying behaviors that caused the issue.
  • On [THIS DATE] [THIS SYSTEM/FEATURE] experienced an issue. We apologize for the inconvenience. We are evolving our systems accordingly, and want to share what we’ve learned from this issue: [INSERT FINDINGS HERE].
  • Starting [THIS DATE] and resolving on [THIS DATE], [THIS SYSTEM/FEATURE] experienced [THIS ISSUE] for [THIS AMOUNT OF TIME]. Below we’re sharing technical details to give our customers and community a view into the underlying behaviors that caused the problem, how we mitigated and resolved the problem, and our path forward to reduce the probability of similar issues happening in the future.
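The four external statuses can also be kept as a small lookup so that each status always goes out with matching wording. This is a sketch only: the `Status` enum, the `MESSAGES` table, the `external_update` function, and the field names are all hypothetical, and the text is adapted from the first message under each status above.

```python
from enum import Enum


class Status(Enum):
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MITIGATED = "Mitigated"
    RESOLVED = "Resolved"


# One message per status, adapted from the template; field names are
# hypothetical stand-ins for the bracketed placeholders.
MESSAGES = {
    Status.INVESTIGATING: (
        "We are investigating an issue with our {system}. "
        "The impact is {effect} to {affected_users}. "
        "The team is on top of things and will update you on progress "
        "in {next_update}."
    ),
    Status.IDENTIFIED: "A fix is in progress. Next update in {next_update}.",
    Status.MITIGATED: (
        "A fix has been implemented, which we are continuing to monitor. "
        "Next update in {next_update}."
    ),
    Status.RESOLVED: (
        "The issue has been resolved. We apologize for the inconvenience "
        "and are working to understand the underlying behaviors that "
        "caused the issue."
    ),
}


def external_update(status: Status, **fields: str) -> str:
    """Render the status line plus its message; missing fields raise KeyError."""
    return f"Status: {status.value}\n{MESSAGES[status].format(**fields)}"
```

For instance, `external_update(Status.IDENTIFIED, next_update="30 minutes")` yields the status line followed by the "fix is in progress" message.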

See Scott Klein’s Incident Management Workflow and Values Template for more

Scott Klein

Head of product, Levels Health

Scott Klein is head of product at Levels Health. Formerly, he was founder and CEO of statuspage.io, a Y Combinator-backed startup later acquired by Atlassian.
