Scott Klein’s Incident Communications Template
Get started with example updates for both internal and external audiences during a technical incident
By Scott Klein
Principles
- Ensure you have a dedicated communications person during an incident.
- Be truthful and straightforward about what's going on. Bad news is probably going to be better received than no news at all.
- Encourage your communications person to err on the side of more frequent updates, even if there's no new information to be had.
- Don't overpromise. Give an accurate, directionally conservative view of what's going on.
Internal update
General guidance
Synthesize what is happening, who is working on it, and when the team can expect the next update. You likely won’t know all the details, especially at first, and that’s okay. Acknowledge that you don’t know everything but that you are focused on understanding and solving the problem.
Example
We have detected an issue with [THIS SYSTEM/FEATURE]. The impact is [THIS EFFECT] to [THIS MANY OF OUR USERS]. [THIS NUMBER OF] participants across [THIS NUMBER OF] time zones are working on this issue. We don’t know everything, but we do know the following: [INFORMATION]. We also know what we don’t know: [THIS INFORMATION]. The next update on progress will be in [THIS AMOUNT OF TIME].
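For teams that post updates from a script or chat bot, the internal example above can be treated as a fill-in template. The following is a minimal sketch, not part of the original template: the placeholder names (`system`, `effect`, `affected_users`, and so on) and the function `render_internal_update` are hypothetical, chosen only to mirror the bracketed fields.

```python
from string import Template

# Wording follows the internal example above; the $placeholder names are
# hypothetical stand-ins for the bracketed fields.
INTERNAL_UPDATE = Template(
    "We have detected an issue with $system. "
    "The impact is $effect to $affected_users. "
    "$participants participants across $time_zones time zones "
    "are working on this issue. "
    "We don't know everything, but we do know the following: $known. "
    "We also know what we don't know: $unknown. "
    "The next update on progress will be in $next_update."
)


def render_internal_update(**fields: str) -> str:
    """Fill in the internal update template.

    Template.substitute raises KeyError when a placeholder is missing,
    so a half-filled update cannot be produced by accident.
    """
    return INTERNAL_UPDATE.substitute(**fields)


if __name__ == "__main__":
    # Illustrative values only.
    print(
        render_internal_update(
            system="the checkout API",
            effect="failed payments",
            affected_users="roughly 5% of users",
            participants="6",
            time_zones="2",
            known="errors began shortly after a deploy",
            unknown="the root cause",
            next_update="30 minutes",
        )
    )
```

Failing loudly on a missing field is a deliberate choice here: an update with an unfilled bracket is worse than a short delay while the communications person finds the missing detail.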
External update
General guidance: acknowledge awareness of the issue, describe at a high level what is happening, and say when customers can expect the next update. To the extent possible, reduce customer concerns. You likely won’t know all the details, especially at first, and that’s okay.
Status: Investigating
- We are investigating an issue with our [THIS SYSTEM/FEATURE]. The impact is [THIS EFFECT] to [THIS MANY OF OUR USERS]. The team is on top of things and will update you on progress in [THIS AMOUNT OF TIME].
- We are continuing to look into this issue and will update you in [THIS AMOUNT OF TIME].
Status: Identified
- A fix is in progress. Next update in [THIS AMOUNT OF TIME].
Status: Mitigated
- A fix has been implemented, which we are continuing to monitor. Next update in [THIS AMOUNT OF TIME].
Status: Resolved
- The issue has been resolved. We apologize for the inconvenience and are working to understand the underlying behaviors that caused the issue.
- On [THIS DATE] [THIS SYSTEM/FEATURE] experienced an issue. We apologize for the inconvenience. We are evolving our systems accordingly, and want to share what we’ve learned from this issue: [INSERT FINDINGS HERE].
- Starting [THIS DATE] and resolving on [THIS DATE], [THIS SYSTEM/FEATURE] experienced [THIS ISSUE] for [THIS AMOUNT OF TIME]. Below we’re sharing technical details to give our customers and community a view into the underlying behaviors that caused the problem, how we mitigated and resolved the problem, and our path forward to reduce the probability of similar issues happening in the future.
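The four external statuses above share one pattern: every status except Resolved promises a time for the next update. A small sketch of that rule, assuming a hypothetical `external_update` helper and a hypothetical `{next_update}` placeholder, and using the short-form example messages from this template, might look like this:

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MITIGATED = "Mitigated"
    RESOLVED = "Resolved"


# Short-form messages from the examples above. {next_update} is a
# hypothetical placeholder for [THIS AMOUNT OF TIME].
MESSAGES = {
    Status.INVESTIGATING: (
        "We are continuing to look into this issue and will update you "
        "in {next_update}."
    ),
    Status.IDENTIFIED: "A fix is in progress. Next update in {next_update}.",
    Status.MITIGATED: (
        "A fix has been implemented, which we are continuing to monitor. "
        "Next update in {next_update}."
    ),
    Status.RESOLVED: (
        "The issue has been resolved. We apologize for the inconvenience "
        "and are working to understand the underlying behaviors that "
        "caused the issue."
    ),
}


def external_update(status: Status, next_update: Optional[str] = None) -> str:
    """Render a short external update for the given status.

    Every status except RESOLVED must state when the next update is due,
    matching the examples in this template.
    """
    if status is not Status.RESOLVED and not next_update:
        raise ValueError(f"{status.value} updates need a next-update time")
    body = MESSAGES[status].format(next_update=next_update)
    return f"Status: {status.value}\n{body}"


if __name__ == "__main__":
    print(external_update(Status.IDENTIFIED, "30 minutes"))
    print(external_update(Status.RESOLVED))
```

Requiring `next_update` in code is one way to keep the commitment to regular updates from being dropped under pressure; the longer post-incident write-ups are left to a human author.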
See Scott Klein’s Incident Management Workflow and Values Template for more