More Resources: Escaped defects

Things to read and watch alongside our conversation with Liz Fong-Jones

By Joshua Timberman

“Overall, my message is: don’t be afraid of defects, embrace that they’re going to happen, and instead of trying to decrease the number that happen, decrease how severe it is when they do happen.” — Liz Fong-Jones

This week on the Allma blog, we featured key lessons from our recent conversation with Liz Fong-Jones about escaped defects, surfacing customer-reported bugs back into product development teams, scoping processes, and more.

Alongside those highlights, we’ve put together this list of additional resources on escaped defects and testing in production:

“How to Measure Defect Escape Rate to Keep Bugs out of Production” by Matt Watson at DevOps Zone
This article dives into why tracking escaped defects is important, and how knowing the defect escape rate will help organizations ship better software and services.

“I test in prod” by Charity Majors at Increment
Honeycomb’s CTO and co-founder, Charity Majors, has long been a proponent of “testing in production.” In this article, she goes into depth about the reasons why we’re all testing in production anyway.

“Defect Resolution SLAs” by Skip Dragoo
Software as a Service products have SLAs, and in order to understand whether we're upholding our promise in the SLAs, we use SLOs to measure the goals defined in those SLAs. In this article, Skip talks about how managing defects and tracking when they occur can help organizations reduce the time it takes to discover defects in the future.

“Escaped Defects” by Tom Churchwell at Leading Agile
When problems occur in products and production systems, organizations can face a messy scramble trying to get things fixed. It’s critical to discover these problems before they get in the hands of customers, and Tom outlines the organizational costs of escaped defects—and what to know so you can handle them in the future.

“Ensuring Reliability with SLOs and Datadog & Google Cloud” by Meghan Jordan and Nathen Harvey
From the webinar description: Uptime is a poor measure of reliability. Agile development’s fail-fast approach coupled with distributed applications and dynamic infrastructure requires us to have a better understanding of reliability. Service level objectives (SLOs) help you understand the true health of your systems and how your end users experience them. Poorly defined SLOs means you have little to no visibility into the successes and failures of those apps and their services.

Joshua Timberman

Head of Advocacy and Community, Allma

Joshua Timberman is an advocate for humans and a community builder. As a system administrator and technical operations engineer for over 20 years, Joshua has worked with and broken computers in every conceivable form and iteration. He's used that experience to develop and run incident management and operations teams, and to help improve the working lives of humans who use computers for day-to-day work. Prior to Allma, Joshua spent almost 13 years building Chef and the Chef community.

allma, inc © 2022