Change failure rate

Change failure rate measures the percentage of deployed changes that leave their target environments in a state of failure. Along with MTTR, Change failure rate is a measure of the quality, or stability, of your software delivery capability.

"Failure" is defined differently by different organizations (and even within an organization), and Sleuth allows you to capture your own definition of failure for each project you manage (see Setting up Change failure rate below for more on capturing your organization's definition of "failure" within Sleuth). At a high level, Sleuth evaluates the Impact Source integrations you've set up for a given project, then calculates Change failure rate by dividing the number of deploys that fell within your change failure sensitivity by the total number of deploys in the period.

For example, if you've set up the PagerDuty integration as an impact source, your team had one incident during the report period that spanned two deploys, and you made a total of 20 deploys in that period, your change failure rate will be: 2 / 20 = 10%.
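The arithmetic in this example can be sketched in a few lines of Python. This is an illustration only, not Sleuth's implementation: the function name `change_failure_rate` and the choice to return 0 when there are no deploys in the period are assumptions made for the sketch.

```python
def change_failure_rate(failed_deploys: int, total_deploys: int) -> float:
    """Return the change failure rate as a percentage.

    Hypothetical helper for illustration; not part of any Sleuth API.
    """
    if failed_deploys < 0 or total_deploys < 0:
        raise ValueError("deploy counts must be non-negative")
    if failed_deploys > total_deploys:
        raise ValueError("failed deploys cannot exceed total deploys")
    if total_deploys == 0:
        # Assumption: report 0% for a period with no deploys.
        return 0.0
    return 100.0 * failed_deploys / total_deploys


# One incident spanning two deploys, out of 20 deploys in the period.
rate = change_failure_rate(failed_deploys=2, total_deploys=20)
print(f"{rate:.0f}%")  # prints 10%
```

Note that the numerator counts failed deploys, not incidents: a single incident that spans two deploys contributes two failures.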

For more on how Sleuth measures Change failure rate, and for best practices on determining what failure means to you, watch Sleuth CTO Don Brown explain it in detail in this SleuthTV episode!

Change failure breakdowns

Sleuth's Project Metrics and Team Metrics dashboards show the total number of deploys that were deemed a failure in the period and also provide detailed breakdowns of deploys by type of failure. Failure types currently supported in Sleuth are:

Incidents - any deploy with a status of Incident. Sleuth provides integrations with PagerDuty, Statuspage, and many more, and we're continuously adding new integrations based on customer demand. See Integrations for an up-to-date list of those we currently support.
Rolled back - any code deploy that was detected as rolled back
Unhealthy - any deploy that a configured impact source or deploy verification has determined to be Unhealthy
Ailing - any deploy that a configured impact source or deploy verification has determined to be Ailing
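As a rough sketch of how such a breakdown could be tallied, the snippet below counts deploys per failure type. The status strings, the `failure_breakdown` function, and the "Healthy" status are assumptions made for illustration; they do not describe Sleuth's data model or API.

```python
from collections import Counter

# Failure types named in the list above; the exact strings are assumed.
FAILURE_TYPES = ("Incident", "Rolled back", "Unhealthy", "Ailing")


def failure_breakdown(deploy_statuses):
    """Count deploys per failure type, ignoring non-failure statuses."""
    counts = Counter(s for s in deploy_statuses if s in FAILURE_TYPES)
    # Always report every failure type, even when its count is zero.
    return {failure_type: counts[failure_type] for failure_type in FAILURE_TYPES}


# Hypothetical statuses for five deploys in a reporting period.
statuses = ["Healthy", "Incident", "Ailing", "Ailing", "Rolled back"]
print(failure_breakdown(statuses))
```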

Feature flags and Change failure rate

Sleuth supports feature flags as a first-class form of change. Because feature flag changes have just as much power to cause failure as code changes, feature flag changes are included in your change failure rate calculations. Sleuth's deploy verification applies to flag changes in the same way it applies to code deploys.

Every deployment, feature flags included, has an advanced setting that allows you to exclude it from impact collection. If this setting is enabled for a feature flag deployment, its flag changes will not affect your change failure rate.

Setting up Change failure rate

Sleuth's Change failure rate is configured and calculated at the Project level, and Sleuth also provides visibility into change failure for individual Teams (i.e., across all projects to which a team has contributed). By default, Sleuth considers any deploy marked as Unhealthy a failure. You can change the failure level in your project settings. If you would like to count only Incidents as failures, for example, set the failure level to Incident.

Sleuth's deploy verification allows you to integrate error trackers, such as Sentry and Rollbar; metrics trackers, such as AWS CloudWatch and Datadog; and incident trackers, such as Statuspage and PagerDuty (see Integrations for a full list of currently supported integrations). When Sleuth auto-verifies a deploy as Unhealthy, that deploy is considered a failure. A deploy that you manually set to Unhealthy is also considered a failure. Sleuth also supports code deploy rollbacks, and rolled back deploys count as change failures.
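The failure-level rules described in this section can be sketched as a simple threshold check. Several things here are assumptions made for illustration rather than documented Sleuth behavior: the severity ordering (Healthy, then Ailing, then Unhealthy, then Incident), the "Healthy" status itself, the `Deploy` class and `is_failure` function, and the choice to count a rolled back deploy as a failure regardless of the configured failure level.

```python
from dataclasses import dataclass

# Assumed ordering from least to most severe; illustration only.
SEVERITY = {"Healthy": 0, "Ailing": 1, "Unhealthy": 2, "Incident": 3}


@dataclass
class Deploy:
    status: str                # one of the SEVERITY keys
    rolled_back: bool = False  # True if the deploy was detected as rolled back


def is_failure(deploy: Deploy, failure_level: str = "Unhealthy") -> bool:
    """Return True if the deploy counts as a failure at the given level."""
    if deploy.rolled_back:
        # Assumption: rollbacks count no matter which failure level is set.
        return True
    return SEVERITY[deploy.status] >= SEVERITY[failure_level]


deploys = [
    Deploy("Healthy"),
    Deploy("Ailing"),
    Deploy("Unhealthy"),
    Deploy("Incident"),
    Deploy("Healthy", rolled_back=True),
]

# Default level (Unhealthy): Unhealthy, Incident, and the rollback count.
default_failures = sum(is_failure(d) for d in deploys)
# Stricter level (Incident): only the Incident and the rollback count.
incident_failures = sum(is_failure(d, "Incident") for d in deploys)
print(default_failures, incident_failures)  # prints 3 2
```

Raising the failure level to Incident shrinks the numerator of the change failure rate while leaving the total number of deploys unchanged.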

When configuring Change failure rate, you'll want to determine what failure means for your project. Sleuth is flexible and allows you to define whatever failure criteria work for your projects. Once configured at the project level, change failure rate is also viewable by contributing teams. Just keep in mind that the failure data Sleuth provides is only as good as the data coming in.

