
What is the Cost of a Defect? November 2, 2012

Posted by Peter Varhol in Software development.

I have a lot of respect for Michael Bolton (the testing consultant, not the singer).  And I think he’s mostly right in his questioning of the accepted relationship between stage of the lifecycle and the cost of finding and fixing a defect.  Barry Boehm’s findings at TRW in the 1980s shouldn’t be taken as a hard and fast rule.  The most common expression of this rule is that the cost of finding and fixing a defect increases by an order of magnitude at each stage of the development lifecycle.
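
To make the arithmetic of that popular formulation concrete, here is a minimal sketch in Python; the five stage names and the $100 baseline are my own illustrative assumptions, not figures from Boehm’s study:

    # Illustration only: the "order of magnitude per stage" folk rule,
    # applied to an assumed five-stage lifecycle and an assumed $100
    # baseline cost for a defect caught during requirements.
    STAGES = ["requirements", "design", "coding", "testing", "production"]
    BASE_COST = 100

    for i, stage in enumerate(STAGES):
        cost = BASE_COST * 10 ** i  # ten times more expensive per stage
        print(f"{stage:>12}: ${cost:,}")

Under those assumptions, the same defect that costs $100 to fix during requirements costs $1,000,000 to fix in production, which shows just how strong a claim the popular version of the rule really is.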

But . . .

Michael’s examples are intentionally trivialized.  It is certainly possible that a defect found just before product release can be shown to a developer and fixed in a few minutes.  But unless the fixed code was completely unrelated to anything else in the code base, there should at least be some regression testing done prior to release.  And most teams focusing on process improvement would want to understand why the defect appeared when it did.  Was the process at fault, or were there other changes to the code that uncovered this particular defect?  Were there test cases blocked until the very end, or did the team not have good test coverage?

Even in the trivial cases, it is possible to see that regression testing and root cause analysis take more time later in the development lifecycle.  That’s especially true if the final product is an integration of different components and libraries, which adds to the overall complexity of the product by the time integration testing is complete.

In the vast majority of cases, a defect found later in the development lifecycle is more subtle and complex than average, and usually isn’t a simple forgotten statement or typo.  It’s very likely that the testers have already found and tracked all of the defects that can be fixed in just a few minutes during the early parts of testing.  What is left are more difficult issues that require creative testing and extensive defect analysis before a fix can be implemented.  Even after that, the fix still requires regression testing and likely some integration testing to verify it.

He’s right that we shouldn’t be slaves to the idea that finding and fixing a defect later is always more costly.  But that’s no reason to become complacent about finding and fixing defects early.  If teams can leverage automation and process tools to find the difficult defects earlier, they are doing themselves and their projects a big favor.

Comments»

1. Michael Bolton - November 3, 2012

Hi, Peter…

I’m the last person to advocate sloppiness and inattention–though, like all humans, I’m sometimes vulnerable to it, and typically at the times when I’m least likely to notice it. I didn’t say anything to advocate complacency. On the contrary: I’d like us to be at least as diligent about our thinking and our speech as we aspire to be about the other aspects of our work.

For example, you suggest “The most common expression of this rule is that the cost of finding and fixing a defect increases by an order of magnitude at each stage of the development lifecycle.”

Apart from the fact that this is not what Boehm said but a broken-telephone version of it, a few moments’ thought should tell us that it can’t be right.

First, how many stages are there in “the development lifecycle”? Your answer might be three, or five, depending on what you consider a “stage”–and that would change the outcome of an estimate based on this “rule” by a factor of 100. The cost of fixing a problem found in production might be 100 times that of fixing it in the requirements phase–or it might be 10,000.

Second: does this “rule” apply to all defects? To the “average” defect? What would an average defect look like? What’s the variation on that average?

Third, how are we measuring cost? Money? Time? Effort? If it’s effort, does a problem that we find in production really take 100, or 1,000, or 10,000 more units of work to fix than it would have taken in requirements? Some do (and, as I mentioned, some take much more).

Fourth, when we say “we could have noticed that in the requirements phase,” it’s important for us to add “had we known then what we know now.” Whatever the cost-of-change curve is, it’s always based on a counterfactual, whether you found that bug sooner (“We just saved ourselves… well, I guess we’ll never know how much we saved ourselves.”) or later (“If only we had been clever enough to realize that before we realized it.”).
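
To put a number on that first point, here is a toy Python calculation–nothing more than the arithmetic spelled out:

    # The end-to-end multiplier implied by "one order of magnitude per
    # stage" is simply 10^(stages - 1), so the answer depends entirely
    # on how many stages you decide the lifecycle has.
    for stages in (3, 5):
        print(f"{stages} stages -> {10 ** (stages - 1):,}x")
    # prints: 3 stages -> 100x
    #         5 stages -> 10,000x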

See, the trouble is that a myth that was pretty much bogus in the first place turns into a magic spell that inhibits change and experimentation and refinement. You know: development. Instead, I’d rather do what you’re doing above (and a whole lot more, too): looking consciously and thoughtfully at what really goes on in our products and our projects.

My examples are not “intentionally trivialized”. On the contrary: they’re examples of things that go on in projects all the time. They appear trivial because they appear to be little moments, but if we paid attention to them, we might recognize important things about cost and significance. We’d remember that the cost of some change looks very low by one set of measures but very high by a different set, and that for another kind of change the cost might look significant at first but be cheap in the long run. Let’s challenge our assumptions and examine what we’re doing, instead of letting a formula (that’s not even a real formula) lull us to sleep.

Finally, I urge you to read Laurent Bossavit’s book, The Leprechauns of Software Engineering (http://www.leanpub.com/leprechauns), in which he examines the development of the cost-of-change myth.

2. Gerie Owen - November 3, 2012

Hi Peter,

Both Michael’s original blog and your blog make some very valid points; however, Michael’s response exposes an idea that I think is at the heart of the issue: cost and significance. As Michael suggests, the actual cost can be very different based on the set of measures used, and, I will add, on who is doing the measuring.

Last week, while I was on storm duty at the power company where I work, I had a unique opportunity to chat with the users of a vendor application I recently tested.  My company was a beta site for this product and, as expected, many defects were deferred, including one entire piece of functionality.  It turns out that not having this feature is a huge deal to the users in the field, as they now need to use a GPS along with this software.  So the cost here multiplies every day as all the users spend extra time getting to their worksites.  And that cost can be calculated.
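
A back-of-the-envelope version of that calculation might look like the sketch below; every number in it is hypothetical, just to show how the cost compounds:

    # Hypothetical figures only: the daily cost of a deferred feature
    # that forces field users to spend extra time reaching worksites.
    FIELD_USERS = 200            # assumed number of affected field users
    EXTRA_MINUTES_PER_DAY = 15   # assumed extra travel time per user
    HOURLY_RATE = 60.0           # assumed loaded labor cost in dollars

    daily_cost = FIELD_USERS * (EXTRA_MINUTES_PER_DAY / 60) * HOURLY_RATE
    print(f"Cost per day:  ${daily_cost:,.0f}")
    print(f"Cost per year: ${daily_cost * 250:,.0f}")  # ~250 working days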

But there is another cost, and that is perceived value.  When I mentioned to the users that I had tested this application, the very first thing they said was “It doesn’t work.”  Only in further conversation did I find out that their real issue is the feature mentioned above.  So the users perceive the entire application as useless, and that, albeit intangible, is an enormous cost.

