Metric Mind Tricks

“What gets measured, gets done”.

We all know this. By the way, did you notice that it says “done”, not “done well”?

We like metrics because we base decisions on them. Without them, we’d just be guessing.

Yet sometimes metrics deceive us. Well, not exactly: it’s our interpretation of them that deceives us.

I’ve talked about misusing velocity before, so this time I’ll point at something else, something more substantial and scientific.

Code coverage, anyone?

Code coverage is very scientific. It counts whether the program has visited a line of code or not. So it can’t be wrong, right?
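
Here’s a minimal sketch of that idea using Python’s coverage.py; the classify function and the single call are made up for illustration:

    import coverage

    def classify(n):
        if n < 0:
            return "negative"   # this line never runs in the call below
        return "non-negative"

    cov = coverage.Coverage()
    cov.start()
    classify(5)                 # exercises only the non-negative path
    cov.stop()
    cov.save()
    cov.report()                # the "negative" line is reported as missed

All the tool records is which lines ran. It says nothing about whether anything meaningful was asserted while they ran.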

Remember, it’s not a wrong metric; it’s what we do with it.

0% code coverage is not good, even I agree.

How about 100%?

Not achievable, you say? Sometimes, but let’s say we can do it. Do we want to be 100% covered? Because that means a lot of effort. Testing, like everything in the universe, goes by the 80-20 rule: most of the effort goes into covering the last 20% of the code. And that last 20% may not be the most important code to test, because not all code is created equal.

Ok, so 100% coverage may be achievable, but costly. How about 80%? How about not letting anyone check in code if it’s not 80% covered?
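
Mechanically, such a gate is easy to wire up. A minimal sketch of what a build step could do, again using Python’s coverage.py (the threshold is the 80% from the question; the test-suite step is elided):

    import coverage

    cov = coverage.Coverage()
    cov.start()
    # ... run the test suite here ...
    cov.stop()
    cov.save()

    total = cov.report()    # report() returns the total percentage as a float
    if total < 80.0:
        raise SystemExit("coverage below 80%, rejecting the check-in")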

Which 80%? Does it include auto-generated code? Does it include the important, risky, bug-ridden, unreviewed code?
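
The answer depends on what you tell the tool to count. coverage.py, for one, has a real “omit” option for excluding files by pattern; the patterns below are assumptions for the sake of illustration:

    import coverage

    # The same test run can produce very different percentages
    # depending on what is omitted from the measurement.
    cov = coverage.Coverage(
        omit=[
            "*_pb2.py",        # protobuf-generated code (assumed naming)
            "*/migrations/*",  # framework-generated files (assumed layout)
        ],
    )

Excluding generated code is usually sensible, but the same mechanism can just as easily hide exactly the risky code the number was supposed to warn you about.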

What does 80% (or any number) actually mean?

Not a lot by itself. If you take other considerations into account, the number’s meaning gets clearer, and that gives a better basis for decision making. That’s the problem with straight metrics – we focus on them and forget other things that can help us make better decisions.

Always ask “what does this number really mean?” and “do I need to consider something more?” Then make decisions.

One last thing…

I’m giving a talk on Coverage Lies at the next DevCon TLV, June 20th, and in a Typemock webinar on June 27th. If you want to learn more about these tricks, you’re invited!

3 comments on “Metric Mind Tricks”

  1. Guy Nachimson

    I wonder if there’s a good way to consider code churn in a systematic way (after ignoring generated code, perhaps). Code that changes often is most likely to be valuable for high test coverage IMO.

  2. Mr. Tines

    Talking .net here, at least —

    Not all lines of code are created equal — a simple property (getter/setter, possibly even automatic compiler generated code) counts code points as much as that fiddly algorithm. And while it’s trivial to test that property code to 100% coverage, it’s not a good use of dev cycles to even write those tests.

    My preferred approach is to combine raw coverage data with static analysis that weeds out the trivial bits of code, and declarative admissions of uncoverage, so that code is either covered, or there is a reviewable reason as to why it isn’t (e.g. generated by a system tool, is the database calling code mocked elsewhere, subject to static analysis for correctness, whatever), rather than guessing a coverage percentage and hoping.
