Test Attribute #4 - Accuracy

This is the 4th post on test attributes that were described in the now even more famous “How to test your tests” post. If you want training and/or coaching on testing, contact me.


Accuracy is about pinpointing the location of the failing code. If we know where the offending code is, we can easily analyze what problem we caused, and move on to fixing it.

The trivial example is tests that check different methods. Of course, if one of them fails, we know where to look.

Here’s another simple case, on the same method. We have a PositiveCalculator class, whose Add method adds two positive numbers, or throws an exception if they are not so positive:

public int Add(int a, int b)
{
    if ((a < 0) || (b < 0))
        throw new ArgumentException();
    return a + b;
}

We can then write the following tests:

[Test]
public void AddTwoPositiveNumbers_GetResult()
{
    PositiveCalculator calculator = new PositiveCalculator();
    Assert.That(calculator.Add(2, 2), Is.EqualTo(4));
}

[Test]
public void AddTwoNegativeNumbers_Exception()
{
    PositiveCalculator calculator = new PositiveCalculator();
    Assert.Throws<ArgumentException>(() => calculator.Add(-5, -5));
}

Looking at the tests, we can already see that they check two different behaviors. When we combine what we read from the tests with the tested code, it’s easy to relate the parts of the code to each test. So if one of them fails, we’ll know where to look.

Unfortunately, code doesn’t always look like this. It usually starts out that way, but then grows into monster-size functions. When it does, it either becomes untestable, or produces tests that are large, overlap each other, and check multiple things. None of those are accurate tests.

So what can we do?

Let’s start with the preventive measure: don’t let the code grow. Be merciless about keeping methods small, and use the Single Responsibility Principle to extract code into smaller, easily testable, accurate functions.
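As a sketch of that preventive measure (the class and method names here are hypothetical, not from the post), extracting validation and calculation out of a larger method gives each piece its own accurate test target:

```java
// Hypothetical example: a larger method with its pieces extracted,
// so each small function can be tested, and blamed, on its own.
class OrderProcessor {

    void process(int quantity, int unitPrice) {
        validate(quantity, unitPrice);          // extracted, independently testable
        int total = total(quantity, unitPrice); // extracted, independently testable
        // ... persist the order, notify listeners, etc.
    }

    // If a test on this fails, we know exactly where to look.
    static void validate(int quantity, int unitPrice) {
        if (quantity <= 0 || unitPrice < 0) {
            throw new IllegalArgumentException("invalid order");
        }
    }

    static int total(int quantity, int unitPrice) {
        return quantity * unitPrice;
    }
}
```

Each extracted function now gets a small, accurate test of its own, instead of one big test around process().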

But I didn’t write this code!

How do I make my tests accurate?

Here’s what you can do. Now that you have a test, or a bunch of them, it’s time to make use of them: start refactoring the code. Having the tests in place will tell you if you’re breaking stuff, and it’s easy to get back to a working state, because refactoring is also done in small steps.

Once you have broken the code into smaller pieces, you can write smaller tests, which give you the accuracy that the bigger tests didn’t have. In fact, you might want to replace the big tests with some smaller ones, if they give better information and performance for the same coverage.

We can also make the tests more accurate with the following methods:
  • One Assert per test – When you check only one thing, chances are your test is more accurate than when checking multiple things. If you have more asserts in a test, break it into multiple tests.
  • Test shorter scenarios – In legacy code, it’s tempting to test large scenarios, because the code does a lot and doesn’t expose entry points for single operations. Try to test shorter scenarios rather than long ones, and smaller objects rather than large ones. If you use the big tests to refactor the code, you can then write smaller, more accurate tests.
  • Mock unrelated stuff – If you have dependencies that do multiple things, and therefore create longer scenarios, mock them. You’ll make the test more accurate, because it now runs only through the relevant code you’re interested in.
  • Check the coverage – Visually, if possible. IDEs and tools that show visual coverage on the code are awesome, because they add another visual clue to where the impacted code is. On trivial code they don’t matter much, but on complex code you can compare the paths of different tests, and by elimination find out where the problems are. You can also use the visual paths as feedback on how accurate your tests are, and if they aren’t, make them more so.
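To sketch the first of those methods, one Assert per test (the Counter class here is hypothetical): instead of a single test asserting both increment and reset, each behavior gets its own test, so a failure points at one method. With JUnit these would be @Test methods; they are plain methods here to keep the sketch self-contained.

```java
// Hypothetical Counter, for illustration only.
class Counter {
    private int value = 0;
    void increment() { value++; }
    void reset()     { value = 0; }
    int value()      { return value; }
}

class CounterTests {
    // Accurate: if this fails, increment() is the suspect.
    static void incrementFromZero_IsOne() {
        Counter counter = new Counter();
        counter.increment();
        assert counter.value() == 1;
    }

    // Accurate: if this fails, reset() is the suspect.
    static void resetAfterIncrement_IsZero() {
        Counter counter = new Counter();
        counter.increment();
        counter.reset();
        assert counter.value() == 0;
    }
}
```

One assert per test means a red test names the one behavior that broke, with no debugging needed to tell the scenarios apart.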
Accuracy helps us fix problems quickly. But it’s definitely not easy to come by, because it depends very much on the tested code. However, by combining the methods I suggested, and using working tests to refactor and simplify the code, test accuracy is definitely within reach.

Next up: Differentiation.
For training and coaching on testing and agile, contact me.

Test Attribute #3 - Speed

This is the 3rd post on test attributes that were described in the now more famous “How to test your tests” post.
  
There’s a story I like to tell about my first TDD experience. You’ll have to hear it now (some of you for the n-th time).

It was many moons ago,  when I just completed reading Kent Beck’s excellent “Test Driven Development By Example”. And I thought: This would end all my misery.

I was working on a communication component at the time, and I thought, why not use this new TDD thing?

I had already committed one foul before writing a single line of test code: I knew I was going to use MSMQ for the component, so I decided on the design instead of letting the tests drive it. My level of understanding of TDD at the time is not relevant to this story. MSMQ, however, is.

For those who don’t know, MSMQ is Microsoft’s queuing service, which runs on all kinds of Windows machines: an infrastructure for asynchronous messaging that seemed perfect for the job. It is, however, a bit slow.

So for my first test, I wrote one that sends a message to the queue and then waits to receive it. Something like this:

[TestMethod]
public void ReceiveSentMessage()
{
    MyQueue myqueue = new MyQueue();
    myqueue.SendMessage(new Message("Hi"));
    Message receivedMessage = myqueue.Receive();
    Assert.AreEqual("Hi", receivedMessage.Body);
}
Since we’re talking about speed, here’s the thing: this single test ran for around 3 seconds. What would happen if I had a hundred more like it?

The Death Spiral Of Slow Tests

I was so happy I had a passing test, I didn’t notice that it took a few seconds to run. Most people starting out with unit testing don’t notice that either. They keep accumulating slow tests in their suite, until one day they reach a tipping point.

Let’s take, for example, a suite that takes 15 minutes to run. And let’s say I’m a very patient person. I know, just work with me.

Up to this point I had no problem running a full suite every hour.

Then, at that 15-minute point, I decide that running the suite every hour cripples my productivity. So I decide that I’ll run the tests twice a day. One run will be over lunch, and the second will start as I leave the office. That way I won’t need to wait for the tests; the results will be there when I get back to work.

That leaves me more time to write code (and hopefully some tests). So I write more code, and when I get back from lunch, there are a few red tests. Since I don’t know exactly what’s wrong (I can’t tell which parts of the big chunks of code I’ve added are to blame), I’ll spend an hour debugging the failing tests. And repeat that tomorrow morning, and the next lunch break.

Until I realize that I now spend 2 hours a day working on fixing tests. That’s 25% of my time working for my tests, instead of them working for me.

Which is where I stop writing more tests, because I see the cost, and no value from them. And then I stop running them, because, what’s the point?

I call it “The Death Spiral Of Doom”, and many developers who start doing testing fall down it. Many never climb back up.

If we reverse the process, we’ll see quite the opposite. If my suite runs in a matter of seconds, I run it more often. When a test breaks, I know what caused the problem, because it was caused by a change I made in the last few minutes. Fixing it may not even require debugging, because it’s still fresh in my mind. Development becomes smoother and quicker.

Quick Feedback Is Mandatory

Tests should run quickly. We’re talking hundreds and thousands in a matter of seconds. If they don’t, we’ll need to do something about them.

Quick feedback is not only an important agile property. It is essential for increasing velocity. If we don’t work at it, the entire safety net of our tests can come crashing down.

So what can we do?
  • Analyze. The length of each test run is part of every test report, so it’s not even subjective. Look at the report, and see which tests take longer to run.

  • Organize. Split the tests into slow-running and quick-running. Move the slow ones to a later automated build cycle, so you can run the quick ones without penalty.

  • Mock. Mocking is a great way to speed up tests. If a dependency (like my MSMQ service) is slow, mock it.

  • Groom. Not all tests should be part of our automated build forever. If there’s a part of the code that you never touch, but it has a 5-minute test suite around it, stop running that suite. Or run it in the nightly cycle.

  • Upgrade. You’ll be surprised how much quicker better hardware runs your tests. The cost may be marginal compared to the value of quick feedback.
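As a sketch of the mocking point (the interface and class names are hypothetical; the slow dependency in my story was MSMQ): put the slow queue behind an interface, and let the test use a fast in-memory fake instead of the real service.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical abstraction over the slow queuing service.
interface MessageQueue {
    void send(String body);
    String receive();
}

// Test double: no real queuing service, so it runs in
// microseconds instead of the real queue's seconds.
class InMemoryQueue implements MessageQueue {
    private final Deque<String> messages = new ArrayDeque<>();
    public void send(String body) { messages.addLast(body); }
    public String receive()       { return messages.pollFirst(); }
}

class ReceiveSentMessageTest {
    static void receiveSentMessage() {
        // Production wiring would pass the real queue; the test passes the fake.
        MessageQueue queue = new InMemoryQueue();
        queue.send("Hi");
        assert "Hi".equals(queue.receive());
    }
}
```

The same scenario as my 3-second MSMQ test now runs in effectively no time, which is what lets a suite of hundreds stay fast.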
The key thing is ongoing maintenance of the test suite. Keep analyzing your suite, and you’ll see where you can optimize, without taking bigger risks.

The result is a quick safety net you can trust.

Next up: Accuracy.

Test Attribute #2 - Readability

This is the 2nd post on test attributes that were described in the now famous “How to test your tests” post.
We often forget that most of the value we get from tests comes after we’ve written them. Sure, TDD helps design the code, but let’s face it: once everything works for the first time, our tests become the future guardians of our code. Once in place, we can change the code, hopefully for the better, knowing that everything still works.
But if (and when) something breaks, there’s work to be done. We need to understand what worked before that doesn’t now. We then need to analyze the situation: According to what we’re doing right now, should we fix this problem? Or is this new functionality that we now need to cover with new tests, throwing away the old one? Finally there’s coding and testing again, depending on the result of our analysis.
The further we move in time, the more the tests and code get stale in our minds. Then we forget about them completely. The cost of change rises. The analysis phase becomes longer, because we need to reacquaint ourselves with the surroundings. We need to re-learn what still works and what stopped working. Even if we knew, we don’t remember which changes can cause side effects, and how those will play out.
Effective tests minimize this process. They need to be readable.
Readability is subjective. What I find readable now (immediately after I wrote it) will not seem so in 6 months, let alone to someone else.
So instead of trying to define test readability, let’s break it down to elements we care about, and can evaluate.

What A Name

The most important part of a test (apart from testing the right thing) is its name. The reason is simple: when a test fails, the name is the first thing we see. It’s the first clue we get that something is wrong, and therefore it needs to tell us as much as possible.
The name of a test should include (at least) the specific scenario and the expected result of our test. If we’re testing APIs, it should say that too.
For example:
@Test
public void divideCounterBy5Is25() { ...
I can understand what the test does (a scenario about dividing Counter), the details (division by 5) and the expected result for this scenario (25).
If it sounds like a sentence – even better. Good names come from verbally describing them.
It doesn’t matter if you use capitalization, underscore, or whatever you choose. It is important that you use the same convention that your team agrees on.
Names should also be specific enough to mentally discern from other sibling tests. So, in our example:

@Test
public void divideCounterBy0Throws() { ...
This test is similar enough to the first one to identify it as a “sibling” scenario, because of the resemblance in the prefix. The specific scenario and result are different. That is important, because when the two appear together in the test runner, and one fails while the other doesn’t, it helps us locate the problem before we even start to debug. These are clues for resolving the problem.

What A Body

If our names don’t help locate the problem, the test body should fill the gaps. It should contain all the information needed to understand the scenarios.
Here are a few tips to make test code readable:
  • Tests should be short. About 10-15 lines short.
  • If the setup is long, extract it to functions with descriptive names.
  • Avoid using pre-test functions like JUnit’s @Before or MSTest’s [TestInitialize]. Instead, use methods called directly from the test. When you look at the code, setup and teardown need to be visible; otherwise, you’ll need to search further, and assume even more. Debugging tests with setup methods is no fun either, because you enter the test in a context that may surprise you.
  • Avoid using base test classes.  They too hide information relevant for understanding the scenarios. Preferring composition over inheritance works here too.
  • Make the assert part stand out.
  • Make sure that body and name align.
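A sketch of those tips working together (the Account class and helper name are hypothetical): setup is extracted into a descriptively named method and called directly from the test, so the whole scenario, including the assert at the end, is visible in one place.

```java
// Hypothetical Account, for illustration only.
class Account {
    private int balance;
    Account(int initialBalance) { balance = initialBalance; }
    void withdraw(int amount)   { balance -= amount; }
    int balance()               { return balance; }
}

class AccountTests {
    // Short, explicit, and the assert stands out at the end.
    static void withdrawFromFundedAccount_ReducesBalance() {
        Account account = accountWithBalance(100); // explicit setup, no hidden @Before
        account.withdraw(30);
        assert account.balance() == 70;
    }

    // Descriptive setup helper, visible right from the test.
    static Account accountWithBalance(int balance) {
        return new Account(balance);
    }
}
```

Nothing here is hidden in a base class or a pre-test hook: name, setup, action, and assert all align in a dozen lines.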
Analysis takes time, and the worst kind (and slowest) requires debugging. Tests (and code) should be readable enough to help us bypass this wasteful process.

Better Readability In Minutes

We are biased, and so we think we write code and tests so good that everyone can understand them. We’re often wrong.
In agile, feedback is the answer.
Use the “Law of the Third Ear”. Grab an unsuspecting colleague’s ear, pull it close to your screen, and she can tell you whether she understands what the test does.
Even better, pair while writing the tests. Feedback comes as a by-product, and you get better tests.
Whatever you do, don’t leave it for later. And don’t use violence if you don’t need to.
Make tests readable now, so you can read them later.


Next up: Speed.

Test Attribute #1 - Validity

In my last post, I created a list of test attributes. If one of them isn’t ok, you need to do some fixin’.
This is the first of a series of posts that is going to discuss the different faces of tests.
Let’s start with validity. Admittedly, it’s not the first attribute I thought about. What are the chances we’re going to write a wrong test?

How can this happen?

We usually write tests based on our understanding of the code and the requirements we need to implement, and we fill the gaps with assumptions.
We can be wrong about any of these. Or all of them.
A more interesting question is: How do we know we’ve written an incorrect test?
We find out we have the wrong tests in one or more ways:
  • The Review: Someone looks at our test and code, and tells us we’re either testing the wrong thing, or that it isn’t the way to prove our code works.
  • The Surprise: Our test surprisingly fails, where it should have passed. It can happen the other way too.
  • The Smell: We have a feeling we’re on the wrong path. The test passes, but something feels wrong.
  • The Facepalm: A penny drops, and we get the “what the hell are you doing?” feeling, followed by the “glad no one noticed” one.
Once we know what is wrong we can easily fix it. But wouldn’t it be easier if we avoided this altogether?
The easiest way is to involve someone else. Pair programming helps avoid bugs and writing the wrong test. Obviously a partner helps in the Review, but they can also prevent the Surprise, the Smell, and the Facepalm. And you don’t want them there when you have a Facepalm moment, right?
One of the foundations of agile development is feedback. We all make mistakes. In agile development we acknowledge that, so we put brakes in place, such as pair programming, to identify problems early.
Next up: Readability.


How To Test Your Tests

When we write tests, we focus on the scenario we want to test, and then write that test.
Pretty simple, right?
That’s how our minds work. We can’t focus on many things at the same time. TDD acknowledges that, and its incremental nature is built around it.
TDD or not, when we have a passing test, we should do an evaluation.
Start with this table:
  • Validity – Does it test a valid scenario? Is this scenario always valid?
  • Readability – Of course I understand the test now, but will someone else understand it a year from now?
  • Speed – How quickly does it run? Will it slow down an entire suite?
  • Accuracy – When it fails, can I easily find where the problem is in the code, or do I need to debug?
  • Differentiation – How is this case different from its siblings? Can I understand that just by looking at the tests?
  • Maintenance – How much work will I need to do around this test when requirements change? How fragile is it?
  • Footprint – Does the test clean up after itself? Or does it leave files, registry handles, threads, or a memory blob that can affect other tests?
  • Robustness – How easy is it to break this test? What kind of variation does it permit, and is that variation acceptable?
  • Determinism – Does this test have dependencies (the computer clock, CPU, files, data) that can alter its result based on when or where it runs?
  • Isolation – Does the test rely on a state that was not specified explicitly in it? If so, will the implicit state always hold?
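As a small sketch of the Footprint property (the file name and method are hypothetical): a test that writes a temporary file removes it, even when an assertion fails, so no state leaks into other tests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FootprintTest {
    static void writeAndReadBack_LeavesNoFile() throws IOException {
        Path temp = Files.createTempFile("footprint", ".txt");
        try {
            Files.writeString(temp, "data");
            assert "data".equals(Files.readString(temp));
        } finally {
            Files.deleteIfExists(temp); // clean up even if the assert above fails
        }
        assert !Files.exists(temp);     // the test leaves nothing behind
    }
}
```

The try/finally is what keeps the footprint clean: the cleanup runs whether the scenario passes or not.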

If something’s not up to your standards (I’m assuming you’re a professional with high standards), fix it.
Now, I hear you asking: Do all this for every test?
Let’s put it this way: If the test fails the evaluation, there’s going to be work later to fix it. When would you rather do it – now, when the test is fresh in your head, or later, when you have to dive in again, into code that you haven’t seen in 6 months, instead of working on the new exciting feature you want to work on?
It’s testing economics 101. Do it now.

For training and coaching on topics like this, contact me.

Agile VS Real Life

The Agile Manifesto tells us that:

We have come to value “Individuals and Interaction over Processes and Tools” 

Reality tells us otherwise.

Want to do unit testing? Pick up a test framework and you’re good to go.

Want your organization to be agile? Scrum is very simple, and SAFe is scaled simplicity.

We know there are no magic bullets. Yet we’re still attracted to pre-wrapped solutions.

Why?

Good question. We’re not stupid, most of us anyway. Yet we find it very easy to make up a story about how easy it’s going to be.

Here are a couple of theories.

  • We’re concentrating on short term gains. Whether it’s the start-up pushing for a lucrative exit by beating the market, or the enterprise looking at the next investor call, companies are pushing their people to maximize value in the short term. With that in mind, people look for a “proven” tool or process, that minimizes long term investments. In fact, systems punish people, if they do otherwise.
  • We don’t understand complexity. Think about how many systems we’re part of, how they impact each other, and then consider the things we haven’t thought about. That’s overwhelming. Our wee brain just got out of fight-or-flight mode; you want it to do full planning and execution with all those question marks? People are hard. Better get back to dry land, where tools and processes actually work.
  • We’re biased in so many ways. One of our biases is called anchoring. Simply put, if we first hear of something, we compare everything to it. It becomes our anchor. Now, when you’re researching a new area, do you start with the whole methodology? Nope. We’re looking for examples, similar to our experiences. What comes out first when we search? The simple stuff. Tools and processes. Once we start there, there’s no way back.
  • We don’t live well with uncertainty. Short term is fine, because we have the illusion of control over it. Because of complexity, the long term is so out of reach that we give up, and concentrate on short-term wins.
  • We don’t like to read the small print. Small print hurts the future-perfect view. We avoid the context issues, and tell ourselves the annotation applies to a minority of cases, which obviously we don’t belong to. Give us the short-short version, and we’ll take it from there.
  • We like to be part of the group. Groups are comfy. Belonging to one removes anxiety. Many companies choose Scrum because it works for them, so why won’t it work for me? The only people who publish big methodology papers are from academia. And that’s one group we don’t want to be part of, heaven forbid.

That’s why we like processes and tools. Fighting that is not only hard, but may carry a penalty.

So what’s the solution?

Looking for simplicity again? So soon?

Well, the good news is that it is possible, with discipline. If we have enough breathing room, if we don’t get pushback from the rest of our company, if we acknowledge that we need to invest in learning, and understand that processes and tools are just the beginning – then there’s hope for us yet.

Lots of if’s.

But if you don’t want to bother, just go with this magical framework.

More #NoEstimates

Quite an interesting conversation and reaction to the #NoEstimates post. Good questions too, and frankly, to some I don’t have answers. I’ll try, anyway.

Let’s start with classic project management. It tells us that in order to plan, we need to estimate cost and duration. Estimation techniques have been around for a while.

There’s a problem with the assumption that we can “know” stuff. We can’t know stuff about the future. Guessing, or estimating as we call it, is the current alternative. To improve, we can at most try to forecast, and we want a forecast we can trust enough to make further plans on.

Sounds easy enough…

Estimating is a skill. It takes knowledge of the process and the ability to deduce from experience. As with other skills, you can improve your estimates. That works well if the work we’re doing is similar to what we did before. However, if history is different from the future, we’re in trouble. In my experience, it usually is. Variations galore.

In the projects I was involved in, there were plenty of unknowns: technology, algorithms, knowledge level, team capacity and availability, even mood. All of those can impact delivery dates, and therefore the “correctness” of estimations.

With so many “unknown unknowns” out there, what’s the chance of a plausible estimate? We can definitely estimate the “knowns”, and try to improve on the “known unknowns”, but it’s impractical to estimate the rest.

Yet the question remains

Ok, wise-guy, if estimating can yield lousy results, what’s the alternative?

Agile methodologies take into account that reality is complex, and therefore involve the feedback loop in short iterations. The product owner can decide to shut down the project or continue it every cycle.

I think we should move in that direction at the organizational level. Instead of trying to predict everything, set short-term goals and checkpoints. Spend a small amount of money, see the result, then decide. Use the time you would have spent on estimating to do some work.

Improving estimates is a great example of local optimization. After all, the customer would rather have a prototype in the hand than a plan on the tree.

And if they want estimates? Then we’ll give a rough one that doesn’t cost much.

I know project managers won’t like this answer. I know a younger me wouldn’t either.

But I refer you to the wise words of the Agile Manifesto, which apply to estimating, among other things:

We are uncovering better ways of developing
software by doing it and helping others do it.

There are better ways. We’ll find them.
