Test Attribute #6 - Maintenance

This is the 6th post in the Test Attributes series, which started off with the supermodel of the series, the “How to test your tests” post. If you need training on testing, contact me.

I always hated the word “maintainability” in the context of tests. Tests, like any other code, are maintainable. Unless there comes a time when we decide we can’t take it anymore and the code needs a rewrite, the code is maintainable. We can go and change it, edit it, or replace it.

The same goes for tests. Once we’ve written them, they are maintainable.

So why are we talking about maintainable tests?

The trouble with tests is that they are not considered “real” code. They are not production code.

Developers starting out on the road to better quality seem to regard tests not just as extra work, but also as second-class work. All activities that are not directed at running code on a production server, or a client computer, are regarded as “actors in supporting roles”.

Obviously writing the tests has an associated future cost. It’s a cost on supporting work, which is considered less valuable.

One of the reasons developers are afraid to start writing tests is the accumulated multiplier effect: “Ok, I’m willing to write the tests, which doubles my work load. I know that this code is going to change in the future, and therefore I’ll have to do double the work, many times in the future. Is it worth it?”

Test maintenance IS costly

But not necessarily because of that.

The first change we need to make is a mental one. We need to understand that all our activities, including the “supporting” ones, are first-class. That also includes the test modifications in the future: after all, if we’re going to change the code to support a requirement, that will require tests for that requirement.

The trick is to reduce that effort to a minimum. And we can do that, because some of that future effort is waste that we’re creating now. The waste happens when the requirements don’t change, but the tests fail, and not because of a bug. We then need to fix the test, although there wasn’t a real problem. Re-work.

Here’s a very simple example, taken from the Accuracy attribute post:

[Test]
public void AddTwoPositiveNumbers_GetResult()
{
    PositiveCalculator calculator = new PositiveCalculator();
    Assert.That(calculator.Add(2, 2), Is.EqualTo(4));
}

What happens if we decide to rename PositiveCalculator to Calculator? The test will not compile. We’ll need to modify the test in order for the suite to pass.

Renaming stuff doesn’t seem like much trouble, though – we rely on modern tools to replace the different occurrences. However, this is very dependent on tools and technology. If we do this in C# or in Java, there is not only automation, but also a quick feedback mechanism that catches the error, and we don’t even think of it as maintaining the tests.

Imagine you got the compilation error only after 2 hours of compiling, rather than immediately after making the change. Or only after the automated build cycle. The further we get from automation and quick feedback, the more we tend to look at maintenance as a bigger monster.

Lowering maintenance costs

The general advice is: “Don’t couple your tests to your code”.

There’s a reason I chose this example: Tests are always coupled to the code. The level of coupling, and the feedback mechanisms we use, affect how big these “maintenance” tasks are going to be. Here are some tips for lowering the chance of test maintenance.
  • Check outputs, not algorithms. Because tests are coupled to the code, the fewer implementation details the test knows about, the better. Robust tests do not rely on specific method calls inside the code. Instead, they treat the tested system as a black box, even though they may know how it’s internally built. These tests, by the way, are also more readable.

  • Work against a public interface. Test from the outside and avoid testing internal methods. We want to keep the internal method list (and signature) inside our black box. If you feel that’s unavoidable, consider extracting the internal method to a new public object.

  • Use the minimal amount of asserts. Being too specific in our assert criteria, especially when verifying method calls on dependencies, can lead to breaking tests without a benefit. Do we need to know a method was called 5 times, or that it was called at least once? When it was called, do we need to know the exact value of its argument, or maybe a range suffices? With every layer of specificity, we’re adding opportunities for breaking the test. Remember that with a failure, we want information that helps solve the problem. If we don’t gain additional information from these asserts, lower the criteria.

  • Use good refactoring tools. And a good IDE. And work with languages that support these. Otherwise, we’re delaying the feedback on errors, and causing the cost of maintenance to rise.

  • Use less mocking. Using mocks is like using x-rays. They are very good at what they do, but over-exposure is bad. Mocks couple the code to the test even more. They allow us to specify internal implementation of the code in the test. We’re now relying on the internal algorithm, which can change. And then our test will need some fixing.

  • Avoid hand-written mocks. The hand-written ones are the worst, because unless they are very simple, it is very easy to copy the behavior of the tested code into the mocks. Frameworks, in contrast, encourage setting the behavior through the interface.
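To illustrate the “minimal asserts” tip, here is a sketch in plain Java, with no test framework. The Notifier, SpyNotifier and AlarmSystem names are invented for this example; the spy only records calls, and the asserts deliberately stay loose:

```java
import java.util.ArrayList;
import java.util.List;

// A hypothetical dependency, and a hand-rolled spy that only records
// calls. It copies no production behavior, so it stays simple.
interface Notifier {
    void notify(int severity);
}

class SpyNotifier implements Notifier {
    final List<Integer> severities = new ArrayList<>();
    public void notify(int severity) { severities.add(severity); }
}

// Invented code under test: notifies when the temperature is too high.
class AlarmSystem {
    private final Notifier notifier;
    AlarmSystem(Notifier notifier) { this.notifier = notifier; }

    void check(int temperature) {
        if (temperature > 100) notifier.notify(temperature - 100);
    }
}

public class MinimalAssertExample {
    public static void main(String[] args) {
        SpyNotifier spy = new SpyNotifier();
        AlarmSystem alarm = new AlarmSystem(spy);
        alarm.check(120);
        alarm.check(150);

        // Loose criteria: we were notified at least once, with a positive
        // severity. We avoid asserting "exactly 2 calls, with 20 and 50",
        // which would break on harmless implementation changes.
        if (spy.severities.isEmpty())
            throw new AssertionError("expected at least one notification");
        if (spy.severities.get(0) <= 0)
            throw new AssertionError("expected a positive severity");
        System.out.println("passed");
    }
}
```

The test still fails when the behavior is wrong (no notification at all), but survives refactoring that changes how many times, or with which exact values, the dependency gets called.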
There’s a saying: Code is a liability, not an asset. Tests are the same – maintenance will not go away completely, but we can lower its cost if we stick to these guidelines.

Next up: Footprint.

For training and coaching on testing and agile, contact me.

Image source: http://www.business2community.com/content-marketing/how-super-mario-would-market-his-plumbing-business-in-2013-0423630#!bnPrC4

Test Attribute #5 – Differentiation

This is the 5th post in the Test Attributes series, which started off with the celebrity-level “How to test your tests” post.

Differentiation is not an attribute of a single test. Differentiation does not ride alone, because it requires multiple tests.

Tests allow us to (a) know something is wrong and (b) locate that something. We want to plant lots of clues for our future selves (or another code victim of ours) who will need to analyze and solve the problem. To explain this, and I really hate doing it, I’ll raise the ghost of our fallen enemy: Waterfall.

Years ago, when I visited water-world, I used to write SDDs. These were the dreaded Software Detailed Design documents, and we wrote them for both features and components. Of course, nobody liked them, their templates, the weird words in the beginning, and they even smelled funny. But…
They had one thing going for them: In order to write one, you had to think about what you’re going to do. (Sounds like the biggest benefit of TDD, right?). Once we reviewed the documents, it was a good starting point to ask “what if” questions. What happens in the case of disconnect? What if the component is not initialized in time?
As part of our learning, at one point we even added a test-case description to the doc, so the writer needed to also think up front of all the cases he needed to check, and we could review those too. The list also served as a check list for the implementer to test.

Back to the future

That’s right, waterfall was evil, but sometimes had some good parts in its heart. We usually give BDUF (big design up front) a bad rep, but really, it’s the effort in documentation that bothers us, not the thinking up front. Scientists have proven that thinking about something before doing it correlates to its success. Imagine that.
TDD tells us to focus on the current test. The hardcore guys take that to the extreme, but in fact, it’s really hard to do.
While we’re doing one scenario, we’re still thinking about the other “what ifs”. If we’re not doing TDD, and writing code first, as we code we’re thinking about those “what ifs”.
And we should embrace the way we think, and make the most of it.

Baking Differentiation In

We’re already doing the thinking about the scenarios, and what makes them different from each other. All we have to do now is to make sure we leave the breadcrumb trail of our thoughts.
  • Group the test cases. Put all related cases in one place, separate from others. Put them in a separate class/file and give it a distinct group name. Yes, even if there are other tests for that method – remember, convention should help us be effective, not restrict us just because it’s there.
  • Review the test names as a group. First, look for missing cases, and if there are any, write tests for them. Then review the names in the group individually and see if they complement each other. If the names overlap, move the distinction to the left, so you can differentiate between them even if the test runner does not show the entire name.
  • Review the test body. Sometimes, we “cover” the code as part of the setup for the test, and what differentiates are actual settings that differ between tests. Make the tests reflect that: separate the common setup lines from the differentiation setting and action. You can also try (but may not always succeed) to extract a common setup, and have the remaining, distinct lines remain in the test.
  • Review the covered code. You may even leave hints in the code itself, by matching names of variables and functions in the tested code to naming used in the test. However, much like stale comments, this can go bad, if things don’t get updated when refactoring. Use at your own risk.
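The “review the test body” tip can be sketched like this, in plain Java without a test framework. The Counter class is invented here; the test names echo the examples from the Readability post. The common setup is extracted into a named helper, so the line that differentiates each sibling test stands out:

```java
// A hypothetical Counter, invented for this sketch.
class Counter {
    private int value;
    void setTo(int v) { value = v; }
    int divideBy(int divisor) {
        if (divisor == 0) throw new ArithmeticException("divide by zero");
        return value / divisor;
    }
}

public class CounterDivisionTests {
    // Common setup, extracted and descriptively named.
    static Counter counterStartingAt(int start) {
        Counter c = new Counter();
        c.setTo(start);
        return c;
    }

    static void divideCounterBy5Is25() {
        Counter counter = counterStartingAt(125);
        // The distinct part: the divisor and the expected result.
        if (counter.divideBy(5) != 25) throw new AssertionError("expected 25");
    }

    static void divideCounterBy0Throws() {
        Counter counter = counterStartingAt(125);
        try {
            counter.divideBy(0);  // the distinct part: the zero divisor
            throw new AssertionError("expected ArithmeticException");
        } catch (ArithmeticException expected) { }
    }

    public static void main(String[] args) {
        divideCounterBy5Is25();
        divideCounterBy0Throws();
        System.out.println("both passed");
    }
}
```

When one of these siblings fails and the other passes, the setup they share is probably innocent; the differentiating lines point straight at the suspect.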
In order to analyze a problem when tests fail, we need to get into detective mode. The more evidence we have, the better. With enough differentiation, we can get a mental model of what works and what doesn’t, and better – where the problem might lurk, so we can go on and fix it.

Next up: Maintenance.

For training and coaching on testing and agile, contact me.

Agile Practitioners 2015: Call For Papers

Agile Practitioners 2015 is starting its way, and the first step is the Call For Papers!

The Agile Practitioners conference started 4 years ago, and is a true community effort. I’m proud to be part of the organizing committee, after presenting at the last 3 gatherings.

AP15 is continuing the tradition of bringing Israeli and international speakers to the growing agile community in Israel. This year won’t be any different. Much.

We’ve decided on 3 tracks, but this time the topics center around experience:

  • Beginner track, for those who are venturing into the agile world, and want both general information and tips on how not to mess up.
  • Advanced track, where experienced practitioners talk about their experience, both good and bad.
  • Executive track, that leaves the product development world, and makes agile accessible to other organizations like marketing, or HR.

Here’s your part

If you want to be a speaker, answer the call!

If you have an idea, you want us to help you make into a session, answer the call!

If you have another format for a session we didn’t think of, answer the call!

If you know someone who may want to give a talk, share this with them!

There’ll be more news coming on sessions and speakers. In the meantime, spread the word and join us!

Test Attribute #4 - Accuracy

This is the 4th post on test attributes that were described in the now even more famous “How to test your tests” post. If you want training and/or coaching on testing, contact me.


Accuracy is about pinpointing the location of the failing code. If we know where the offending code is, we can easily analyze what problem we caused, and move on to fixing it.

The trivial example is tests that check different methods. Of course, if one of them fails, we know where to look.

Here’s another simple case, on the same method. We have a PositiveCalculator class whose Add method adds two positive numbers, or throws an exception if they are not so positive:

public int Add(int a, int b)
{
    if ((a < 0) || (b < 0))
        throw new ArgumentException();
    return a + b;
}

We can then write the following tests:

[Test]
public void AddTwoPositiveNumbers_GetResult()
{
    PositiveCalculator calculator = new PositiveCalculator();
    Assert.That(calculator.Add(2, 2), Is.EqualTo(4));
}

[Test]
public void AddTwoNegativeNumbers_Exception()
{
    PositiveCalculator calculator = new PositiveCalculator();
    Assert.Throws<ArgumentException>(() => calculator.Add(-5, -5));
}

Looking at the tests, we already see they check two different behaviors. When we combine what we read from the tests with the tested code, it’s easy to relate the parts of the code to each test. So if one of them fails, we’ll know where to look.

Unfortunately, code doesn’t always look like this. It usually starts like that, but then grows to monster-size functions. When it does, it either becomes untestable, or incurs tests that are large, overlap each other, and test multiple things. None of those are accurate tests.

So what can we do?

Let’s start with the preventive measure: Don’t let the code grow. Be merciless about keeping methods small, and use the Single Responsibility Principle to extract code into smaller, easily testable, accurate functions.

But I didn’t write this code!

How do I make my tests accurate?

Here’s what you can do. Now that you have a test, or a bunch of them, it’s time to make use of them: Start refactoring the code. Having the tests in place will tell you if you’re breaking stuff, and it’s very easy to go back to working mode, because refactoring is also done in small steps.

Once you have broken the code into smaller pieces, you can write smaller tests, which give you the accuracy that the bigger tests didn’t have. In fact, you might want to replace the big tests with some smaller ones, if they give better information and performance for the same coverage.

We can also make the tests more accurate with the following methods:
  • One Assert per test – When you check only one thing, chances are that your test is more accurate than when checking multiple things. If you have more Asserts in your tests, break them into multiple tests.
  • Test shorter scenarios – In legacy code, it’s tempting to test large scenarios, because the code does a lot, and does not expose entry points to single operations. Try to test shorter scenarios rather than long ones, and smaller objects rather than large ones. Try to break long scenarios into short ones. If you use the big tests to refactor the code, you can then write smaller, more accurate tests.
  • Mock unrelated stuff – If you have dependencies that do multiple things, and therefore make longer scenarios, mock them. You’ll make the test more accurate because it now runs through only the relevant code you’re interested in.
  • Check the coverage – Visually if possible. IDEs and tools that show visual coverage on the code are awesome, because they add another visual clue to where the impacted code is. On trivial code they don’t matter much, but on complex code, you can compare paths of different tests, and by applying some elimination, you can find out where the problems are. You can also use the visual paths as feedback to how accurate your tests are, and if they aren’t, make them more so.
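Here’s a small sketch of where the “shorter scenarios” advice leads, in plain Java. PriceRules and its names are invented: imagine this calculation was extracted out of a monster “process order” method, and now each small test pinpoints exactly one rule:

```java
// Invented example: a small piece extracted from a much larger method,
// now accessible and accurately testable on its own.
class PriceRules {
    // 10% discount on bulk orders of 10 items or more.
    static int discountedPrice(int unitPrice, int quantity) {
        int total = unitPrice * quantity;
        return quantity >= 10 ? total - total / 10 : total;
    }
}

public class AccurateSmallTests {
    static void bulkOrderGetsTenPercentDiscount() {
        // 100 * 10 = 1000, minus 10% = 900
        if (PriceRules.discountedPrice(100, 10) != 900)
            throw new AssertionError("expected 10% bulk discount");
    }

    static void smallOrderPaysFullPrice() {
        if (PriceRules.discountedPrice(100, 2) != 200)
            throw new AssertionError("expected full price");
    }

    public static void main(String[] args) {
        bulkOrderGetsTenPercentDiscount();
        smallOrderPaysFullPrice();
        System.out.println("passed");
    }
}
```

If bulkOrderGetsTenPercentDiscount fails while smallOrderPaysFullPrice passes, the discount branch is the immediate suspect; no big-scenario debugging is needed.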
Accuracy helps us fix problems quickly. But it’s definitely not so easy to come by, because it depends very much on the tested code. However, using the combination of the methods I suggested, and making use of working tests to refactor and simplify, test accuracy is definitely within reach.

Next up: Differentiation.
For training and coaching on testing and agile, contact me.

Test Attribute #3 - Speed

This is the 3rd post on test attributes that were described in the now more famous “How to test your tests” post.
  
There’s a story I like to tell about my first TDD experience. You’ll have to hear it now (some of you for the n-th time).

It was many moons ago, when I had just completed reading Kent Beck’s excellent “Test Driven Development By Example”. And I thought: this would end all my misery.

I was working on a communication component at the time, and I thought, why not use this new TDD thing?

I’d already committed one foul ahead of writing a single line of test code, because I knew that I was going to use MSMQ for the component. So I decided on the design instead of letting the tests drive it. My level of understanding of TDD at the time is not relevant for this story. MSMQ, however, is.

For those who don’t know, MSMQ is Microsoft’s queuing service, which runs on all kinds of Windows machines. An infrastructure for asynchronous messaging, it seemed perfect for the job. It is, however, a bit slow.

So for my first test, I wrote a test that sends a message to the queue and waits to receive it back. Something like this:

[TestMethod]
public void ReceiveSentMessage()
{
    MyQueue myqueue = new MyQueue();
    myqueue.SendMessage(new Message("Hi"));
    Message receivedMessage = myqueue.Receive();
    Assert.AreEqual("Hi", receivedMessage.Body);
}
Since we’re talking about speed, here’s the thing: This single test ran around 3 seconds. What happens if I had a hundred more like it?

The Death Spiral Of Slow Tests

I was so happy I had a passing test, I didn’t notice that it took a few seconds to run. Most people starting out with unit testing don’t notice that. They keep accumulating slow tests in their suite, until one day they reach a tipping point.

Let’s take, for example, a suite that takes 15 minutes to run. And let’s say I’m a very patient person. I know, just work with me.

Up to this point I had no problem running a full suite every hour.

Then, at that 15 minute point, I decide that running the suite every hour cripples my productivity. So I decide that I’ll run the tests twice a day. One run will be over lunch, and the 2nd will start as I leave the office. That way I won’t need to wait; the results will be there when I get back to work.

That leaves me more time to write code (and hopefully some tests). So I write more code, and when I get back from lunch, there are a few red tests. Since I don’t know exactly what’s wrong (I can’t tell exactly which parts of the big chunks of code I added productively are the ones to blame), I’ll spend an hour debugging the failing tests. And repeat that tomorrow morning, and the next lunch break.

Until I realize that I now spend 2 hours a day working on fixing tests. That’s 25% of my time working for my tests, instead of them working for me.

Which is where I stop writing more tests, because I see the cost, and no value from them. And then I stop running them, because, what’s the point?

I call it “The Death Spiral Of Doom”, and many developers who start doing testing fall down it. Many never climb up again.

If we reverse the process, we’ll see quite the opposite. If my suite runs in a matter of seconds, or faster, I run it more often. When a test breaks, I know what caused the problem, because I know it was caused by a change I made in the last few minutes. Fixing it may not even require debugging, because it’s still fresh in my mind. Development becomes smoother and quicker.

Quick Feedback Is Mandatory

Tests should run quickly. We’re talking hundreds and thousands in a matter of seconds. If they don’t, we’ll need to do something about them.

Quick feedback is not only an important agile property. It is essential for increasing velocity. If we don’t work at it, the entire safety net of our tests can come crashing down.

So what can we do?
  • Analyze. The length of tests is part of every test report, so it’s not even subjective. Look at those tests, and see which are the ones that take longer to run.

  • Organize.  Split the tests to slow running and quick running. Leave the slow running to a later automated build cycle, so you’ll be able to run the quick ones without penalty.

  • Mock. Mocking is a great way to speed up tests. If a dependency (like my MSMQ service) is slow, mock it.

  • Groom. Not all the tests should be part of our automated build forever. If there’s a part of code that you never touch, but has 5 minute test suite around it, stop running it. Or run those on the nightly cycle.

  • Upgrade. You’ll be surprised how quicker better hardware runs your tests. The cost may be marginal compared to the value of quick feedback.
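The “Mock” tip, applied to the MSMQ story, could look like this sketch in plain Java. MessageQueue and InMemoryQueue are invented names; the idea is to hide the slow transport behind an interface, and let tests use an in-memory fake that answers in microseconds instead of seconds:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hide the slow queue (MSMQ, in the original story) behind an interface.
interface MessageQueue {
    void sendMessage(String body);
    String receive();
}

// A fake for tests: a plain in-memory FIFO, no service round-trips.
class InMemoryQueue implements MessageQueue {
    private final Deque<String> messages = new ArrayDeque<>();
    public void sendMessage(String body) { messages.addLast(body); }
    public String receive() { return messages.pollFirst(); }
}

public class FastQueueTest {
    public static void main(String[] args) {
        MessageQueue queue = new InMemoryQueue();  // fast, deterministic
        queue.sendMessage("Hi");
        String received = queue.receive();
        if (!"Hi".equals(received))
            throw new AssertionError("expected to receive the sent message");
        System.out.println("passed");
    }
}
```

The production code gets a real MSMQ-backed implementation of the same interface; the hundreds of tests that exercise the component’s logic run against the fake and stay in the quick suite.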
The key thing is ongoing maintenance of the test suite. Keep analyzing your suite, and you’ll see where you can optimize, without taking bigger risks.

The result is a quick safety net you can trust.

Next up: Accuracy.
For training and coaching on testing and agile, contact me.

Test Attribute #2 - Readability

This is the 2nd post on test attributes that were described in the now famous “How to test your tests” post.
We often forget that most of the value we get from tests comes after we’ve written them. Sure, TDD helps design the code, but let’s face it, once everything works for the first time, our tests become the future guardians of our code. Once they’re in place, we can change the code, hopefully for the better, knowing that everything still works.
But if (and when) something breaks, there’s work to be done. We need to understand what worked before that doesn’t now. We then need to analyze the situation: According to what we’re doing right now, should we fix this problem? Or is this new functionality that we now need to cover with new tests, throwing away the old one? Finally there’s coding and testing again, depending on the result of our analysis.
The further we move in time, the more the tests and code get stale in our minds. Then we forget about them completely. The cost of changes rises. The analysis phase becomes longer, because we need to reacquaint ourselves with the surroundings. We need to re-learn what still works, and what has stopped working. Even if we once knew, we don’t remember which changes can cause side effects, and how those will play out.
Effective tests minimize this process. They need to be readable.
Readability is subjective. What I find readable now (immediately after I wrote it) will not seem so in 6 months. Let alone to someone else.
So instead of trying to define test readability, let’s break it down to elements we care about, and can evaluate.

What A Name

The most important part of a test (apart from testing the right thing) is its name. The reason is simple: when a test breaks, the name is the first thing we see. This is the first clue we get that something is wrong, and therefore it needs to tell us as much as possible.
The name of a test should include (at least) the specific scenario and the expected result of our test. If we’re testing APIs, it should say that too.
For example:
@Test
public void divideCounterBy5Is25() { ...
I can understand what the test does (a scenario about dividing Counter), the details (division by 5) and the expected result for this scenario (25).
If it sounds like a sentence – even better. Good names come from verbally describing them.
It doesn’t matter if you use capitalization, underscore, or whatever you choose. It is important that you use the same convention that your team agrees on.
Names should also be specific enough to mentally discern from other sibling tests. So, in our example:

@Test
public void divideCounterBy0Throws() { ...
This name is similar enough to the first to identify it as a “sibling” scenario, because of the resemblance in the prefix. The specific scenario and result are different. This is important because when those two appear together in the test runner, and one fails while the other doesn’t, it helps us locate the problem before even starting to debug. These are clues to resolve the problem.

What A Body

If our names don’t help locate the problem, the test body should fill the gaps. It should contain all the information needed to understand the scenarios.
Here are a few tips to make test code readable:
  • Tests should be short. About 10-15 lines short.
  • If the setup is long, extract it to functions with descriptive names.
  • Avoid using pre-test functions like JUnit’s @Before or MSTest [TestInitialize]. Instead use methods called directly from the test. When you look at the code, setup and tear down need to be visible, otherwise, you’ll need to search further, and assume even more. Debugging tests with setup methods is no fun either, because you enter the test in a certain context that may surprise you.
  • Avoid using base test classes.  They too hide information relevant for understanding the scenarios. Preferring composition over inheritance works here too.
  • Make the assert part stand out.
  • Make sure that body and name align.
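Several of these tips can be shown in one small sketch, in plain Java without a test framework. Account and the names here are invented: the setup is a descriptively named method called directly from the test (rather than hidden in @Before/[TestInitialize]), and the assert stands out as the last step:

```java
// A hypothetical Account class, invented for this sketch.
class Account {
    private int balance;
    void deposit(int amount) { balance += amount; }
    void withdraw(int amount) { balance -= amount; }
    int balance() { return balance; }
}

public class ReadableSetupTest {
    // Setup with a descriptive name, called explicitly from the test body,
    // so a reader never has to hunt for hidden initialization.
    static Account accountWithBalance(int balance) {
        Account account = new Account();
        account.deposit(balance);
        return account;
    }

    static void withdrawReducesBalance() {
        Account account = accountWithBalance(100);  // setup, visible here

        account.withdraw(30);                       // the action

        // The assert stands out as the final step.
        if (account.balance() != 70)
            throw new AssertionError("expected balance of 70");
    }

    public static void main(String[] args) {
        withdrawReducesBalance();
        System.out.println("passed");
    }
}
```

Everything needed to understand the scenario sits inside the 3 short lines of the test body: the starting balance, the action, and the expectation.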
Analysis takes time, and the worst kind (and slowest) requires debugging. Tests (and code) should be readable enough to help us bypass this wasteful process.

Better Readability In Minutes

We are biased, and think that we write code and tests so good that everyone can understand them. We’re often wrong.
In agile, feedback is the answer.
Use the “Law of the Third Ear”. Grab an unsuspecting colleague’s ear, pull it close to your screen, and she can tell you if she understands what the test does.
Even better, pair while writing the tests. Feedback comes as a by-product, and you get better tests.
Whatever you do, don’t leave it for later. And don’t use violence if you don’t need to.
Make tests readable now, so you can read them later.


Next up: Speed.
For training and coaching on testing and agile, contact me.

Test Attribute #1 - Validity

In my last post, I created a list of test attributes. If one of them isn’t ok, you need to do some fixin’.
This is the first of a series of posts that is going to discuss the different faces of tests.
Let’s start with validity. Admittedly, it’s not the first attribute I thought about. What are the chances we’re going to write a wrong test?

How can this happen?

We usually write tests based on our understanding of the code and the requirements we need to implement, and we fill the gaps with assumptions.
We can be wrong about either. Or both.
A more interesting question is: How do we know we’ve written an incorrect test?
We find out we have the wrong tests in one or more ways:
  • The Review: Someone looks at our test and code, and tells us we’re either testing the wrong thing, or that it isn’t the way to prove our code works.
  • The Surprise: Our test surprisingly fails, where it should have passed. It can happen the other way too.
  • The Smell: We have a feeling we’re on the wrong path. The test passes, but something feels wrong.
  • The Facepalm: A penny drops, and we get the “what the hell are you doing?” feeling, followed by the “glad no one noticed” one.
Once we know what is wrong we can easily fix it. But wouldn’t it be easier if we avoided this altogether?
The easiest way is to involve someone. Pair programming helps avoid bugs and writing the wrong test. Obviously, in the Review a partner helps, but they can also avoid the Surprise, Smell and the Facepalm. And you don’t want them there when you have a Facepalm moment, right?
One of the foundations of agile development is feedback. We all make mistakes. In agile development we acknowledge that, so we put safeguards in place, such as pair programming, to identify problems early.
Next up: Readability.

For training and coaching on testing and agile, contact me.