The word "legacy" has a lot of connotations. Mostly bad ones. We seem to forget that our beautiful code reaches "legacy" status three days after we write it. Michael Feathers, in his wonderful book "Working Effectively with Legacy Code", defined legacy code as code that doesn't have tests, and there is truth in that, although it took me a while to fully understand it.
Code that doesn't have tests rots. It rots because we don't feel confident enough to touch it; we're afraid to break the "working" parts. Rotting means the code doesn't change, staying the way we first wrote it. I'll be the first to admit that whenever I write code, it comes out in its ugly form. It may not look ugly immediately after I've written it, but if I wait a couple of days (or a couple of hours), I know I'll find many ways to improve it. Without tests, I can rely either on the automatic capabilities of refactoring tools, or on pure guts (read: stupidity).
Most code doesn't look nice right after we write it. But nice doesn't matter.
Because code costs money to maintain, we'd like it to be easy to understand and to minimize debugging time. Refactoring is essential to lower maintenance costs, and therefore tests are essential.
And this is where you start paying
The problem, of course, is that writing tests for legacy code is hard. The code is convoluted, full of dependencies both near and far, and without proper tests it's risky to modify. On the other hand, legacy code is the code that needs tests the most. It is also the most common code out there: most of the time we don't write new code, we add to an existing code base.
In most cases, we will need to change the code in order to test it. Here are some examples of why:
- We can't create an instance of the tested object
- We can't decouple it from its dependencies
- Singletons are created once, and impact the different scenarios
- Algorithms are not exposed through a public interface
- There are dependencies in base classes of the tested code
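To make this concrete, here's a minimal sketch of code that suffers from several of these problems. All the names (`Database`, `ReportService`, and so on) are hypothetical, invented for illustration:

```java
// A singleton, created once and shared by every scenario.
class Database {
    private static final Database INSTANCE = new Database();
    static Database getInstance() { return INSTANCE; }
    String query(String sql) { return "real data"; } // imagine a network call here
}

class ReportService {
    // The dependency is fetched, not injected - we can't decouple from it.
    private final Database db = Database.getInstance();

    String buildReport() {
        String rows = db.query("SELECT * FROM sales");
        return format(rows);
    }

    // The interesting algorithm is hidden behind private access.
    private String format(String rows) {
        return "Report: " + rows;
    }
}
```

Any test of `buildReport` drags in the real `Database`, and the formatting logic can't be reached directly at all.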
Some tools, like PowerMockito in Java or Typemock Isolator for C#, allow us to bypass some of these problems, although they too come with a price: lower speed and code lock-down. The lower speed comes from the way these tools work, which makes them slower than other mocking tools. The code lock-down is a side effect of extra coupling to the tested code: the more of the power tools' capabilities we use, the more the tests know about the implementation. This couples the tests to the code, and therefore makes the tests more fragile. Fragile tests carry a bigger maintenance cost, so people try not to change them, or the code. While this looks like a technology barrier, it can be overcome by procedure and leadership (e.g., once we have the tests, encourage the developers to improve the code and the tests).
Even with the power tools, we'll be left with some work. We might even want to do some of it up front: a few tweaks to the tested code before we write the tests (as long as they are not risky) can simplify the tests. Unless, of course, the code was written simple and readable the first time. Yeah, right.
We’ll need to do some of the following:
- Expose interfaces
- Derive and implement classes
- Change accessibility
- Use dependency injection
- Add accessors
- Extract method
- Extract class
Some of these changes introduce "seams" into the code. Through these seams, we can insert probes to check the impact of our changes. Other changes just help us make sense of the code. We can debate whether these are refactoring patterns or not. If we apply them wisely, and more importantly SAFELY, we can prepare the code to be tested more easily and make the tests more robust.
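Here's a sketch of what a few of these moves look like together, again with hypothetical names: the database dependency is hidden behind an extracted interface and injected through the constructor, which creates a seam where a test can slide in a probe.

```java
// "Expose interfaces": the dependency is now an abstraction.
interface DataSource {
    String query(String sql);
}

class ReportService {
    private final DataSource db;

    // "Use dependency injection": the caller decides which DataSource to use.
    ReportService(DataSource db) {
        this.db = db;
    }

    String buildReport() {
        return "Report: " + db.query("SELECT * FROM sales");
    }
}

// A probe for the seam: records what the tested code asked for,
// and returns canned data instead of hitting a real database.
class FakeDataSource implements DataSource {
    String lastSql;

    public String query(String sql) {
        lastSql = sql;
        return "canned rows";
    }
}
```

A test can now construct `ReportService` with a `FakeDataSource`, assert on the report it builds, and inspect `lastSql` to verify which query was sent, all without touching a real database.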
In the upcoming posts I'll look into these in more detail.
Image source: http://memecrunch.com/meme/19PWL/exploring-legacy-code