
A Response to “Why Most Unit Testing is Waste”

My take: the arguments are right, it’s just the conclusion that’s wrong.

Just because many people still don’t do unit testing right doesn’t mean unit testing in itself should be dismissed as a useful technique.

When you practice TDD, unit tests are a byproduct, not a goal in themselves. OTOH, once written, the cost of keeping them around is minimal compared to the cost of a bug they would have caught – I don't recall whether it's "Code Complete" or "Writing Solid Code" that states that a bug left unfixed in one phase of development is ten times as expensive to fix in the next. So sure, you _should_ regularly prune your unit tests and remove the ones that are no longer relevant; but why remove tests that don't fail yet still test something relevant, as long as they're so cheap to maintain?

The main benefit of unit tests isn't that they test your application. It's that they provide the developer with a really neat, controllable, closed-scope environment for testing little bits of functionality.

Picture this: you add a new piece to your code – let's talk about one that isn't very algorithmic – to signal changes in some model object by putting messages into a message queue. What you need to test are two things: that messages get placed into the queue when the business rules say they should (even if the message queue is far removed from what the business talks about, it's still a business requirement that arbitrary clients should be able to listen to model changes, just translated into a lower-level language), and that the message payload correctly reflects the model changes.

Automated tests at a higher level than unit testing would need to attach a message consumer to the message queue. Now picture the poor programmer testing this _without_ unit tests: firing up a local instance of the message queue, potentially without the ability to step into its code with the debugger; firing up an application server from a debugger; and writing a message consumer solely for testing purposes, which does all the inspection and assertion on message structure, and on whether all expected messages were received and no unexpected message arrived in the process. And imagine how smart this consumer would need to be in order to tell apart legitimate messages which aren't generated by the test. And imagine how lengthy the debugging cycle would be.

Compare this to calling the code you want to test from within a completely mocked and controlled environment, as in the sketch below. Obviously, a unit test using mocks throughout doesn't say much about your entire application's correctness, but it surely helps you debug code faster. Besides, building up a correctly functioning application from bricks that are known to work correctly, for a given value of "correctly", is far more likely to succeed than hitting the right spot with components you don't even know to be correct, for the same value of "correctly".
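To make this concrete, here is a minimal sketch of such a mock-based test in Java. Everything in it is hypothetical, invented for illustration – the `MessageQueue` port, the `ModelChangePublisher`, the `price-changed` payload format – and JUnit 5 is assumed as the test framework:

```java
import static org.junit.jupiter.api.Assertions.*;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical port: the only thing the business code knows about the queue.
interface MessageQueue {
    void publish(String payload);
}

// Hypothetical system under test: signals model changes via the queue.
class ModelChangePublisher {
    private final MessageQueue queue;

    ModelChangePublisher(MessageQueue queue) {
        this.queue = queue;
    }

    void onPriceChanged(String productId, double oldPrice, double newPrice) {
        // Assumed business rule: only signal actual changes.
        // (Naive double comparison – fine for a sketch.)
        if (oldPrice != newPrice) {
            queue.publish("price-changed:" + productId + ":" + newPrice);
        }
    }
}

class ModelChangePublisherTest {
    // In-memory fake standing in for the real message queue:
    // it just records what was published, so assertions are trivial.
    static class FakeQueue implements MessageQueue {
        final List<String> published = new ArrayList<>();
        @Override public void publish(String payload) { published.add(payload); }
    }

    @Test
    void publishesExactlyOneMessageReflectingTheNewPrice() {
        FakeQueue queue = new FakeQueue();
        new ModelChangePublisher(queue).onPriceChanged("SKU-1", 9.99, 11.49);

        assertEquals(List.of("price-changed:SKU-1:11.49"), queue.published);
    }

    @Test
    void staysSilentWhenNothingActuallyChanged() {
        FakeQueue queue = new FakeQueue();
        new ModelChangePublisher(queue).onPriceChanged("SKU-1", 9.99, 9.99);

        assertTrue(queue.published.isEmpty());
    }
}
```

The point is not that this proves the application works end to end – it doesn't. The point is that both business requirements (a message is sent when it should be, and its payload is right) can be exercised and debugged in milliseconds, with no broker, no application server, and no throwaway consumer.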

Put more succinctly: heavily unit-tested code makes all bugs found by higher-level tests shallow and easy to diagnose and fix. Having higher-level tests expose some obscure corner case deep inside a very fine-grained, low-level component doesn't help much in fixing it. It's therefore much more likely that you'll build a reliable application from a heavily unit-tested code base with few higher-level tests than from a code base covered only by a similar amount of higher-level tests – precisely because the combinatorial complexity of the testing problem explodes as the system under test grows in size.

The assumption that cheap, massive computing power makes it affordable for programmers to do most testing and debugging with high-level tests is wrong. Running the full suite of functional and integration tests on a program may take ten or a hundred times as long as running the compile + unit test cycle. Sure, since computing power is so cheap, the machine cost of running the higher-level tests is still close to nothing. But for the programmer, a debug cycle of one second (the time required to hit a breakpoint under the debugger from a unit test) versus one of two minutes (when trying to do the same in a non-trivial application through a feature test) is not at all similarly cheap.

An analogy (a far from perfect one, I agree, but one which still illustrates my point, I'd say): cars are built from parts. Each part – even components as simple as screws – is tested to some extent before the entire car is assembled. Imagine what it would mean, in terms of cost, to skip parts testing entirely and only do an extensive test drive once the car is assembled.

Yes, most unit testing is waste, if you only look at it from the point of view of ensuring application quality. It’s not at all waste if you look at unit tests as a development tool.

And there’s one more reason not to give up on extensive unit testing, even if it mostly looks like waste when measured exclusively against overall application quality. It has to do with how the software development process works.

Software development, IMO, is all about successive translations and enrichments of some piece of information. First, the customer tells you what’s hurting – he states the problem. Then, you (or a BA) model a solution in terms of the business domain. Then you translate this into the terms of an emerging solution domain (DDD ringing a bell?). Finally, you are ready to start coding – the code being nothing other than a rephrasing of that solution in yet another language. Why would you not start using a language as precise as a programming language as early as possible? If you do, you end up with unit tests before integration tests – and your unit tests actually reflect business rules, as in the toy example below. Keeping them around lets the compiler and the test suite re-verify your translation of the business domain model into the solution domain model on each build. True, you still need to regularly verify, manually and mentally, that this translation is still valid, but that’s a much smaller jump than the one from business rules written in natural language to the actual implementation – and thus far less of a risk for costly human errors.
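As a toy illustration of that last point (assumed for this post, not taken from the original article): a business rule such as “orders of 100 EUR or more get a 5% loyalty discount” can be restated as a unit test almost word for word, long before any integration wiring exists. All names below – `DiscountPolicy`, the threshold, the rate – are hypothetical:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.math.BigDecimal;
import org.junit.jupiter.api.Test;

// Hypothetical domain rule, as the business might state it:
// "Orders of 100 EUR or more get a 5% loyalty discount."
class DiscountPolicy {
    static final BigDecimal THRESHOLD = new BigDecimal("100.00");
    static final BigDecimal RATE = new BigDecimal("0.05");

    BigDecimal discountFor(BigDecimal orderTotal) {
        return orderTotal.compareTo(THRESHOLD) >= 0
                ? orderTotal.multiply(RATE)
                : BigDecimal.ZERO;
    }
}

// The tests restate the business rule almost verbatim, just in Java.
class DiscountPolicyTest {
    private final DiscountPolicy policy = new DiscountPolicy();

    @Test
    void ordersOfAHundredEurosOrMoreGetFivePercentOff() {
        assertEquals(new BigDecimal("5.0000"),
                policy.discountFor(new BigDecimal("100.00")));
    }

    @Test
    void smallerOrdersGetNoDiscount() {
        assertEquals(BigDecimal.ZERO,
                policy.discountFor(new BigDecimal("99.99")));
    }
}
```

Note how each test name reads like the rule it encodes: the suite doubles as an executable restatement of the business domain model, which is exactly the kind of translation check this paragraph argues for.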