I’m using functions instead of classes, and I find that I can’t tell whether a function that another function relies on is a dependency that should be unit-tested on its own, or an internal implementation detail that should not be. How can you tell which one it is?
A little context: I’m writing a very simple Lisp interpreter which has an eval() function. It’s going to have a lot of responsibilities, too many actually, such as evaluating symbols differently than lists (everything else evaluates to itself). Evaluating a symbol has its own complex workflow (environment lookup), and evaluating a list is even more complicated, since the list can be a macro, function, or special form, each of which has its own complex workflow and set of responsibilities.
I can’t tell if my eval_symbol() and eval_list() functions should be considered internal implementation details of eval() which should be tested through eval()’s own unit tests, or genuine dependencies in their own right which should be unit-tested independently of eval()’s unit tests.
A significant motivation for the “unit test” concept is to control the combinatorial explosion of required test cases. Let’s look at the examples of eval, eval_symbol and eval_list.

In the case of eval_symbol, we will want to test contingencies where the symbol’s binding is:

- missing (i.e. the symbol is unbound)
- in the global environment
- directly within the current environment
- inherited from a containing environment
- shadowing another binding
- … and so on
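To make those contingencies concrete, here is a minimal Python sketch of what such tests might look like. The question does not show the interpreter's code, so the `Env` class, the `UnboundSymbolError` exception and the signature of `eval_symbol` are all assumptions made up for illustration.

```python
class UnboundSymbolError(LookupError):
    """Hypothetical error raised when a symbol has no binding anywhere."""


class Env:
    """Hypothetical environment: a dict of bindings plus an optional parent."""

    def __init__(self, bindings=None, parent=None):
        self.bindings = dict(bindings or {})
        self.parent = parent


def eval_symbol(name, env):
    """Walk outward from env until a binding for name is found."""
    scope = env
    while scope is not None:
        if name in scope.bindings:
            return scope.bindings[name]
        scope = scope.parent
    raise UnboundSymbolError(name)


# A three-level chain: global <- outer <- inner.
global_env = Env({"x": 1, "y": 2})
outer = Env({"y": 20, "z": 30}, parent=global_env)
inner = Env({"z": 300, "w": 4}, parent=outer)

assert eval_symbol("x", global_env) == 1   # in the global environment
assert eval_symbol("w", inner) == 4        # directly within the current environment
assert eval_symbol("y", inner) == 20       # inherited from a containing environment
assert eval_symbol("z", inner) == 300      # shadowing another binding

try:
    eval_symbol("missing", inner)          # the symbol is unbound
except UnboundSymbolError:
    pass
else:
    raise AssertionError("expected UnboundSymbolError")
```

Each case needs only a small environment chain and a single call, which is what keeps the count of symbol-related tests at this level manageable.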
In the case of eval_list, we will want to test (among other things) what happens when the list’s function position contains a symbol with:

- no function or macro binding
- a function binding
- a macro binding
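The three cases above could be exercised along the following lines. This is only a sketch under several assumptions: the `Symbol`, `Function` and `Macro` types, the `NotCallableError` exception, and the flat dict environment are hypothetical, the top-level evaluator is named `eval_form` to avoid shadowing Python's built-in `eval`, and special forms and empty lists are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass
class Function:
    impl: Callable      # called with already-evaluated arguments


@dataclass
class Macro:
    expand: Callable    # called with unevaluated forms, returns a new form


class NotCallableError(TypeError):
    """Hypothetical error for a head that is neither a function nor a macro."""


def eval_symbol(sym, env):
    # Flat dict lookup for brevity; a real interpreter would walk a scope chain.
    return env[sym.name]


def eval_form(form, env):
    if isinstance(form, Symbol):
        return eval_symbol(form, env)
    if isinstance(form, list):
        return eval_list(form, env)
    return form         # everything else evaluates to itself


def eval_list(form, env):
    head, *args = form
    target = eval_form(head, env)
    if isinstance(target, Macro):
        # Macros see the raw argument forms; the expansion is then evaluated.
        return eval_form(target.expand(*args), env)
    if isinstance(target, Function):
        return target.impl(*[eval_form(arg, env) for arg in args])
    raise NotCallableError(f"{head!r} is not bound to a function or macro")


env = {
    "add": Function(lambda a, b: a + b),
    "twice": Macro(lambda form: [Symbol("add"), form, form]),
    "n": 21,
}

assert eval_list([Symbol("add"), 1, 2], env) == 3          # a function binding
assert eval_list([Symbol("twice"), Symbol("n")], env) == 42  # a macro binding

try:
    eval_list([Symbol("n"), 1], env)                       # no function or macro binding
except NotCallableError:
    pass
else:
    raise AssertionError("expected NotCallableError")
```

Note that none of these tests cares where the binding lives (global, local, inherited); that question belongs to the tests for eval_symbol.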
eval_list will invoke eval_symbol whenever it needs a symbol’s binding (assuming a Lisp-1, that is). Let’s say that there are S test cases for eval_symbol and L symbol-related test cases for eval_list. If we test each of these functions separately, we could get away with roughly S + L symbol-related test cases. However, if we wish to treat eval_list as a black box and to test it exhaustively without any knowledge that it uses eval_symbol internally, then we are faced with S × L symbol-related test cases (e.g. global function binding, global macro binding, local function binding, local macro binding, inherited function binding, inherited macro binding, and so on). That’s a lot more cases. eval is even worse: as a black box the number of combinations can become incredibly large, hence the term combinatorial explosion.

So, we are faced with a choice of theoretical purity versus actual practicality. There is no doubt that a comprehensive set of test cases that exercises only the “public API” (in this case, eval) gives the greatest confidence that there are no bugs. After all, by exercising every possible combination we may turn up subtle integration bugs. However, the number of such combinations may be so prohibitively large as to preclude such testing. Not to mention that the programmer will probably make mistakes (or go insane) reviewing vast numbers of test cases that only differ in subtle ways. By unit-testing the smaller internal components, one can vastly reduce the number of required test cases while still retaining a high level of confidence in the results, which is a practical solution.

So, I think the guideline for identifying the granularity of unit testing is this: if the number of test cases is uncomfortably large, start looking for smaller units to test.
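The S + L versus S × L arithmetic can be checked with a few lines of Python. The case lists below are illustrative only, borrowed loosely from the contingencies named above; they are not a complete test plan.

```python
from itertools import product

# Illustrative case lists, not an exhaustive inventory.
symbol_cases = ["global", "local", "inherited", "shadowing"]            # S = 4
list_cases = ["function binding", "macro binding", "neither binding"]  # L = 3

# Testing the two functions as separate units: S + L cases.
separate = len(symbol_cases) + len(list_cases)

# Testing eval_list as a black box: every pairing, S * L cases.
black_box = list(product(symbol_cases, list_cases))

print(separate)        # 7
print(len(black_box))  # 12
print(black_box[:2])   # [('global', 'function binding'), ('global', 'macro binding')]
```

The gap is small here, but it widens quickly: with 10 cases on each side the comparison is 20 against 100, and every further layer stacked on top (such as eval itself) multiplies the black-box count again.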
In the case at hand, I would absolutely advocate testing eval, eval_list and eval_symbol as separate units precisely because of the combinatorial explosion. When writing the tests for eval_list, you can rely upon eval_symbol being rock solid and confine your attention to the functionality that eval_list adds in its own right. There are likely other testable units within eval_list as well, such as eval_function, eval_macro, eval_lambda, eval_arglist and so on.