A very common beginner mistake when writing recursive functions is to accidentally fire off completely redundant recursive calls from the same function. For example, consider this recursive function that finds the maximum value in a binary tree (not a binary search tree):
int BinaryTreeMax(Tree* root) {
if (root == null) return INT_MIN;
int maxValue = root->value;
if (maxValue < BinaryTreeMax(root->left))
maxValue = BinaryTreeMax(root->left); // (1)
if (maxValue < BinaryTreeMax(root->right))
maxValue = BinaryTreeMax(root->right); // (2)
return maxValue;
}
Notice that this program potentially makes two completely redundant recursive calls to BinaryTreeMax in lines (1) and (2). We could rewrite this code so that there’s no need for these extra calls by simply caching the value from before:
int BinaryTreeMax(Tree* root) {
if (root == null) return INT_MIN;
int maxValue = root->value;
int leftValue = BinaryTreeMax(root->left);
int rightValue = BinaryTreeMax(root->right);
if (maxValue < leftValue)
maxValue = leftValue;
if (maxValue < rightValue)
maxValue = rightValue;
return maxValue;
}
Now, we always make exactly two recursive calls.
My question is whether there is a tool that does either a static or dynamic analysis of a program (in whatever language you’d like; I’m not too picky!) that can detect whether a program is making completely unnecessary recursive calls. By “completely unnecessary” I mean that
- The recursive call has been made before,
- by the same invocation of the recursive function (or one of its descendants), and
- the call itself has no observable side-effects.
This is something that can usually be determined by hand, but I think it would be great if there were some tool that could flag things like this automatically as a way of helping students gain feedback about how to avoid making simple but expensive mistakes in their programs that could contribute to huge inefficiencies.
Does anyone know of such a tool?
First, your definition of ‘completely unnecessary’ is insufficient. It is possible that some code between the two function calls affects the result of the second function call.
Second, this has nothing to do with recursion, the same question can apply to any function call. If it has been called before with the exact same parameters, has no side-effects, and no code between the two calls changed any data the function accesses.
Now, I’m pretty sure a perfect solution is impossible, as it would solve The Halting Problem, but that doesn’t mean there isn’t a way to detect enough of these cases and optimize away some of them.
Some compilers know how to do that (GCC has a specific flag that warns you when it does so). Here’s a 2003 article I found about the issue: http://www.cs.cmu.edu/~jsstylos/15745/final.pdf .
I couldn’t find a tool for this, though, but that’s probably something Eric Lipert knows, if he happens to bump into your question.