I faced this question in an interview recently.
The original question was
Given a pointer to a struct (structured so that it can point to either a binary tree or a doubly linked list), write a function that returns whether it is pointing to a binary tree or a DLL. The struct is defined like this:
struct node
{
/*data member*/
node *l1;
node *l2;
};
I dived into the problem straightaway, but then I realized there is some ambiguity in it. What if the pointer doesn’t point to either of them (that is, it is a malformed DLL or a malformed tree)? The interviewer told me that I then have to write the function so that it can report all three cases. So the return value of the function becomes an enum of the form
enum StatesOfRoot
{
TREE,
DLL,
INVALID_DATA_STRUCTURE, /* case of malformed dll or malformed tree */
EITHER_TREE_DLL, /* case when there is only 1 node */
};
So the problem reduced to verifying the properties of a binary tree and a DLL. For the DLL it was easy.
For the binary tree, the only verification I could think of was that there should not be more than one path to a node from the root (or, put differently, there should not be any loops).
So I proposed that we do a depth-first search and keep track of the visited nodes using either a HashMap (which the interviewer rejected straightaway) or a set of visited nodes maintained as a BST (I wanted to use std::set, but the interviewer suddenly popped up another restriction: I can’t use the STL). He rejected this idea, saying that I am not allowed to use any other data structure. Then I proposed a modified version of the tortoise and hare algorithm (treating each branch of the binary tree as a singly linked list), to which he said this won’t work.
After that I went on to propose a few more solutions which were sort of ugly (they involved deleting nodes, maintaining a copy of the tree, etc.).
The Core of the problem
Then the interviewer proposed his solution. He said we can count the number of vertices and the number of edges and assert the relation number of vertices = number of edges + 1 (a property which has to hold for a binary tree). What baffled me was how we can count the number of vertices without using any additional data structure. He said it can be done by simply performing any traversal (preorder, postorder, inorder). I asked how we would prevent an infinite loop if there is a loop in the tree, since we are not tracking the visited nodes. He said this is possible but didn’t tell me how. I seriously doubt his approach. Can anyone provide some insight on whether the solution he proposed was right? If yes, how would you explicitly maintain a count of distinct vertices? Note that what you are passed is just a pointer; you have no other information.
PS: Later I received a notification that I am through to the next round without ever giving the interviewer a final solution. Was it supposed to be a trick round?
EDIT :
Just to make things clear, if we assume that the 3rd case is not present (that is, we are guaranteed it’s a DLL or a binary tree), then the problem is trivial. It’s the tree part of the 3rd case that is driving me crazy. Kindly note this point while answering.
You are right to be skeptical of his solution.
Doubly-Linked list is the easy one. DLLs enforce the invariants:
The preceding is easy to check by walking over the DLL with only an extra temporary variable.
(Note: checking 3 and 4, or 5 may take a long time.)
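As a rough sketch of such a walk, the following assumes that l1 plays the role of "previous", that l2 plays the role of "next", that the pointer passed in is the head of a null-terminated list, and that the payload is an int. None of that is stated in the question, and the function name isWellFormedDll is made up for illustration.

```cpp
struct node {
    int data;   // placeholder payload (assumption, the question elides it)
    node *l1;   // assumed to act as the "previous" pointer
    node *l2;   // assumed to act as the "next" pointer
};

// Returns true if `head` starts a well-formed, null-terminated doubly linked list.
// Only one temporary pointer is used; no nodes are marked or stored.
bool isWellFormedDll(const node *head) {
    if (head == nullptr) return true;        // treat the empty list as well-formed
    if (head->l1 != nullptr) return false;   // the head must have no predecessor
    const node *cur = head;
    while (cur->l2 != nullptr) {
        // The back-link of the next node must point to the current node.
        if (cur->l2->l1 != cur) return false;
        cur = cur->l2;
    }
    return true;
}
```

Under these assumptions the walk cannot run forever: a forward link that re-enters the list at an already visited node fails the back-link check, because that node's l1 already points at its original predecessor (or is null for the head).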
Binary Tree is the hard one. BTs enforce the invariants:
As you suggested, these may be determined by traversing the tree and marking each node visited to ensure that no node gets visited twice, or alternatively storing a list of each node visited (such as in a hash-set or other structure) to quickly look-up if the node is distinct.
You could probably validate that there are no loops in the tree without another data structure by traversing the tree while keeping track of your current depth. If you got deeper in the tree than there is memory in the computer (or visited more nodes than could fit in memory), you would be sure to be in an infinite loop.
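A minimal sketch of that idea, using a visit budget instead of an actual memory limit. The budget value, the int payload, and the name walkWithinBudget are assumptions for illustration; in practice the budget would have to be derived from something like the address-space size divided by sizeof(node).

```cpp
#include <cstddef>

struct node {
    int data;   // placeholder payload (assumption)
    node *l1;
    node *l2;
};

// Pre-order walk that gives up once `budget` node visits have been used.
// Returns true if the whole structure was walked within the budget,
// false if the budget ran out (which would indicate a loop when the
// budget exceeds the number of nodes that could possibly exist).
bool walkWithinBudget(const node *cur, std::size_t &budget) {
    if (cur == nullptr) return true;
    if (budget == 0) return false;
    --budget;
    return walkWithinBudget(cur->l1, budget) &&
           walkWithinBudget(cur->l2, budget);
}
```

The walk uses the call stack for recursion but no other bookkeeping. Finishing within the budget only shows that the walk terminated; a node reached by two different paths is simply visited twice.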
However, that doesn’t help us distinguish Binary “Directed Acyclic Graphs” (DAGs) from Binary Trees.
If, however, we knew the count of elements in the tree, as is usually the case for library implementations of binary trees, we could detect an infinite loop by comparing the number of edges counted with the previously known number of nodes, like the interviewer suggested.
Without knowing that number ahead of time, it is difficult to know the difference between an infinitely large tree and a large finite tree. (Unless you know the memory size of the computer, or other information like how long it took to make the tree, etc.)
This still does not help us detect the “No Merges” invariant.
I can’t think of any useful way to determine that No Merges exist, without showing that no node is referenced twice by either storing visited nodes in an external data structure, or marking each node as visited when you visit it.
As a final resort, you could do the following:
This process would only take a few extra variables, but would take a lot of time, since you individually compare each node to every node higher or at the same depth in the tree.
My intuition tells me that the above procedure would be a v-squared algorithm, instead of just being order v.
Add a comment if any of you think of another way to approach this.
Edit: you may be able to verify the “No Loops” here by simply extending the search to not just every node at same depth and higher, but comparing with every node in the tree. You would need to do this in a progressive algorithm, compare each node with every node above it in the tree and its own depth, then check against all nodes in the tree from 1 to 5 nodes deeper than it, then from 6-10 generations lower, and so forth. If you check in a non-progressive way, you could get stuck searching infinitely.
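One possible reading of this progressive comparison is sketched below. It is not necessarily the exact procedure meant above: it deepens one level at a time and, for every node found at the current depth, counts how many root paths of at most that length reach it. More than one path means a merge or a loop. The function names and the int payload are assumptions for illustration.

```cpp
#include <cstddef>

struct node {
    int data;   // placeholder payload (assumption)
    node *l1;
    node *l2;
};

// Counts the paths of length <= limit that start at `cur` and end at `target`.
static std::size_t countPaths(const node *cur, const node *target,
                              std::size_t limit) {
    if (cur == nullptr) return 0;
    std::size_t hits = (cur == target) ? 1 : 0;
    if (limit == 0) return hits;
    return hits + countPaths(cur->l1, target, limit - 1)
                + countPaths(cur->l2, target, limit - 1);
}

// Follows every path of exactly `remaining` edges from `cur`. Each endpoint
// must be reachable from `root` by exactly one path of length <= limit.
// Sets `sawNode` if at least one node exists at that depth.
static bool checkLevel(const node *root, const node *cur,
                       std::size_t remaining, std::size_t limit,
                       bool &sawNode) {
    if (cur == nullptr) return true;
    if (remaining == 0) {
        sawNode = true;
        return countPaths(root, cur, limit) == 1;
    }
    return checkLevel(root, cur->l1, remaining - 1, limit, sawNode) &&
           checkLevel(root, cur->l2, remaining - 1, limit, sawNode);
}

// Returns true if every node reachable from `root` has exactly one root path.
bool hasUniquePaths(const node *root) {
    for (std::size_t depth = 0; ; ++depth) {
        bool sawNode = false;
        if (!checkLevel(root, root, depth, depth, sawNode)) return false;
        if (!sawNode) return true;   // ran past the deepest level
    }
}
```

Because a loop always produces a second, longer path to some node, the depth-by-depth search stops as soon as that path comes within the current limit, so it should not get stuck. The price is repeated traversal from the root, which fits the v-squared intuition, and the recursion still uses the call stack.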