I have a function that performs a hierarchical clustering on a list of input vectors. The return value is the root element of an object hierarchy, where each object represents a cluster. I want to test the following things:
- Does each cluster contain the correct elements (and maybe other properties as well)?
- Does each cluster point to the correct children?
- Does each cluster point to the correct parent?
I have two problems here. First, how do I specify the expected output in a readable format? Second, how do I write a test assertion that accepts isomorphic variants of the expected data I provide? Suppose one cluster in the expected hierarchy has two children, A and B. Now suppose that cluster is represented by an object with the properties child1 and child2. I do not care whether child1 corresponds to cluster A or B, just that it corresponds to one of them, and that child2 corresponds to the other. The solution should be somewhat general because I will write several tests with different input data.
Actually my main problem here is to find a way to specify the expected output in a readable and understandable way. Any suggestions?
If there are isomorphic results, you should probably have a predicate that tests for logical equivalence. Such a predicate would likely be useful in the code under test as well as in the unit tests.
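As a rough illustration of such a predicate, here is a minimal Python sketch. The `Cluster` class and its fields `elements`, `child1` and `child2` are assumptions based on the question's description, not your actual types, and `equivalent` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    # Hypothetical cluster node; the field names are assumptions.
    elements: frozenset
    child1: Optional["Cluster"] = None
    child2: Optional["Cluster"] = None


def equivalent(a, b):
    """Return True if two cluster trees match up to swapping children."""
    if a is None or b is None:
        return a is None and b is None
    if a.elements != b.elements:
        return False
    # Accept either pairing of the two children.
    return (
        (equivalent(a.child1, b.child1) and equivalent(a.child2, b.child2))
        or (equivalent(a.child1, b.child2) and equivalent(a.child2, b.child1))
    )
```

A test would then assert `equivalent(expected_root, actual_root)`. Parent pointers are not compared here; they could be checked separately by walking the actual tree and verifying that each child refers back to its parent.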
This is the core of Manoj Govindan’s answer, minus the string intermediates. Since you (presumably) aren’t interested in string intermediates, adding them to the test regime would be an unnecessary source of error.
As to the readability issue, you’d need to show what you consider unreadable for a proper answer to be given. Perhaps the equivalence predicate will obviate this.
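For what it's worth, one possible way to keep the expected data readable is to write it as a nested literal and normalise it into an order-insensitive form before comparing. The sketch below is Python and assumes each leaf can be given a short hashable label; `canonical` is a hypothetical helper, and the actual cluster hierarchy would first have to be converted into the same nested shape.

```python
def canonical(tree):
    """Convert a nested tuple/list spec into an order-insensitive form."""
    if isinstance(tree, (tuple, list)):
        # A frozenset ignores the order in which children are listed.
        return frozenset(canonical(child) for child in tree)
    return tree  # a leaf label


# Expected hierarchy written as nested tuples; child order does not matter.
expected = (("a", "b"), ("c", ("d", "e")))
actual = ((("e", "d"), "c"), ("b", "a"))

assert canonical(expected) == canonical(actual)
```

Because a frozenset collapses duplicates, this only works when sibling clusters are distinct, which should hold for a clustering where siblings contain disjoint elements.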