The situation is as follows:
- There are N arrays.
- In each array (0..N-1) there are (x,y) tuples (cartesian coordinates) stored
- The length of each array can be different
I want to extract the subset of coordinate combinations which make up a complete
retangle of size N. In other words; all the cartesian coordinates are adjacent to each other.
Example:
findRectangles({
{*(1,1), (3,5), (6,9)},
{(9,4), *(2,2), (5,5)},
{(5,1)},
{*(1,2), (3,6)},
{*(2,1), (3,3)}
})
yields the following:
[(1,1),(1,2),(2,1),(2,2)],
...,
...(other solutions)...
No two points can come from the same set.
I first just calculated the cartesian product, but this quickly becomes infeasible (my use-case at the moment has 18 arrays of points with each array roughly containing 10 different coordinates).
Let XY be your set of arrays. Construct two new sets X and Y, where X equals XY with all arrays sorted to x-coordinate and Y equals XY with all arrays sorted to y-coordinate.
Let C be the size of the largest array. Then sorting all sets takes time O(N*C*log(C)). In step 1, finding a single matching point takes time O(N*log(C)) since all arrays in X are sorted. Finding all such points is in O(C*N), since there are at most C*N points overall. Step 2 takes time O(N*log(C)) since Y is sorted.
Hence, the overall asymptotic runtime is in O(C * N^2 * log(C)^2).
For C==10 and N==18, you’ll get roughly 10.000 operations. Multiply that by 2, since I dropped that factor due to Big-O-notation.
The solution has the further benefit of being extremely simple to implement. All you need is arrays, sorting and binary search, the first two of which very likely being built into the language already, and binary search being extremely simple.
Also note that this is the runtime in the worst case where all rectangles start at the same x-coordinate. In the average case, you’ll probably do much better than this.