I have millions of fixed-size (100-element) int arrays. Each array is sorted and contains unique elements. For each array, I want to find all other arrays that share at least 70% of their elements with it. Right now I get around 1 million comparisons per second (using Arrays.binarySearch()), which is too slow for us.
Can anyone recommend a better searching algorithm?
Something like this should do the job (provided that the arrays are sorted and contain unique elements):
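A minimal sketch of one way to do this, assuming a linear two-pointer merge over both sorted arrays instead of repeated binary searches. The class and method names (`SimilarArrays`, `hasEnoughCommon`) and the `required` parameter are hypothetical, chosen only for illustration:

```java
final class SimilarArrays {

    /**
     * Returns true if the two sorted, duplicate-free arrays share at least
     * `required` elements. Walks both arrays once, so the cost is at most
     * a.length + b.length steps, and it stops as soon as the answer is known.
     */
    static boolean hasEnoughCommon(int[] a, int[] b, int required) {
        if (required <= 0) {
            return true; // nothing to find
        }
        int i = 0, j = 0, common = 0;
        while (i < a.length && j < b.length) {
            // Upper bound on further matches: the shorter remaining tail.
            int remaining = Math.min(a.length - i, b.length - j);
            if (common + remaining < required) {
                return false; // threshold can no longer be reached
            }
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                // Same value in both arrays: count it and advance both sides.
                if (++common >= required) {
                    return true;
                }
                i++;
                j++;
            }
        }
        return false; // one array is exhausted before reaching the threshold
    }
}
```

For 100-element arrays and a 70% threshold, the call would be `SimilarArrays.hasEnoughCommon(x, y, 70)`. The early exits matter here: with a threshold of 70 out of 100, the loop can give up once more than 30 elements of either array have been passed without a match.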
Sample usage:
Output:
Premature Optimizations™
I have now tried to optimize the above code. Please check whether the code blocks marked as
make a noticeable difference. After optimization, I would refactor the code to get it down to one or two return statements.