I have the following problem: I need to find pairs of equal elements in two unordered lists. The two lists are “roughly equal”: only certain elements are shifted by a few indexes. For example (note that the real objects are not ints; I am just using integers here):
[1,2,3,5,4,8,6,7,10,9]
[1,2,3,4,5,6,7,8,9,10]
My first attempt would be to iterate through both lists and build two HashMaps keyed on some unique key for each object. Then, on a second pass, I would simply pull the matching elements from both maps. This takes O(2N), i.e. O(N), space and time.
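To make that first attempt concrete, here is a minimal sketch. The names `MapPairing`, `pairByKey` and `keyOf` are made up for illustration, and it assumes keys are unique within each list and elements are non-null:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MapPairing {
    /**
     * Pairs up elements of two lists that share the same key.
     * Assumption: keys are unique within each list and elements are non-null.
     * Note: pairs come out in HashMap iteration order, not list order.
     */
    static <T, K> List<Map.Entry<T, T>> pairByKey(
            List<T> list1, List<T> list2, Function<T, K> keyOf) {
        // First pass: index both lists by key.
        Map<K, T> map1 = new HashMap<>();
        Map<K, T> map2 = new HashMap<>();
        for (T e : list1) map1.put(keyOf.apply(e), e);
        for (T e : list2) map2.put(keyOf.apply(e), e);

        // Second pass: for every key in the first map, pull the partner from the second.
        List<Map.Entry<T, T>> pairs = new ArrayList<>();
        for (Map.Entry<K, T> entry : map1.entrySet()) {
            T partner = map2.get(entry.getKey());
            if (partner != null) {
                pairs.add(Map.entry(entry.getValue(), partner));
            }
        }
        return pairs;
    }
}
```

Both maps hold every element, so the extra space is proportional to the combined size of the lists regardless of how few elements are actually out of place.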
I was thinking about a different approach: keep a pointer to the current element in each list, as well as a currentlyUnmatched set for each list. The pseudocode would be something like this:
while (elements to process) {
    elem1 = list1.get(index1)
    elem2 = list2.get(index2)
    if (elem1 == elem2) {
        // do work
        ...
        index1++
        index2++
    } else {
        // Move the index of the list that has no unmatched elements
        if (firstListUnmatched.size() == 0) {
            // Not found in the other list's set either, so save it for later
            if (secondListUnmatched.remove(elem1) != true)
                firstListUnmatched.insert(elem1)
            index1++
        } else {
            // same, but with the other index
        }
    }
}
The above probably does not work; I just wanted to get a rough idea of what you think about this approach. Basically, it maintains a hash set alongside each list whose size is much smaller than the problem size. This should be ~O(N) for a small number of misplaced elements and for small “gaps”. Anyway, I look forward to your replies.
EDIT: I cannot simply return a set intersection of the two object lists, as I need to perform operations (possibly several) on the objects I find to be matching or non-matching.
You can maintain a set of the objects that have not been matched yet. This will be O(M) in space, where M is the largest number of displaced elements pending at any point. It will be O(N) in time, where N is the number of elements, assuming constant-time hash lookups.
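A rough sketch of that idea, not a definitive implementation: walk both lists in lockstep and keep one pending map per list. The names `LockstepMatcher`, `match`, `keyOf`, `Result` and `peakPending` are invented for this example, and it assumes keys are unique within each list and elements are non-null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class LockstepMatcher {
    /** Outcome of one pass: matched pairs plus whatever never found a partner. */
    static final class Result<T, K> {
        final List<Map.Entry<T, T>> matched = new ArrayList<>();
        final Map<K, T> unmatched1 = new HashMap<>(); // pending elements of list1
        final Map<K, T> unmatched2 = new HashMap<>(); // pending elements of list2
        int peakPending = 0; // largest combined pending size seen (the "M" above)
    }

    // Assumption: keys are unique within each list and elements are non-null.
    static <T, K> Result<T, K> match(List<T> list1, List<T> list2, Function<T, K> keyOf) {
        Result<T, K> r = new Result<>();
        int n = Math.max(list1.size(), list2.size());
        for (int i = 0; i < n; i++) {
            if (i < list1.size()) {
                T a = list1.get(i);
                // Was a's partner already seen in list2? If not, park a for later.
                T partner = r.unmatched2.remove(keyOf.apply(a));
                if (partner != null) r.matched.add(Map.entry(a, partner));
                else r.unmatched1.put(keyOf.apply(a), a);
            }
            if (i < list2.size()) {
                T b = list2.get(i);
                // Same check in the other direction; this also catches a == b at index i,
                // because a was parked just above.
                T partner = r.unmatched1.remove(keyOf.apply(b));
                if (partner != null) r.matched.add(Map.entry(partner, b));
                else r.unmatched2.put(keyOf.apply(b), b);
            }
            r.peakPending = Math.max(r.peakPending, r.unmatched1.size() + r.unmatched2.size());
        }
        return r;
    }
}
```

Whatever work you need to do on matching objects can go where a pair is added to `matched`; after the loop, `unmatched1` and `unmatched2` hold the objects that exist in only one list. For the two example lists in the question, the pending maps never hold more than two elements in total at the end of any step.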