This code takes 9 minutes to run for a set of 5,600 objects:
    public Set<UnDirectedPair<T>> getAllUndirectedPairs(Set<T> setObjects) {
        Set<T> setObjectsProcessed = new TreeSet<>();
        Set<UnDirectedPair<T>> setPairs;
        setPairs = new TreeSet<>();
        Iterator<T> setObjectsIteratorA = setObjects.iterator();
        Iterator<T> setObjectsIteratorB;
        T currTA;
        T currTB;
        while (setObjectsIteratorA.hasNext()) {
            currTA = setObjectsIteratorA.next();
            setObjectsProcessed.add(currTA);
            setObjectsIteratorB = setObjects.iterator();
            while (setObjectsIteratorB.hasNext()) {
                currTB = setObjectsIteratorB.next();
                if (!setObjectsProcessed.contains(currTB) && !currTA.equals(currTB)) {
                    setPairs.add(new UnDirectedPair<>(currTA, currTB));
                }
            }
            setObjectsProcessed.add(currTA);
        }
        return setPairs;
    }
Looking for a way to dramatically reduce the running time… ideas?
[BACKGROUND]
The set contains Persons. There are duplicates in the set (the same persons, but with slightly different attributes because of errors at input time). I have methods that take 2 Persons and make the necessary corrections. So, as a preliminary step, I need to create a Set of Pairs of (Person, Person) to feed to these methods.
Thanks for the good suggestions.
The main bottleneck was my class UnDirectedPair, which had expensive equals and compareTo methods. I replaced it with a bare-bones Pair class. This got the code to run in approx. 10 s.
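To see why a costly compareTo hurts here: the original code stores the pairs in a TreeSet, and every TreeSet.add compares the new element against several existing ones on its way down the tree. The small sketch below only counts those calls. CountingKey and CompareCostDemo are hypothetical names made up for illustration, not classes from the original code.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical key type that counts how often compareTo is invoked.
final class CountingKey implements Comparable<CountingKey> {
    static long comparisons = 0;
    final int value;

    CountingKey(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(CountingKey other) {
        comparisons++;
        return Integer.compare(value, other.value);
    }
}

final class CompareCostDemo {
    // Inserts n distinct keys into a TreeSet and returns how many
    // compareTo calls those insertions triggered in total.
    static long comparisonsForInserts(int n) {
        CountingKey.comparisons = 0;
        Set<CountingKey> set = new TreeSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new CountingKey(i));
        }
        return CountingKey.comparisons;
    }
}
```

Each add performs a number of comparisons that grows roughly with the logarithm of the set size, so with millions of pairs any extra work inside compareTo is multiplied many times over.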
Still, the set operations seemed costly. With @mawia's suggestion, modified a bit, sets can be left out of the picture completely. The final code runs in under 2 seconds instead of 9 min 40 s, returning a list of 19,471,920 Pair objects!!
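A minimal sketch of what such a set-free version could look like. This is illustrative, not the exact final code: the Pair and PairGenerator classes and the allUndirectedPairs method are hypothetical names. It assumes the input contains no duplicate references, which holds when it comes from a Set.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical bare-bones pair holder: no equals, hashCode, or compareTo
// overrides, so creating one costs a single allocation.
final class Pair<T> {
    final T first;
    final T second;

    Pair(T first, T second) {
        this.first = first;
        this.second = second;
    }
}

final class PairGenerator {
    // Returns every unordered pair of distinct elements exactly once.
    static <T> List<Pair<T>> allUndirectedPairs(Collection<T> objects) {
        // Copy into an indexable list so the inner loop can start at i + 1.
        List<T> items = new ArrayList<>(objects);
        int n = items.size();

        // n elements yield n * (n - 1) / 2 pairs; pre-size the result,
        // computing in long to avoid int overflow for large n.
        long expected = (long) n * (n - 1) / 2;
        List<Pair<T>> pairs =
                new ArrayList<>((int) Math.min(expected, Integer.MAX_VALUE - 8));

        for (int i = 0; i < n; i++) {
            T a = items.get(i);
            // Starting at i + 1 skips self-pairs and pairs already emitted,
            // so no "processed" set and no contains() lookups are needed.
            for (int j = i + 1; j < n; j++) {
                pairs.add(new Pair<>(a, items.get(j)));
            }
        }
        return pairs;
    }
}
```

Calling PairGenerator.allUndirectedPairs(setObjects) on a set of n elements returns a list of n(n-1)/2 pairs. Because the result is a List, appending a pair never triggers a comparison or a hash computation.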