I have an application which forms all the possible pairs and then compare the pairs, but when I run the applicaion, it gave me exception:OutOfMemoryError:Java heap space when running the code. I tried -Xmx1500m, but the exception just kept coming.
the code for generating the pairs are as following
File file = ...;
final Map<Pair, Collection<Integer>> lineNumbersByPair = new HashMap<Pair, Collection<Integer>>();
/*
* Step 1: Read in the lines, one by one.
*/
Reader reader = new FileReader(file);
try {
BufferedReader bufferedReader = new BufferedReader(reader);
try {
String line;
int lineNumber = 0;
while ((line = bufferedReader.readLine()) != null) {
lineNumber++;
String[] tokens = line.split("\\s+");
int[] values = new int[tokens.length];
for (int i = 0; i < tokens.length; i++) {
values[i] = Integer.parseInt(tokens[i]);
}
for (int i = 0; i < values.length; i++) {
for (int j = i + 1; j < values.length; j++) {
Pair pair = new Pair(values[i], values[j]);
Collection<Integer> lineNumbers;
if (lineNumbersByPair.containsKey(pair)) {
lineNumbers = lineNumbersByPair.get(pair);
} else {
lineNumbers = new HashSet<Integer>();
lineNumbersByPair.put(pair, lineNumbers);
}
lineNumbers.add(lineNumber);
}
}
}
} finally {
bufferedReader.close();
}
} finally {
reader.close();
}
/*
* Step 2: Identify the unique pairs. Sort them according to how many lines they appear on (most number of lines to least number of lines).
*/
List<Pair> pairs = new ArrayList<Pair>(lineNumbersByPair.keySet());
Collections.sort(
pairs,
new Comparator<Pair>() {
@Override
public int compare(Pair pair1, Pair pair2) {
Integer count1 = lineNumbersByPair.get(pair1).size();
Integer count2 = lineNumbersByPair.get(pair2).size();
return count1.compareTo(count2);
}
}
);
Collections.reverse(pairs);
/*
* Step 3: Print the pairs and their line numbers.
*/
for (Pair pair : pairs) {
Collection<Integer> lineNumbers = lineNumbersByPair.get(pair);
if (lineNumbers.size() > 1) {
System.out.println(pair + " appears on the following lines: " + lineNumbers);
}
}
I am reading a file around 15mb and it contains like 20000lines of single numbers, and there are around 40 numbers per line, it forms all possible pairs of each line.
anyone has any idea about how to solve this? thanks
My math may be off, but this is probably why you’re still running out of space.
So 40 numbers per line, 20000 lines = 800000 numbers.800000 C 2 = 319999600000 combinations of numbers.
At 4 bytes an
int, and withPair<int, int>, each Pair is at least 8 bytes, and then you add it to your data structure.8 bytes * 319999600000 = 2+ terabytes.
After re-reading your problem, each line is separate from the next.
40 numbers per line => 40 C 2 = 780 combinations per line * 20000 lines = 15600000 possible unique pairs * 8 bytes per pair = 119 MB purely for
intin the worst case. Add to that the memory taken up by references since Java does not allow primitive types in a Collection.But after taking another look at your program, I have some suggestions:
Why are you mapping a
Pairto aSet<Integer>?If you are interested in just the number of occurrences of each
Pair, you do not need to keep track of the line numbers the pairs occur on – you only want to store the number of times it has shown up.So in this case you want to map
PairtoInteger. This could reduce the amount of memory you require.Do you care about ordering of the pair?
Your
forloop seems to indicate you do not care about ordering, i.e. the pair (30,45) is the same as the pair (45,30). If so, you should create yourPairbased on relative ordering in the pair. Perhaps making aPairbased on the smallest value first, so that every time you encountered two integers m and n, you would always create the pair asPair(m, n). Also see the next section abouthashCode()andequals().Have you implemented
Pair‘sint hashCode()andboolean equals(Object)methods?This could be the difference between an actual working program and a broken one.
In your case, you want
Pairobjects to test for logical equality as it is a custom class, so you will have to override and implement your ownequals(Object)method. You will also want to overridehashCode()as you must always do so when overridingequals().This is detailed in the excellent Effective Java, and here’s a sample of the chapter discussing this: http://java.sun.com/developer/Books/effectivejava/Chapter3.pdf