I am using Java to read data from a file, copy the data into smaller arrays, and put these arrays in HashMaps. I noticed that the HashMap consumes more memory (about double) than the size of the original file! Any idea why?
Here is my code:
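    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.HashMap;

    public class Test { // class name is illustrative
        public static void main(final String[] args) throws IOException {
            // Write one million identical lines (append mode, so re-running grows the file).
            final PrintWriter writer = new PrintWriter(new FileWriter("test.txt", true));
            for (int i = 0; i < 1000000; i++) {
                writer.println("This is just a dummy text!");
            }
            writer.close();

            // Read the file back, storing each line under its line number.
            final BufferedReader reader = new BufferedReader(new FileReader("test.txt"));
            final HashMap<Integer, String> testMap = new HashMap<Integer, String>();
            String line = reader.readLine();
            int k = 0;
            while (line != null) {
                testMap.put(k, line);
                k++;
                line = reader.readLine();
            }
            reader.close();
        }
    }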
A map is an “extendable” structure: when it reaches its capacity, it gets resized. So it is possible that, say, 40% of the space used by your map is actually empty. If you know how many entries will be in your map, you can use the constructor that takes an initial capacity and a load factor to size the map up front.
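As a minimal sketch of that idea, the snippet below presizes a map for a known number of entries. The class name `PresizedMapExample` and the helper `capacityFor` are made up for illustration and are not part of the JDK; only the `HashMap(int initialCapacity, float loadFactor)` constructor is.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapExample {
    // Hypothetical helper (not a JDK method): smallest initial capacity that
    // holds expectedEntries without exceeding capacity * loadFactor.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    // Builds a map sized up front, then fills it with dummy lines.
    static Map<Integer, String> build(int expectedEntries) {
        float loadFactor = 0.75f; // same value as HashMap's default
        Map<Integer, String> map =
                new HashMap<>(capacityFor(expectedEntries, loadFactor), loadFactor);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "This is just a dummy text!");
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = build(1_000_000);
        System.out.println(map.size());
    }
}
```

Note that `HashMap` rounds the requested capacity up to a power of two internally, so presizing avoids intermediate resizes but does not remove all unused slots.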
Even if you do that, the map will still use more space than the actual size of the contained items.
In more detail: a HashMap’s capacity is doubled once the number of entries passes (capacity * loadFactor). The default load factor for a HashMap is 0.75, and the default initial capacity is 16.
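Example: with a capacity of 16 and a load factor of 0.75, the resize threshold is 12 entries. Once the map grows past that point, its capacity doubles to 32, so more than half of the slots are empty right after the resize.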
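The arithmetic can be sketched with a simplified model of the doubling rule. This is not the JDK's implementation; the class `ResizeSimulation` and the method `capacityAfter` are hypothetical names, and the model assumes the capacity doubles whenever the entry count exceeds capacity * loadFactor.

```java
public class ResizeSimulation {
    // Simplified model (an assumption, not JDK code): keep doubling the
    // capacity while the entry count exceeds capacity * loadFactor.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        while (entries > capacity * loadFactor) {
            capacity *= 2;
        }
        return capacity;
    }

    public static void main(String[] args) {
        // Compare the slot count with the entry count for a few sizes.
        for (int n : new int[] {12, 13, 1_000_000}) {
            int capacity = capacityAfter(n, 16, 0.75f);
            System.out.println(n + " entries -> " + capacity + " slots");
        }
    }
}
```

Under this model, one million entries end up in a table of 2,097,152 slots, so fewer than half of the slots are occupied.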
EDIT
This simple code gives you an idea of what happens in practice – the output is:
which shows that if the last item you add happens to force the map to resize, it can artificially increase the size of your map. Admittedly, that does not account for the whole effect that you are observing.
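To observe heap growth on your own machine, a rough sketch along the following lines may help. It is not the snippet referred to above; the class `HeapUsageSketch` and its methods are illustrative names, and the numbers it prints are only approximate because `Runtime.gc()` is just a hint to the JVM and results vary between JVM versions and settings.

```java
import java.util.HashMap;
import java.util.Map;

public class HeapUsageSketch {
    // Approximate used heap in bytes; gc() is only a request, so treat the
    // result as a rough estimate.
    static long usedHeap() {
        Runtime runtime = Runtime.getRuntime();
        runtime.gc();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    // Fills a default-sized map; the index is appended so that every value
    // is a distinct String object, as it would be when read from a file.
    static Map<Integer, String> fill(int entries) {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < entries; i++) {
            map.put(i, "This is just a dummy text!" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        long before = usedHeap();
        Map<Integer, String> map = fill(1_000_000);
        long after = usedHeap();
        // Printing the size also keeps the map reachable until after measuring.
        System.out.println("entries: " + map.size());
        System.out.println("approx. heap growth (MB): " + (after - before) / (1024 * 1024));
    }
}
```

Comparing the reported growth with the size of test.txt gives a feel for how much of the difference comes from the map itself rather than from the text.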