I am doing machine learning in Java using GATE Learning. I have a huge data set of documents to learn from. When running from NetBeans I was getting a Java heap space error, so I set the -Xmx parameter to 1600 MB. Now I no longer get the heap space error, but the job takes a very long time to run (it ran for 90 minutes and I had to stop the process because I lost patience).
I do not know whether I should add RAM (currently 4 GB), upgrade my OS (currently XP SP3; I have heard that Vista and Windows 7 make better use of RAM and the processor), or upgrade my processor (currently a dual-core E5500 at 2.80 GHz).
Please give me some insight into what I can do to make this process run faster!
Thanks, Rishabh
Before you can answer what will make it run faster, you have to find the bottleneck.
I’m not very familiar with Windows, but IIRC it has some sort of system load monitoring widget.
What I would do is as follows:
Then fix the one that is causing the problem.
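If memory turns out to be the suspect, you can also check from inside the JVM. Here is a rough sketch, assuming you can call a small helper from your own driver code; `JvmPressure` and its method names are made up for illustration and are not part of GATE. It reports how full the heap is and how much time has gone to garbage collection, using only standard JDK calls.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of GATE): rough indicators of memory pressure.
class JvmPressure {

    // Fraction of the maximum heap (-Xmx) that is currently in use.
    static double heapUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // Approximate total time spent in garbage collection so far, in milliseconds.
    static long gcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("heap used: %.0f%% of max%n", 100 * heapUsedFraction());
        System.out.printf("GC time: %d ms of %d ms uptime%n", gcTimeMillis(), uptimeMs);
    }
}
```

If the heap stays close to full and GC time is a large share of the uptime, the JVM is probably short of memory rather than short of CPU. If both numbers are low while one core is pegged, the processor is the more likely bottleneck.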
Just for context, it’s not that unusual for ML algorithms to take a long time to run on large data sets. You can use the above approach to plot the run time as the size of the input data set increases; at least then you’ll know whether your program would have finished in 100 minutes or 100 centuries.
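To put a number on that, you could time the job on a few small subsets and fit a growth rate. The following is only a sketch under assumptions: `ScalingProbe` is a made-up name, the workload in `main` is a dummy quadratic loop standing in for your real training call, and the estimate assumes run time grows roughly as a power of the input size, which may not hold once the machine starts swapping.

```java
// Hypothetical helper: estimate how run time scales with input size.
class ScalingProbe {

    // Wall-clock time of one run, in seconds.
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    // Least-squares slope of log(time) against log(size).
    // A slope near 1 suggests linear growth, near 2 quadratic, and so on.
    static double growthExponent(double[] sizes, double[] seconds) {
        int n = sizes.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double x = Math.log(sizes[i]);
            double y = Math.log(seconds[i]);
            sx += x;
            sy += y;
            sxx += x * x;
            sxy += x * y;
        }
        return (n * sxy - sx * sy) / (n * sxx - sx * sx);
    }

    // Projected run time at targetSize, assuming the same power law keeps holding.
    static double extrapolate(double knownSize, double knownSeconds,
                              double exponent, double targetSize) {
        return knownSeconds * Math.pow(targetSize / knownSize, exponent);
    }

    public static void main(String[] args) {
        double[] sizes = {1000, 2000, 4000, 8000};
        double[] seconds = new double[sizes.length];
        final long[] sink = new long[1]; // keeps the dummy loop from being optimized away

        for (int i = 0; i < sizes.length; i++) {
            final int n = (int) sizes[i];
            // Placeholder workload: replace with training on the first n documents.
            seconds[i] = timeSeconds(() -> {
                for (int a = 0; a < n; a++) {
                    for (int b = 0; b < n; b++) {
                        sink[0] += a ^ b;
                    }
                }
            });
        }

        double k = growthExponent(sizes, seconds);
        int last = sizes.length - 1;
        System.out.printf("estimated exponent: %.2f%n", k);
        System.out.printf("projected seconds at 100x the largest size: %.1f%n",
                extrapolate(sizes[last], seconds[last], k, sizes[last] * 100));
        System.out.println("(checksum " + sink[0] + ")");
    }
}
```

Timings on very small inputs are noisy because of JIT warm-up, so use subsets large enough to run for at least a few seconds each, and treat the projection as an order-of-magnitude guess.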