I have a Java program that needs to call the same external executable 6 times. Each run produces an output file, and once all 6 runs are complete I “merge” these files together. Originally I just had a for-loop that ran the executable, waited for that run to end, then called it again, and so on.
I found this highly time consuming, averaging 52.4s for it to run 6 times… I figured it would be pretty easy to speed up by running the external executable 6 times all at once, especially since they aren’t dependent on one another. I used ExecutorService and Runnable, etc. to achieve this.
With my current implementation, I shave only about 5s off my time, making it roughly 11% faster.
Here is some (simplified) code that explains what I’m doing:
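import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

private final List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
....
private void setUpThreadsAndRun() {
    ExecutorService executor = Executors.newFixedThreadPool(6);
    for (int i = 0; i < 6; i++) {
        //create the params object
        tasks.add(Executors.callable(new RunThread(params)));
    }
    try {
        // invokeAll blocks until every task's run() has returned
        executor.invokeAll(tasks);
    } catch (InterruptedException ex) {
        //uh-oh
    }
    executor.shutdown();
    System.out.println("Finished all threads!");
}

private class RunThread implements Runnable {
    private final ModelParams params;

    public RunThread(ModelParams params) {
        this.params = params;
    }

    @Override
    public void run() {
        //NOTE: cmdarray is constructed from the params object
        ProcessBuilder pb = new ProcessBuilder(cmdarray);
        pb.directory(new File(location));
        try {
            Process p = pb.start();
            // wait for the external executable to exit before this task completes
            p.waitFor();
        } catch (IOException ex) {
            // the executable could not be started
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}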
I’m hoping there is a more efficient way to do this… or maybe I’m “clogging” my computer’s resources by trying to run this process 6 times at once. This process does involve file I/O and writes files that are about 30 MB in size.
Forking the executable 6 times will only earn you the full performance boost if you have at least 6 CPU cores and your application is CPU bound, i.e. mostly doing processor operations. Since each process writes a 30 MB file, it sounds like it is doing a large amount of IO and is IO bound instead: limited by your hardware’s ability to service the IO requests.
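As a rough illustration of the core-count half of that argument, here is a minimal sketch that caps the pool size at the number of cores the JVM reports. The class name `PoolSizing` and the helper `chooseThreads` are made up for this example; the only real API used is `Runtime.availableProcessors()`. It says nothing about the IO side, which can be the bottleneck even when there are plenty of cores.

```java
public class PoolSizing {
    /**
     * Hypothetical helper: never use more threads than there are cores,
     * and never fewer than one.
     */
    static int chooseThreads(int requestedRuns, int availableCores) {
        return Math.max(1, Math.min(requestedRuns, availableCores));
    }

    public static void main(String[] args) {
        // Number of cores visible to the JVM
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores + ", pool size=" + chooseThreads(6, cores));
    }
}
```

The result could be passed to `Executors.newFixedThreadPool(...)` in place of the hard-coded 6.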
To speed up your program, you might try 2 concurrent processes to see if you get an improvement. However, if your program is IO bound, then you will never get much of a speed improvement by forking multiple copies.
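One way to find out is simply to measure. The sketch below times the same 6 runs at several pool sizes so you can see where the gains stop. The class `ConcurrencyBenchmark` and the method `timeRuns` are hypothetical names for this example, and the `Thread.sleep` job is only a stand-in: I am assuming you would replace it with code that starts your executable and blocks on `Process.waitFor()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrencyBenchmark {
    /** Runs job `runs` times on a pool of `poolSize` threads; returns elapsed milliseconds. */
    static long timeRuns(Runnable job, int runs, int poolSize) {
        List<Callable<Object>> tasks = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            tasks.add(Executors.callable(job));
        }
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        try {
            executor.invokeAll(tasks); // blocks until all runs have finished
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder job (assumption): swap in the code that launches the
        // external executable and waits for it to exit.
        Runnable job = () -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        };
        for (int poolSize : new int[] {1, 2, 3, 6}) {
            System.out.println(poolSize + " at a time: " + timeRuns(job, 6, poolSize) + " ms");
        }
    }
}
```

If the times stop dropping (or get worse) past 2 or 3 concurrent processes, that is a strong hint the disk, not the CPU, is the limit.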