We have an application which processes a queue of documents (basically all the documents found in an input directory). The documents are read in one by one and are then processed. The application is an obvious candidate for threading since the results from processing one document are completely independent from the results of processing any other document. The question I have is how to divide the work.
One obvious way to split the work is to count the number of documents in the queue, divide by the number of available processors, and split the work accordingly (for example, if the queue has 100 documents and I have 4 available processors, I create 4 threads and feed 25 documents from the queue to each thread).
However, a coworker suggests that I can just spawn a thread for each document in the queue and let the JVM sort it out. I don't understand how this could work. I do get that the second method results in cleaner code, but is it as efficient as (or even more efficient than) the first method?
Any thoughts would be appreciated.
Elliott
You should use the great ExecutorService classes. Something like the following would work. You would submit each of your files to the thread-pool and they will be processed by the 10 working threads.
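Here is a minimal sketch of that idea, assuming the documents are available as a list of `Path` objects. The class name `DocumentProcessor`, the `processAll` method and the `processDocument` placeholder are hypothetical names made up for illustration; only the pool size of 10 comes from the description above, and you would likely tune it to your hardware.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DocumentProcessor {
    // Pool size taken from the answer text; adjust for your machine.
    static final int NUM_THREADS = 10;

    // Submits one task per document and returns how many tasks completed.
    static int processAll(List<Path> documents) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(NUM_THREADS);
        AtomicInteger processed = new AtomicInteger();

        for (Path doc : documents) {
            // Each document becomes one queued task. The pool's worker
            // threads pull tasks off the queue as they become free.
            pool.submit(() -> {
                processDocument(doc);
                processed.incrementAndGet();
            });
        }

        // Stop accepting new tasks, then wait for the queued ones to finish.
        pool.shutdown();
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        return processed.get();
    }

    // Hypothetical placeholder: replace with the real per-document work.
    static void processDocument(Path doc) {
        // e.g. read the file and process its contents
    }
}
```

You would call `DocumentProcessor.processAll(...)` with the files found in your input directory. Compared with the two options in the question, this avoids both splitting the queue up front and creating one thread per document: the number of threads stays fixed, and a worker that finishes a short document simply picks up the next one, so uneven document sizes balance out on their own.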