I have an application that processes data stored in a number of files from an input directory and then produces some output depending on that data.
So far, the application works sequentially, i.e. it launches a “manager” thread that
- Reads the contents of the input directory into a File[] array
- Processes each file in sequence and stores results
- Terminates when all files are processed
I would like to convert this into a multithreaded application, in which the “manager” thread
- Reads the contents of the input directory into a File[] array
- Launches a number of “processor” threads, each of which processes a single file, stores results and returns a summary report for that file to the “manager” thread
- Terminates when all files have been processed
The number of “processor” threads would be at most equal to the number of files, since they would be recycled via a ThreadPoolExecutor.
Any solution avoiding the use of join() or wait()/notify() would be preferable.
Based on the above scenario:
- What would be the best way of having those “processor” threads report back to the “manager” thread? Would an implementation based on Callable and Future make sense here?
- How can the “manager” thread know when all “processor” threads are done, i.e. when all files have been processed?
- Is there a way of “timing” a processor thread and terminating it if it takes “too long” (i.e., it hasn’t returned a result despite the lapse of a pre-configured amount of time)?
Any pointers to, or examples of, (pseudo-)source code would be greatly appreciated.
You can definitely do this without using join() or wait()/notify() yourself.
You should take a look at java.util.concurrent.ExecutorCompletionService to start with.
The way I see it you should write the following classes:
- FileSummary – Simple value object that holds the result of processing a single file
- FileProcessor implements Callable<FileSummary> – The strategy for converting a file into a FileSummary result
- FileManager – The high level manager that creates FileProcessor instances, submits them to a work queue and then aggregates the results.
The FileManager would then look something like this:
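As a rough sketch of the idea (not a definitive implementation), a manager along these lines might look like the following. The fields of FileSummary, the placeholder work inside FileProcessor.call(), and the processAll helper with its poolSize parameter are all assumptions made for illustration. The manager knows it is done once it has taken as many completed futures as it submitted tasks.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FileManager {

    /** Hypothetical value object: the result of processing one file. */
    static final class FileSummary {
        final String fileName;
        final long sizeInBytes;

        FileSummary(String fileName, long sizeInBytes) {
            this.fileName = fileName;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Hypothetical strategy that turns one file into a FileSummary. */
    static final class FileProcessor implements Callable<FileSummary> {
        private final File file;

        FileProcessor(File file) {
            this.file = file;
        }

        @Override
        public FileSummary call() {
            // Placeholder work (assumption): the real processing would go here.
            return new FileSummary(file.getName(), file.length());
        }
    }

    /** Submits one task per file and collects summaries as tasks complete. */
    static List<FileSummary> processAll(File[] files, int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CompletionService<FileSummary> completion = new ExecutorCompletionService<>(pool);
        List<FileSummary> summaries = new ArrayList<>();
        try {
            for (File file : files) {
                completion.submit(new FileProcessor(file));
            }
            // One take() per submitted task: after files.length takes, everything is done.
            for (int i = 0; i < files.length; i++) {
                try {
                    summaries.add(completion.take().get());
                } catch (ExecutionException e) {
                    // A failed task is reported and skipped; the others are still collected.
                    System.err.println("Processing failed: " + e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        return summaries;
    }

    public static void main(String[] args) throws InterruptedException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles(File::isFile);
        if (files == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        for (FileSummary s : processAll(files, 4)) {
            System.out.println(s.fileName + ": " + s.sizeInBytes + " bytes");
        }
    }
}
```

Results come back in completion order rather than submission order, which is why the summary carries the file name with it.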
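If you want to implement a timeout you can use the timed poll(timeout, unit) method on CompletionService instead of take(). It returns null if no task finishes within the timeout. Note that this only stops the waiting; to actually stop a slow task you still have to call cancel(true) on its Future.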
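Here is a minimal sketch of that timeout idea, under some assumptions: the TimedCollector class, its collect method and the String results are made up for illustration, and the timeout applies to the wait for the next result rather than to each task's own running time. If nothing completes in time, the outstanding futures are cancelled with cancel(true), which only helps if the task responds to interruption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimedCollector {

    /** Collects results until all tasks finish or no result arrives within the timeout. */
    static List<String> collect(List<Callable<String>> tasks, int poolSize,
                                long timeout, TimeUnit unit) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        List<Future<String>> submitted = new ArrayList<>();
        List<String> results = new ArrayList<>();
        try {
            for (Callable<String> task : tasks) {
                // Keep the futures so that slow tasks can be cancelled later.
                submitted.add(completion.submit(task));
            }
            for (int i = 0; i < tasks.size(); i++) {
                Future<String> done = completion.poll(timeout, unit);
                if (done == null) {
                    // Timed out: cancel whatever is still running or queued.
                    // cancel() is a no-op for futures that already completed.
                    for (Future<String> f : submitted) {
                        f.cancel(true);
                    }
                    break;
                }
                try {
                    results.add(done.get());
                } catch (ExecutionException e) {
                    System.err.println("Task failed: " + e.getCause());
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Callable<String> fast = () -> "fast";
        Callable<String> slow = () -> {
            Thread.sleep(5_000); // stands in for a file that takes "too long"
            return "slow";
        };
        System.out.println(collect(Arrays.asList(fast, slow), 2, 200, TimeUnit.MILLISECONDS));
    }
}
```

If you need a strict per-file limit instead, one option is to keep each Future and call its get(timeout, unit) individually, at the cost of waiting in submission order.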