I am writing a Java program that needs to process a lot of URLs.
Each URL goes through the following jobs IN ORDER: download, analyze, compress.
Instead of having a single thread do all the jobs for a URL, I want each job to have a fixed number of threads, so that all the jobs have threads running concurrently at any given time.
For example, the download job will have multiple threads fetching and downloading URLs. As soon as one URL is downloaded, it is passed to a thread in the analyze job, and as soon as that completes, it is passed on to a thread in the compress job, and so on.
I am thinking of using CompletionService in Java, since it returns a result as soon as it is finished, but I am not sure how it works. So far my code looks like this:
ExecutorService executor = Executors.newFixedThreadPool(3);
CompletionService<DownloadedItem> completionService = new ExecutorCompletionService<DownloadedItem>(executor);
//while list has URL do {
completionService.submit(new DownloadJob(list.getNextURL())); //submit to queue for download
//}
//while there is URL left do {
Future<DownloadedItem> downloadedItem = completionService.take(); //take the result as soon as it finishes
//what to do here??
//}
My question is: how do I move the downloaded item to the analyze job and do the work there without waiting for all the download jobs to complete? I am thinking of creating a CompletionService for each job; is that a viable approach? If not, is there a better way to solve this problem? Please provide examples.
Once you mention IN ORDER, any attempt to use separate threads for those in-order tasks will only complicate the design of your system.
In my opinion, your best shot is to have separate threads that each handle one URL from start to finish. For the 3 steps you can introduce another abstraction (for example, 3 callables), but you still want to execute them sequentially in one thread. There is no need for a completion service.
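A minimal sketch of that approach, under some assumptions: the class name UrlPipeline and the download, analyze and compress methods are hypothetical placeholders that only wrap strings, standing in for the real work. Each URL becomes one task on a fixed thread pool, and that task runs the three steps back to back in the same thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UrlPipeline {

    // Hypothetical stand-ins for the real download/analyze/compress steps.
    static String download(String url) { return "downloaded(" + url + ")"; }
    static String analyze(String content) { return "analyzed(" + content + ")"; }
    static String compress(String analyzed) { return "compressed(" + analyzed + ")"; }

    // Submits one task per URL; each task runs all three steps sequentially.
    static List<String> processAll(List<String> urls, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> compress(analyze(download(url))));
            }
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            List<String> results = new ArrayList<>();
            for (Future<String> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        for (String result : processAll(urls, 3)) {
            System.out.println(result);
        }
    }
}
```

With a pool of 3 threads, up to 3 URLs are in flight at any moment, each at whatever step it has reached, so downloads, analyses and compressions of different URLs still overlap without any hand-off between stages.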