I have a multithreaded design with one manager object and many worker objects.
I am unsure which layout is better to use:
1 – The workers run in a loop and ask the manager for a new job each
time they finish one.
or
2 – The manager gives each worker a new job after it finishes the
previous one.
Are there any recommendations for this?
This is a question I have wrestled with many times. Each time I have chosen based on the specific situation I was coding for. You should do the same.
However, to choose correctly you must study the two approaches carefully.
Consider a test case: a list of files that must be processed.
1. The workers are in control.
The manager becomes a queue of all of the files to process. You create a fixed number of worker threads which request the next file from the manager and repeat until the list is exhausted.
Consequences
You usually end up having to synchronize access to the queue.
You can tinker with the number of workers to attain maximal throughput for your hardware architecture.
Sometimes you can dynamically adjust the number of workers depending on the current load, but this can be tricky. If successful, you can often achieve a very efficient solution.
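As a rough illustration of the workers-in-control layout, here is a minimal sketch. It assumes the jobs are plain file names and that the real work can be stood in for by recording the name; WorkerPullDemo and processAll are hypothetical names, not anything from the question. It uses a ConcurrentLinkedQueue so the queue handles its own synchronization; a plain queue would need explicit locking instead.

```java
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical demo class: the "manager" is reduced to a shared queue of file names.
class WorkerPullDemo {

    /** Starts workerCount threads that each pull file names until the queue is empty. */
    static Set<String> processAll(List<String> files, int workerCount) {
        // Thread-safe queue, so workers can call poll() without extra locking.
        Queue<String> manager = new ConcurrentLinkedQueue<>(files);
        Set<String> processed = ConcurrentHashMap.newKeySet();

        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                String file;
                // Each worker asks for the next job; poll() returns null when none is left.
                while ((file = manager.poll()) != null) {
                    processed.add(file); // placeholder for the real per-file work
                }
            });
            workers[i].start();
        }

        try {
            for (Thread worker : workers) {
                worker.join(); // wait until every worker has drained the queue
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed;
    }
}
```

The number of threads is fixed by workerCount, which is the knob you would tinker with when tuning throughput.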
2. The manager is in control.
The manager creates a new Callable for every file and adds it to an Executor-controlled thread pool.
Consequences
Well … just about the same if you think about it. The only difference really is that the executor does the queueing.
There is less synchronization required (except of course internally in the Executor).
Dynamically adjusting the number of threads is not trivial but I expect one could subclass the Executor to achieve this.
In summary
The two architectures are very nearly the same. A number of threads process a sequence of items in parallel.
The differences are more in the dynamics and the footprint.
When the workers are in control, a known number of objects are present at any time. An extensive queue can build up but these would presumably be small objects. Work is done at a fixed and predictable pace. If the work starts to pile up you have to make a special effort to deal with it.
When the manager is in control, there can be an explosion of worker objects, most of which are just sitting around waiting for the Executor. Essentially, the Executor becomes the manager and the thread pool holds the workers.
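The manager-in-control layout might look roughly like the sketch below. It makes the same assumptions as before: the jobs are file names, and the per-file work is a placeholder (here, the length of the name). ManagerPushDemo and processAll are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: the manager submits one Callable per file to a thread pool.
class ManagerPushDemo {

    /** Submits one job per file and returns the results in submission order. */
    static List<Integer> processAll(List<String> files, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (String file : files) {
                // One Callable (and one Future) is created up front for every file.
                Callable<Integer> job = () -> file.length(); // placeholder work
                pending.add(pool.submit(job));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : pending) {
                results.add(future.get()); // blocks until that job has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown(); // stop accepting work; already submitted jobs still finish
        }
    }
}
```

Note that every file gets its own Callable and Future before any work completes, which is the footprint difference discussed above. On resizing, java.util.concurrent.ThreadPoolExecutor already exposes setCorePoolSize and setMaximumPoolSize, so subclassing may not be necessary for simple cases.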
I personally prefer the workers being in control, mostly, I suppose, because given two essentially similar architectures I normally prefer the one with the more predictable footprint. I plan to experiment.