Imagine an inverted binary tree with nodes A, B, C, D, E, F on level 0, nodes G, H, I on level 1, node J on level 2, and node K on level 3.
Level 1: G = func(A,B), H = func(C,D), I = func(E,F)
Level 2: J = func(G,H)
Level 3: K = func(J,I).
Each pair of nodes on Level 0 must be processed in order. Each pair of nodes on Level 1 can be processed in any order, but the results on the next level must be processed as shown, and so forth until we end up with the final result, K.
The actual problem is a computational geometry problem in which a sequence of solids is fused together. A is adjacent to B, which is adjacent to C, and so on. The resulting fuse of A and B (G) is adjacent to the fuse of C and D (H). The resulting fuse of J and I (K) is the final result. Thus you can’t fuse G and I, since they are not adjacent. If the number of nodes on a level is not a power of 2, you end up with a dangling entity that must be processed one level further.
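To make the level structure concrete, here is a minimal sequential sketch of the reduction described above. The name reduce_by_levels and the string-building pair function are hypothetical stand-ins for illustration; the real func would be the fuse operation.

```python
def reduce_by_levels(nodes, func):
    """Combine adjacent pairs level by level until one result remains."""
    level = list(nodes)
    if not level:
        raise ValueError("need at least one node")
    while len(level) > 1:
        # Pair up neighbours: (0, 1), (2, 3), ...
        next_level = [func(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # Odd count: the last node dangles and moves up unchanged.
            next_level.append(level[-1])
        level = next_level
    return level[0]


def pair(a, b):
    # Hypothetical stand-in for the expensive fuse: just records the pairing.
    return f"({a},{b})"


print(reduce_by_levels("ABCDEF", pair))  # (((A,B),(C,D)),(E,F))
```

With six inputs, I = func(E,F) is the dangling entity on Level 1, so it is carried up and only combined with J on Level 3, matching K = func(J,I).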
Since the fuse process is computationally expensive and memory intensive but very parallel, I would like to use the Python multiprocessing package and some form of queue. After calculating G = func(A,B), I would like to push the result G onto the queue for the subsequent J = func(G,H) computation. When the queue is empty, the last result is the final result. Keep in mind that the mp.Queue will not necessarily produce results in FIFO order, since I = func(E,F) may finish before H = func(C,D).
I have come up with a few (bad) solutions but I’m sure there is an elegant solution just beyond my grasp. Suggestions?
I couldn’t come up with a smart design for a queue, but you can easily replace the queue with one more process, which in my example I called WorkerManager. This process gathers results from all Worker processes and starts new workers only if there are two adjacent data packs waiting to be processed. This way, you’ll never try to join non-adjacent results, so you can ignore “levels” and fire the computation of the next pair as soon as it’s ready.
For my example, I used a simple Worker that returns the given data in parentheses, separated by a comma, but you could put any computation in there. In my case, the final result was (((0, 1), (2, 3)), (4, 5)), which means that the algorithm computed (0, 1) and (2, 3) before computing ((0, 1), (2, 3)) and then joined the result with (4, 5). I hope this is what you were looking for.
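A minimal sketch of this adjacency rule follows. It is not the WorkerManager code itself: it swaps the dedicated manager process for a scheduling loop built on concurrent.futures, and the names fuse_adjacent and offer are hypothetical. It also assumes func is associative (as a geometric fuse should be), because the grouping of adjacent pieces depends on which workers finish first.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def fuse_adjacent(items, func, executor):
    """Reduce items to one value, only ever combining adjacent segments."""
    if not items:
        raise ValueError("need at least one item")
    ready_by_start = {}  # start -> (end, value) for idle segments
    ready_by_end = {}    # end -> start, to find an idle left neighbour
    pending = {}         # future -> (start, end) of the segment being built

    def offer(start, end, value):
        # Segment [start, end) is done: pair it with an idle neighbour if any.
        if start in ready_by_end:
            left_start = ready_by_end.pop(start)
            _, left_value = ready_by_start.pop(left_start)
            future = executor.submit(func, left_value, value)
            pending[future] = (left_start, end)
        elif end in ready_by_start:
            right_end, right_value = ready_by_start.pop(end)
            del ready_by_end[right_end]
            future = executor.submit(func, value, right_value)
            pending[future] = (start, right_end)
        else:
            # No idle neighbour yet: park the segment until one shows up.
            ready_by_start[start] = (end, value)
            ready_by_end[end] = start

    for i, item in enumerate(items):
        offer(i, i + 1, item)

    while pending:
        done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
        for future in done:
            start, end = pending.pop(future)
            offer(start, end, future.result())

    # Nothing is in flight, so exactly one segment covering everything is left.
    _, value = next(iter(ready_by_start.values()))
    return value


# Threads keep this demo portable; string concatenation stands in for the fuse.
with ThreadPoolExecutor(max_workers=3) as pool:
    result = fuse_adjacent(list("ABCDEF"), lambda a, b: a + b, pool)
print(result)  # ABCDEF
```

For CPU-bound work such as fusing solids, a ProcessPoolExecutor could be passed instead, provided func is a picklable top-level function and the call sits under an `if __name__ == "__main__":` guard. Because a finished piece is paired with whichever idle neighbour is available, a dangling piece never has to wait for a whole level to complete.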