I have a Python program that, at one point, calls an external program (foo). This external program needs to be run several times; the exact number of runs (num_pros) varies and depends on the input.
Because this external program is by far the most time-consuming part of my Python program, I would like to take advantage of multi-core processors and run several instances of it at the same time.
I came up with the following solution, which takes into account that num_pros is not known in advance and that the code should adapt to any number of cores.
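    from subprocess import Popen, PIPE

    cores = 2  # number of instances to run at the same time
    proc_list = []
    for i in range(num_pros):  # num_pros is computed earlier from the input
        proc = Popen(['foo'], stdin=PIPE)
        proc_list.append(proc)
        # Once a batch of `cores` processes has been started, wait for all of them.
        if i % cores == cores - 1:
            for process in proc_list:
                process.wait()
            proc_list = []
    # Wait for the last, possibly incomplete, batch.
    for process in proc_list:
        process.wait()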
I have two questions:
Is there a better (more efficient or more Pythonic) solution?
This code reduces the execution time only when the cores are physical ones. Is this a hardware limitation, or something that could be fixed in Python?
To clarify the second question, let me provide an example.
On my notebook (running Linux), the command ‘cat /proc/cpuinfo | grep processor | wc -l’ reports 4 processors. With cores=2 in my code I get the results in half the time (as expected), but with cores=3 or cores=4 I get the same performance as with cores=2. I have an Intel Core i3 (2 cores, 4 threads), so I guess the problem is that only 2 of the cores are physical. (When I test the code on another computer with a different processor I get the same result: only physical cores seem to help.)
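For reference, the same count can be read from Python. This is only a minimal sketch: os.cpu_count() reports logical processors, so on a 2-core, 4-thread CPU it would be expected to return 4, just like the /proc/cpuinfo pipeline. The variable name logical_cpus is made up for illustration.

```python
import os

# Number of logical processors (hyper-threads included).
# os.cpu_count() may return None if the count cannot be determined.
logical_cpus = os.cpu_count()
print("logical processors:", logical_cpus)
```

As far as I know the standard library has no direct call for the number of physical cores; the third-party psutil package offers psutil.cpu_count(logical=False) for that.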
I think multiprocessing is more intended for the case where the work you want to farm out is in Python, not a totally different process. It’s all about using fork and passing stuff from Python process to Python process, so I don’t think it will work for you.
In your current implementation, once the max number of subprocesses is spawned, your code is blocking the spawning of new subprocesses until all the current batch of processes complete, because Popen.wait() blocks until that specific subprocess completes.
I think what you want is os.wait(). I’ve done something very similar by keeping a mapping of my subprocess.Popen instances mapped by pid. Just spin up your max number of subprocesses and then let os.wait() tell you when one of them finishes. os.wait() will give you the pid of whatever Popen instance completes next and you can use that to do any remaining cleanup for that subprocess. Then you let your code spin up the next subprocess.
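The os.wait() approach described above could be sketched roughly as follows. This is a sketch under assumptions, not the exact code I used: the function name run_throttled and its parameters are made up for illustration, it is Unix-only (os.wait() is not available on Windows), and the demo uses the Python interpreter as a stand-in for foo.

```python
import os
import subprocess
import sys


def run_throttled(commands, max_procs):
    """Run every command, keeping at most max_procs children alive at once.

    Returns a dict mapping each command's index to its exit code
    (negative signal number if the child was killed by a signal).
    """
    pending = list(enumerate(commands))
    running = {}     # pid -> (index, Popen instance)
    exit_codes = {}

    while pending or running:
        # Top up to the limit before blocking.
        while pending and len(running) < max_procs:
            index, cmd = pending.pop(0)
            proc = subprocess.Popen(cmd)
            running[proc.pid] = (index, proc)

        # Block until *any* child exits, not one specific child.
        pid, status = os.wait()
        if pid not in running:
            continue  # a child that was started elsewhere

        index, proc = running.pop(pid)
        if os.WIFSIGNALED(status):
            code = -os.WTERMSIG(status)
        else:
            code = os.WEXITSTATUS(status)
        # os.wait() has already reaped the child, so record the result on the
        # Popen object by hand instead of calling proc.wait().
        proc.returncode = code
        exit_codes[index] = code

    return exit_codes


if __name__ == "__main__":
    # Stand-in for ['foo']: five short-lived children, two at a time.
    demo = [[sys.executable, "-c", "import time; time.sleep(0.2)"]] * 5
    print(run_throttled(demo, max_procs=2))
```

Unlike the batch version, a new subprocess starts as soon as any slot frees up, so one slow run does not hold back the rest of its batch.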