I have a python program, at a point it calls an external program (foo).

Question

Asked: May 25, 2026 (2026-05-25T05:52:01+00:00)

I have a Python program that, at one point, calls an external program (foo). This external program needs to be run several times. The exact number of runs (num_pros) is variable and depends on the input.
Because this external program is by far the most time-consuming part of my Python program, I would like to take advantage of multi-core processors to run several instances of it at the same time.

I came up with the following solution, which takes into account that num_pros is not known in advance and that the solution should adapt to any number of cores.

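from subprocess import Popen, PIPE

cores = 2       # number of instances of foo to run at the same time
proc_list = []
for i in range(num_pros):    # num_pros is computed from the input
    proc = Popen(['foo'], stdin=PIPE)
    proc_list.append(proc)
    if i % cores == cores - 1:
        # A full batch is running: wait for all of it before starting more.
        for process in proc_list:
            process.wait()
        proc_list = []
# Wait for the last, possibly incomplete, batch.
for process in proc_list:
    process.wait()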

I have two questions:

Is there a better (more efficient or more Pythonic) solution?

This code reduces the execution time only when the cores are real. Is this a hardware issue, or something that could be fixed in Python?

To clarify the second question, let me provide an example.
On my notebook (running Linux) the command ‘cat /proc/cpuinfo | grep processor | wc -l’ reports 4 processors. With cores=2 in my code I get the results in half the time (as expected), but with cores=3 or cores=4 I get the same performance as with cores=2. I have an Intel Core i3 (2 cores and 4 threads), so I guess the problem is that only 2 of the cores are real. (I tested the code on another computer with a different processor and got the same result: only real cores seem to be useful.)
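To make the "real cores versus threads" distinction concrete, here is a rough sketch that counts both from the text of /proc/cpuinfo. It assumes the x86 Linux layout, where each processor block carries "physical id" and "core id" fields (other architectures may omit them, in which case the physical count comes out as 0). The function name count_cores is made up for illustration.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) CPU counts parsed from /proc/cpuinfo text."""
    logical = 0
    physical = set()  # unique (physical id, core id) pairs

    # /proc/cpuinfo separates the per-processor blocks with blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()

        if "processor" not in fields:
            continue
        logical += 1

        # Two hardware threads on the same core share both of these ids.
        if "physical id" in fields and "core id" in fields:
            physical.add((fields["physical id"], fields["core id"]))

    return logical, len(physical)
```

Calling count_cores(open("/proc/cpuinfo").read()) on a machine like the one described should give 4 logical processors but only 2 physical cores. Note that os.cpu_count() reports the logical count.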

1 Answer

Editorial Team · Answer 1 · 2026-05-25T05:52:01+00:00

I think multiprocessing is intended more for the case where the work you want to farm out is written in Python, not a totally different program. It’s all about using fork and passing data from one Python process to another, so I don’t think it will work for you.

In your current implementation, once the maximum number of subprocesses has been spawned, your code cannot spawn any new ones until the whole current batch has completed, because Popen.wait() blocks until that specific subprocess completes.

I think what you want is os.wait(). I’ve done something very similar by keeping a mapping of my subprocess.Popen instances keyed by pid. Just spin up your maximum number of subprocesses and then let os.wait() tell you when one of them finishes. os.wait() gives you the pid of whichever Popen instance completes next, and you can use that to do any remaining cleanup for that subprocess. Then your code can spin up the next subprocess.
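Here is a minimal sketch of that approach, not a definitive implementation. It assumes a Unix-like system (os.wait() is not available on Windows) and Python 3.9 or later (for os.waitstatus_to_exitcode). The name run_all and its parameters are made up for illustration, and in the demo at the bottom sys.executable stands in for foo.

```python
import os
import subprocess
import sys


def run_all(commands, max_procs=2):
    """Run every command, keeping at most max_procs children alive at a time.

    Returns the exit codes in the same order as commands.
    """
    pending = list(enumerate(commands))
    running = {}  # pid -> (index into commands, Popen instance)
    exit_codes = [None] * len(commands)

    while pending or running:
        # Top up the pool until it is full or nothing is left to start.
        while pending and len(running) < max_procs:
            index, cmd = pending.pop(0)
            proc = subprocess.Popen(cmd)
            running[proc.pid] = (index, proc)

        # Block until any one child exits, whichever finishes first.
        pid, status = os.wait()
        if pid not in running:
            continue  # a child that was started somewhere else
        index, proc = running.pop(pid)
        code = os.waitstatus_to_exitcode(status)
        # os.wait() already reaped the child, so record the result on the
        # Popen object too; otherwise it still considers the child running.
        proc.returncode = code
        exit_codes[index] = code

    return exit_codes


if __name__ == "__main__":
    # Hypothetical stand-in for ['foo']: five short-lived Python children.
    demo = [[sys.executable, "-c", "pass"] for _ in range(5)]
    print(run_all(demo, max_procs=2))
```

Unlike the batch version in the question, a new subprocess starts as soon as any slot frees up, so one slow run does not hold back the rest of its batch.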
