I have a pool of processes that need to be executed. I would like to fully utilize the machine, so that all CPUs are executing processes. I do not want to over-subscribe the system, so what i really want is #executing_processes=#cpus at any given moment.
I also need to store the stdout,stderr and return code of each completed processes.
How can this be achieved in Python?
EDIT: by ‘process’ i mean a shell process.
If you are talking about your own, Python-implemented processes:
The
multiprocessingmodule gives you the ability to spawn multiple processes. In particular, it sounds like you would want to create multiprocessing.cpu_count numbers of processes, potentially in aPool.If you are talking about separate programs that you want to execute through the shell:
The
subprocessmodule will let you spawn processes through itsPopenclass, which has parameters forstdin,stdout,sterrthat accept file-like objects.Popen.returncodecan be used to check the return code.