I am using Python as a scripting language to do some data processing and to call command-line tools for number crunching. I want to run the command-line tools in parallel, since they are independent of each other. When a tool finishes, I can collect its results from its output file. So I also need a synchronization mechanism that notifies my main Python program when a task has finished, so that the result can be parsed in the main program.
Currently I use os.system(), which works fine in a single thread but cannot be parallelized.
Thanks!
Use the Pool object from the multiprocessing module. You can then use e.g. Pool.map() to do parallel processing. An example would be my markphotos script (see below), where a function is called multiple times in parallel, each call processing one picture.
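For the case in the question (independent command-line tools), a minimal sketch could look like the following. It is not the markphotos script. The names run_tool and make_commands are hypothetical, and the commands are placeholders that only start the Python interpreter to print a number; it also assumes Python 3.7 or newer for the capture_output and text arguments of subprocess.run().

```python
import subprocess
import sys
from multiprocessing import Pool


def run_tool(args):
    """Run one command-line tool and return (args, exit code, stdout)."""
    # subprocess.run() blocks this worker until the tool exits.
    completed = subprocess.run(args, capture_output=True, text=True)
    return args, completed.returncode, completed.stdout


def make_commands(n):
    """Build n placeholder commands (hypothetical stand-ins for real tools)."""
    # Each command prints the square of a number; replace these with the
    # actual number-crunching tools and their arguments.
    return [[sys.executable, "-c", f"print({i} * {i})"] for i in range(n)]


if __name__ == "__main__":
    commands = make_commands(4)
    with Pool(processes=2) as pool:
        # imap_unordered() yields each result as soon as its task finishes,
        # so the main program can handle results in completion order.
        for args, returncode, stdout in pool.imap_unordered(run_tool, commands):
            print(returncode, stdout.strip())
```

Pool.map() would also work, but it returns only after all tasks are done. With imap_unordered() the loop body runs once per finished tool, which is a natural place to parse that tool's output file. Because the workers mostly wait on external processes, multiprocessing.pool.ThreadPool, which offers the same methods, is likely sufficient as well.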