I am having trouble with the multiprocessing module. I am using a Pool of workers with its map method to concurrently analyze lots of files. Each time a file has been processed I would like to update a counter so that I can keep track of how many files remain to be processed. Here is sample code:
    import os
    import multiprocessing

    counter = 0

    def analyze(file):
        # Analyze the file.
        global counter
        counter += 1
        print(counter)

    if __name__ == '__main__':
        files = os.listdir('/some/directory')
        pool = multiprocessing.Pool(4)
        pool.map(analyze, files)
I cannot find a solution for this.
The problem is that the counter variable is not shared between your processes: each separate process is creating its own local instance and incrementing that.

See this section of the documentation for some techniques you can employ to share state between your processes. In your case you might want to share a Value instance between your workers.

Here's a working version of your example (with some dummy input data). Note it uses global values which I would really try to avoid in practice:
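A sketch of one such version, passing a multiprocessing.Value to each worker through the Pool's initializer (the dummy file list and the init helper name are placeholders for illustration, not part of the original code):

```python
import multiprocessing

def init(shared_counter):
    # Runs once in each worker process: stash the shared Value
    # in a module-level global so analyze() can reach it.
    global counter
    counter = shared_counter

def analyze(path):
    # ... analyze the file at `path` here ...
    global counter
    with counter.get_lock():  # serialize concurrent increments
        counter.value += 1
    print(counter.value)

if __name__ == '__main__':
    files = ['file%d' % i for i in range(10)]  # dummy input data
    counter = multiprocessing.Value('i', 0)    # 'i' = C signed int
    pool = multiprocessing.Pool(4, initializer=init, initargs=(counter,))
    pool.map(analyze, files)
    pool.close()
    pool.join()
    print('processed %d files' % counter.value)
```

The Value lives in shared memory, so every increment made by a worker is visible in the parent; the explicit get_lock() guard keeps the read-modify-write of counter.value atomic across processes.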