I want to add a list of dicts together with python multiprocessing module.
Here is a simplified version of my code:
#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
import multiprocessing
import functools
import time
def merge(lock, d1, d2):
time.sleep(5) # some time consuming stuffs
with lock:
for key in d2.keys():
if d1.has_key(key):
d1[key] += d2[key]
else:
d1[key] = d2[key]
l = [{ x % 10 : x } for x in range(10000)]
lock = multiprocessing.Lock()
d = multiprocessing.Manager().dict()
partial_merge = functools.partial(merge, d1 = d, lock = lock)
pool_size = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes = pool_size)
pool.map(partial_merge, l)
pool.close()
pool.join()
print d
-
I get this error when running this script. How shall I resolve this?
RuntimeError: Lock objects should only be shared between processes through inheritance -
is the
lockinmergefunction needed in this condition? or python will take care of it? -
I think what’s
mapsupposed to do is to map something from one list to another list, not dump all things in one list to a single object. So is there a more elegant way to do such things?
The following should run cross-platform (i.e. on Windows, too) in both Python 2 and 3. It uses a process pool initializer to set the manager dict as a global in each child process.
FYI:
Pooldefaults to the CPU count.apply_asyncinstead ofmap.