The dict 'params' in the object 'test' cannot be updated when using the pp module. Can anyone tell me why this happens? Here is the code:
    import pp

    class test(object):
        params = {'n': None}
        dic2 = {}
        n = None

        def __init__(self, i):
            # won't change
            self.params['n'] = i
            # changed
            self.n = i
            self.dic2 = {i: i}

        def run(self):
            print self.n, self.params, self.dic2

    job_server = pp.Server()
    jobs = []
    for i in xrange(10):
        t = test(i)
        # won't change
        t.params['n'] = i
        # changed
        t.n = i
        t.run()
        jobs.append(job_server.submit(t.run))

    [job() for job in jobs]
The results:
0 {'n': 0} {0: 0}
1 {'n': 1} {1: 1}
2 {'n': 2} {2: 2}
3 {'n': 3} {3: 3}
4 {'n': 4} {4: 4}
0 {'n': None} {0: 0}
1 {'n': None} {1: 1}
2 {'n': None} {2: 2}
3 {'n': None} {3: 3}
4 {'n': None} {4: 4}
As we can see, when run through pp, params['n'] is not updated: it still holds None. This is strange behaviour. How can this happen?
This is a common pitfall of multiprocessing modules.
When you make a call like job_server.submit(t.run), the object t is pickled, sent to a new process, and unpickled; run is executed there, and its return value is pickled, sent back to the main process, and unpickled.
Now, pickling classes is not really supported (see here). pickle just pickles the name of the class, and when unpickling it re-imports the module to obtain the class object.
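Since params is a class variable (self.params['n'] = i mutates the dict stored on the class and never creates an instance attribute), it is reinitialized when the module is re-imported. By contrast, self.n = i and self.dic2 = {i: i} create instance attributes, which are pickled together with the object, so they keep their values. The same reset would happen for function attributes.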
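The effect can be reproduced with pickle alone, without pp. The following is a minimal sketch, not what pp does internally: the class name Test is a stand-in for the class in the question, and resetting Test.params by hand is only an assumption used to imitate a fresh worker process in which the class has just been re-created.

```python
import pickle

class Test(object):
    params = {'n': None}      # class attribute, shared by all instances

    def __init__(self, i):
        self.params['n'] = i  # mutates the class-level dict; nothing is stored on the instance
        self.n = i            # creates an instance attribute

t = Test(3)

# Only instance attributes live in t.__dict__, and that is the state pickle stores.
instance_state = dict(t.__dict__)
payload = pickle.dumps(t)

# Imitate a fresh worker process: the class body has just been executed again,
# so the class attribute is back to its initial value.
Test.params = {'n': None}

restored = pickle.loads(payload)
restored_n = restored.n            # instance attribute, travelled with the pickle
restored_params = restored.params  # class attribute, looked up on the re-created class
print("%s %s" % (restored_n, restored_params))
```

The pickled payload carries the class name and the instance dictionary, so n survives the round trip while params falls back to whatever the class defines on the receiving side.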
If you want to keep the values of those variables, you must make them instance variables, that is, assign them on self (for example in __init__) instead of in the class body.
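A sketch of that fix is shown below. It uses a plain pickle round trip as a stand-in for the transfer to a worker process, which is an assumption for illustration; the class name Test and the choice to return the values from run instead of printing them are also illustrative.

```python
import pickle

class Test(object):
    def __init__(self, i):
        # Everything is assigned on self, so it all ends up in the
        # instance dictionary and is pickled together with the object.
        self.params = {'n': i}
        self.n = i
        self.dic2 = {i: i}

    def run(self):
        return self.n, self.params, self.dic2

t = Test(3)

# Stand-in for sending the object to another process and back.
restored = pickle.loads(pickle.dumps(t))
result = restored.run()
print(result)
```

Because each instance now owns its own params dict, instances no longer share state through the class either, which also avoids one instance overwriting the value set by another.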