I've written the following code to illustrate the problem I'm seeing. I'm trying to use a multiprocessing Manager().list() to keep track of a list and increment random indices of that list.
Each run spawns 100 processes, and each process increments a random index of the list by 1. Therefore, one would expect the sum of the resulting list to be the same each time, correct? The list starts as range(15), whose sum is 105, so I would expect 205 every time. Instead I get something between 203 and 205.
from multiprocessing import Process, Manager
import random


class MyProc(Process):
    def __init__(self, A):
        Process.__init__(self)
        self.A = A  # shared Manager list

    def run(self):
        # Increment a random index of the shared list by 1
        i = random.randint(0, len(self.A) - 1)
        self.A[i] = self.A[i] + 1


if __name__ == '__main__':
    procs = []
    M = Manager()
    a = M.list(range(15))
    print('A: {0}'.format(a))
    print('sum(A) = {0}'.format(sum(a)))

    for i in range(100):
        procs.append(MyProc(a))

    # Explicit loops: map() is lazy in Python 3 and would not start anything
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print('A: {0}'.format(a))
    print('sum(A) = {0}'.format(sum(a)))
As millimoose points out, the problem here is a race condition occurring in self.A[i] = self.A[i] + 1. By the time self.A[i] + 1 has been calculated, self.A[i] could already have been changed by another process, so that other increment is overwritten and lost.
A possible solution to your problem is to pass the index back to the parent, for example by appending it to a shared list, and let the parent perform the addition.
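A minimal sketch of that idea, assuming Python 3. The names run_demo and picks are hypothetical, chosen only for illustration: each child appends its chosen index to a shared Manager list, and only the parent touches the actual counts.

```python
from multiprocessing import Process, Manager
import random


class MyProc(Process):
    """Child that reports a random index instead of modifying the counts itself."""

    def __init__(self, size, picks):
        Process.__init__(self)
        self.size = size    # length of the list the parent will update
        self.picks = picks  # shared Manager list collecting the chosen indices

    def run(self):
        i = random.randint(0, self.size - 1)
        # One append call on the proxy; the child never reads and rewrites a slot.
        self.picks.append(i)


def run_demo(n_procs=100):
    """Hypothetical helper: spawn n_procs children, then apply the increments in the parent."""
    a = list(range(15))  # plain list, only the parent modifies it
    with Manager() as manager:
        picks = manager.list()
        procs = [MyProc(len(a), picks) for _ in range(n_procs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        chosen = list(picks)  # copy the indices out before the manager shuts down

    # Only the parent performs the additions, so no increment can be lost.
    for i in chosen:
        a[i] += 1
    return a


if __name__ == '__main__':
    result = run_demo()
    print('A: {0}'.format(result))
    print('sum(A) = {0}'.format(sum(result)))
```

With 100 children the sum should be 205 on every run (105 initially plus one increment per child).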
Appending an element to a list is a single operation, so the race condition is avoided.
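If the children really do need to update the shared list themselves, a different technique from the one described above is to guard the read-modify-write with a lock, so that the read and the write happen as one step. A rough sketch, again assuming Python 3. The names increment_random and run_with_lock are hypothetical:

```python
from multiprocessing import Process, Manager
import random


def increment_random(a, lock):
    """Increment a random slot of the shared list while holding the lock."""
    i = random.randint(0, len(a) - 1)
    # No other process can read or write between these two proxy calls.
    with lock:
        a[i] = a[i] + 1


def run_with_lock(n_procs=100):
    """Hypothetical helper: same experiment as the question, but with a lock."""
    with Manager() as manager:
        a = manager.list(range(15))
        lock = manager.Lock()  # lock proxy that can be passed to child processes
        procs = [Process(target=increment_random, args=(a, lock))
                 for _ in range(n_procs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return list(a)  # copy the values out before the manager shuts down


if __name__ == '__main__':
    result = run_with_lock()
    print('A: {0}'.format(result))
    print('sum(A) = {0}'.format(sum(result)))
```

The lock serializes the updates, so this trades some parallelism for correctness. The append-based version avoids that because the children never read the counts at all.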