I have 50 research subjects who provided reaction time data on six tasks. I stored each subject's mean reaction time on a task as a .npy file (50 subjects * 6 tasks = 300 files), and I want to get a group mean for each task. Ideally this would result in six files for the group.
To put it another way, I want to populate a numpy array with individual .npy files, but I am a little lost on the best way to do this. I had the idea to initialize an empty array for each task, populate it with values for subjects, then get the mean.
import numpy as np
subjects=range(1,51)
tasks=['a','b','c','d','e','f']
datalist=[]
for subject in subjects:
    for task in tasks:
        array=np.array(datalist)
        f=np.load('%d/%s.npy' % (subject,task))
        result=np.append(array,f)
        mu=np.mean(result)
        sav=np.save('%s' %(task),mu)
The result of this code is the last value in the series, indicating that the array is not populating correctly. Any ideas would be greatly appreciated!
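You're recreating the array on every iteration, because array=np.array(datalist) sits inside the for loop and datalist is always empty. In effect, for every single subject and every single task, you create an empty array, append a single loaded value to it, take the mean of that one value, and save it. Each save overwrites the previous file for that task, so only the last subject's value survives. You should instead make the task loop the outer loop, collect the values of all subjects for that task in a list, and compute and save the mean once per task, after the inner loop has finished.
ETA: Incidentally, there are much better ways to store this data than in 300 separate .npy files. Is there only one value for each subject for each task? In that case, why not represent the data as a 50 by 6 numpy matrix?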
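A minimal sketch of the reorganized loop, assuming each file holds exactly one value and lives at <subject>/<task>.npy as in the question. The function names group_means and load_matrix and the root parameter are made up for illustration. The second function shows the 50 by 6 matrix idea, where the group means are just the column means.

```python
import os
import numpy as np

def group_means(subjects, tasks, root="."):
    """Return {task: group mean}; hypothetical helper, not from the question."""
    means = {}
    for task in tasks:                      # outer loop: one result per task
        values = []                         # reset once per task, not per file
        for subject in subjects:
            path = os.path.join(root, str(subject), task + ".npy")
            # Assumption: each file stores a single number, so .item() is valid.
            values.append(np.load(path).item())
        means[task] = np.mean(values)       # mean taken after all subjects are loaded
    return means

def load_matrix(subjects, tasks, root="."):
    """Return an array of shape (n_subjects, n_tasks); hypothetical helper."""
    subjects = list(subjects)
    data = np.empty((len(subjects), len(tasks)))
    for i, subject in enumerate(subjects):
        for j, task in enumerate(tasks):
            path = os.path.join(root, str(subject), task + ".npy")
            data[i, j] = np.load(path).item()
    return data                             # data.mean(axis=0) gives the per-task means
```

To write the six group files, looping over group_means(range(1, 51), tasks).items() and calling np.save(task, mu) for each pair should be enough; np.save appends the .npy extension when it is missing.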