I would like to create a function that returns a one-dimensional NumPy array if one is given, or a multi-dimensional NumPy array if that is given. For example:
import numpy as np

def running_average(data, windowSize):
    dShape = np.shape(data)
    if len(dShape) == 1:
        # 1-D input: a single running sum over the whole array
        raOut = np.zeros(len(data))
        rSum = 0.0
        for row, value in enumerate(data):
            if row < windowSize:
                rSum += float(value)
            else:
                rSum = rSum - data[row - windowSize] + value
            raOut[row] = rSum / windowSize
    else:
        # 2-D input: the same running sum, repeated for every column
        raOut = np.zeros(dShape)
        for col in range(dShape[1]):
            rSum = 0.0
            for row, value in enumerate(data[:, col]):
                if row < windowSize:
                    rSum += float(value)
                else:
                    rSum = rSum - data[row - windowSize, col] + value
                raOut[row, col] = rSum / windowSize
    return raOut
But there must be a good way to test for this so that I don’t have to essentially repeat myself in the if and the else branches.
I am new to Python; what is the preferred method?
How about something like:
This takes the average along the last axis. If you wanted to get really clever, you could have the function take an axis argument and take the average along an arbitrary axis.
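A minimal sketch of that idea, assuming a cumulative-sum approach built on np.cumsum. The function name, the axis parameter defaulting to the last axis, and the choice to return only full windows are assumptions made for illustration. np.moveaxis requires NumPy 1.11 or later.

```python
import numpy as np

def running_average(data, window_size, axis=-1):
    """Mean over each full window of `window_size` samples along `axis`."""
    data = np.asarray(data, dtype=float)
    # Move the target axis to the end so the same indexing works for any ndim.
    moved = np.moveaxis(data, axis, -1)
    # Prepend a zero so that csum[..., k] is the sum of the first k samples.
    zeros = np.zeros(moved.shape[:-1] + (1,))
    csum = np.concatenate([zeros, np.cumsum(moved, axis=-1)], axis=-1)
    # The difference of two cumulative sums is the sum over one window.
    out = (csum[..., window_size:] - csum[..., :-window_size]) / window_size
    # Put the averaged axis back where it came from.
    return np.moveaxis(out, -1, axis)
```

Because nothing in the body depends on the number of dimensions, the same call works for 1-D and N-D input, for example running_average(x, 3) or running_average(x, 3, axis=0). Note that the output is shorter than the input by window_size - 1 along the averaged axis.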
UPDATE
I believe this version is consistent with your code above.
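One way such a version could look, offered as a sketch rather than a definitive implementation: it runs along the first axis and divides the partial sums of the first windowSize rows by windowSize, just as the loops in the question do. The function name is hypothetical.

```python
import numpy as np

def running_average_like_loop(data, window_size):
    """Running average along axis 0, mirroring the question's loop output."""
    data = np.asarray(data, dtype=float)
    # Cumulative sum down the rows; works unchanged for 1-D and N-D input.
    csum = np.cumsum(data, axis=0)
    # Start from the plain cumulative sums (the "row < windowSize" branch).
    out = csum.copy()
    # From row window_size onward, drop the samples that left the window.
    # Reading from csum and writing to out avoids overlapping in-place updates.
    out[window_size:] = csum[window_size:] - csum[:-window_size]
    return out / window_size
```

For the input [1, 2, 3, 4, 5] with a window of 2 this gives [0.5, 1.5, 2.5, 3.5, 4.5], which matches the loop in the question.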
For your more general question, using NumPy's built-in functions such as cumsum helps, because they already handle arrays of any dimension. If you do have to loop, you can use A = np.zeros(data.shape) to get an array with the same shape as the input, then use A[..., i] to always operate on the last dimension, or A[..., i, :] to always operate on the second-to-last dimension, and so on. Also, sometimes people do data = np.rollaxis(data, axis) to move axis to the beginning, then use A[i] to operate on the first dimension, and move the axis back afterwards if needed.
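To illustrate the looping pattern, here is a small sketch that computes a windowed running sum with an explicit loop and `...` indexing. The function name and the axis parameter are assumptions for illustration, and np.moveaxis (NumPy 1.11 or later) is used here to bring the chosen axis to the end and to put it back afterwards.

```python
import numpy as np

def running_sum_loop(data, window_size, axis=-1):
    """Windowed running sum along `axis`, written as an explicit loop."""
    data = np.asarray(data, dtype=float)
    # Bring the chosen axis to the end so `[..., i]` always addresses it.
    moved = np.moveaxis(data, axis, -1)
    out = np.zeros(moved.shape)
    # One running total per position in all the other dimensions.
    running = np.zeros(moved.shape[:-1])
    for i in range(moved.shape[-1]):
        running = running + moved[..., i]
        if i >= window_size:
            # Drop the sample that has just left the window.
            running = running - moved[..., i - window_size]
        out[..., i] = running
    # Restore the original axis order.
    return np.moveaxis(out, -1, axis)
```

The loop body never mentions the number of dimensions, so there is no need for separate 1-D and 2-D branches.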
UPDATE 2:
I just remembered why the following is a very bad idea (at least in this case):
You should use this instead: