This is my solution so far. Is there a more elegant or efficient way to do it?
import datetime as dt
from functools import reduce  # reduce is no longer a builtin in Python 3

example = {dt.datetime(2008, 1, 1): 5, dt.datetime(2008, 1, 2): 6, dt.datetime(2008, 1, 3): 7, dt.datetime(2008, 1, 4): 9, dt.datetime(2008, 1, 5): 12,
           dt.datetime(2008, 1, 6): 15, dt.datetime(2008, 1, 7): 20, dt.datetime(2008, 1, 8): 22, dt.datetime(2008, 1, 9): 25, dt.datetime(2008, 1, 10): 35}

def calculateMovingAverage(prices, period):
    # Calculates the moving average of each data point and the `period` days
    # before it (with period=2, usually 3 data points are included).
    average_dict = {}
    for price in prices:
        pricepoints = [prices[x] for x in prices.keys() if price - dt.timedelta(period) <= x <= price]
        average = reduce(lambda x, y: x + y, pricepoints) / len(pricepoints)
        average_dict[price] = average
    return average_dict

print(calculateMovingAverage(example, 2))
I am not sure whether I should use a list comprehension here.
There is probably some function for this somewhere, but I didn’t find it.
If you’re looking for other interesting ways to solve the problem, here is an answer using itertools:
The ideas involved are:
Use a simple generator to make a date iterator that loops over consecutive days from the lowest to the highest.
Use itertools.tee to construct a pair of iterators over the oldest data and the newest data (the front of the data window and the back).
Keep a running sum in a variable s. On each iteration, update s by subtracting the oldest value and adding the newest value.
This solution is space efficient (it keeps no more than window values in memory) and time efficient: it performs one addition and one subtraction per day, regardless of the size of the window.
Handle missing days by defaulting to zero. There are other strategies that could be used for missing days (like using the current moving average as a default or adjusting n up and down to reflect the number of actual data points in the window).
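The ideas above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not necessarily the exact code the answer had in mind; the names dates, moving_average, window and default are assumptions chosen for the example. Note that window here counts days, so window=3 corresponds to period=2 in the question, and that this sketch only reports full windows (the question's function also averages the shorter windows at the start).

```python
import datetime as dt
from itertools import islice, tee


def dates(start, end):
    """Yield every consecutive day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += dt.timedelta(days=1)


def moving_average(prices, window, default=0):
    """Yield (date, average) pairs, one for each full window of days.

    Hypothetical helper: `prices` maps datetimes to values, `window` is the
    number of days per average, `default` stands in for missing days.
    """
    # Two independent iterators over the same stream of days.
    oldest, newest = tee(dates(min(prices), max(prices)))
    # Prime the running sum with the first `window` days; this also moves
    # `newest` so that it stays `window` days ahead of `oldest`.
    first = list(islice(newest, window))
    if len(first) < window:
        return  # fewer days than the window: nothing to report
    s = sum(prices.get(day, default) for day in first)
    yield first[-1], s / window
    for old_day, new_day in zip(oldest, newest):
        # Slide the window: drop the oldest value, add the newest one.
        s += prices.get(new_day, default) - prices.get(old_day, default)
        yield new_day, s / window
```

With the example data from the question, list(moving_average(example, 3)) starts with the pair for 2008-01-03, whose average is (5 + 6 + 7) / 3 = 6.0. Swapping the default argument for another value is the simplest way to try a different strategy for missing days.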