I am trying to calculate a rolling mean using plyr. The data is at the industry-country-year level, with repeated observations for each industry-country pair. The data is unbalanced, but most industry-countries have approximately 15 observations.
For example, the data looks like this:
country ISIC Year Value
Algeria 1 1990 400
Algeria 1 1991 450
Algeria 1 1992 460
Algeria 2 1990 450
Algeria 2 1991 500
Algeria 2 1992 450
Argentina 1 1990 400
Argentina 1 1991 450
Argentina 1 1992 460
Argentina 2 1990 450
Argentina 2 1991 500
Argentina 2 1992 450
. . . .
. . . .
If I subset the data to a specific industry and country, I am able to calculate the rolling mean like this:
rollmean(subdata$Value, 3)
However, I’ve been unable to get it to work with plyr, so as to calculate the rolling mean for each industry-country group.
I’ve tried:
roll <- ddply(data, .(country, ISIC), summarize, rollmean(data$Value, 3))
A rolling mean necessarily shortens the data, which is part of why you get the error.
However, if you’re doing a rolling mean on 3 samples and your data only has 3 samples, you’re just calculating the mean:
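The code that originally followed here is not available, so this is only a sketch of what the per-group call might look like. It assumes the sample rows from the question are stored in a data frame named dat (a name chosen here for illustration) and that rollmean comes from the zoo package. Note that the column is referenced as Value rather than data$Value, so that each country-ISIC group is processed on its own instead of the full column being reused for every group.

```r
library(plyr)
library(zoo)

# Sample rows from the question; "dat" is an illustrative name.
dat <- data.frame(
  country = rep(c("Algeria", "Argentina"), each = 6),
  ISIC    = rep(rep(1:2, each = 3), times = 2),
  Year    = rep(1990:1992, times = 4),
  Value   = rep(c(400, 450, 460, 450, 500, 450), times = 2)
)

# Use the bare column name so summarize evaluates it within each group.
# With only 3 rows per group and a window of 3, each group yields one value,
# which is simply the group mean.
roll <- ddply(dat, .(country, ISIC), summarize,
              roll_mean = rollmean(Value, 3))
```

With the full data (roughly 15 years per group), the same call should return n - 2 rows for a group with n observations, since the window of 3 drops one value at each end.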
UPDATED FOR COMMENTS:
To return the dates you can use the na.pad argument to rollmean:
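The original code for this step is also missing, so the following is a sketch under the same assumptions as before (a data frame named dat holding the question's sample rows, rollmean from zoo). It uses transform instead of summarize so that the Year column is kept, and pads the rolling mean with NA so that it has the same length as each group.

```r
library(plyr)
library(zoo)

# Sample rows from the question; "dat" is an illustrative name.
dat <- data.frame(
  country = rep(c("Algeria", "Argentina"), each = 6),
  ISIC    = rep(rep(1:2, each = 3), times = 2),
  Year    = rep(1990:1992, times = 4),
  Value   = rep(c(400, 450, 460, 450, 500, 450), times = 2)
)

# na.pad = TRUE fills the ends of each group's result with NA, so the
# output lines up row for row with Year. Recent zoo versions document
# na.pad as deprecated in favour of fill = NA, and may warn here.
padded <- ddply(dat, .(country, ISIC), transform,
                roll_mean = rollmean(Value, 3, na.pad = TRUE))
```

Because the default alignment is centred, the first and last year of each group get NA and the middle years carry the 3-year mean.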