I want rollapply-like functionality on non-time-series data: a function computed over a rolling window, without having to convert the data into a zoo object and back again. Is there a way to do this on a very large data set?
Edit
I am using
rollapply(zoo(SPYTS[, "Close"]), 2, function(x) x[1] + x[2], fill=0, align="right")
on 1 million data points. This never seems to stop calculating. Something like
SPYTS$LnReturns <- (rbind(0, as.data.frame(log(SPYTS[1:(nrow(SPYTS) - 1), "Close"] / SPYTS[2:nrow(SPYTS), "Close"]))))
just takes a few seconds.
The function function(x) x[1] + x[2] is just a placeholder. The actual function I have in mind is slightly different.
This answer is an expanded version of my earlier comments which I have now deleted.
zoo’s rollapply already supports plain vectors and matrices. Furthermore, its rollapply routine extracts the plain vectors or matrices from a zoo object before operating on it, so there is no reason for a zoo object to take materially longer than a non-zoo object. The slowness you observed was a bug in rollapply (the extraction was not taking place properly) that was fixed in early November in the development version. This version is on R-Forge and can be installed from there.
On the other hand, the generality of rollapply means it's going to be much slower than special-purpose routines or vectorized operations.
zoo does have some specialized versions of rollapply (rollmean, rollmedian, rollmax) that are optimized for particular operations and will be much faster. If you can manufacture something out of those, e.g. a rolling sum of k terms is the same as k times a rolling mean, then you can get substantial speedups. Faster still will be manufacturing the rolling result from plain operations such as +.
The post indicated that the function in question was just an example, but the particular function could make a big difference in terms of speed since it will affect whether the sorts of speedups discussed are available.
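As a rough sketch of the three routes described above, the snippet below computes the same width-2 rolling sum with rollapply on a plain vector, with k times rollmean, and with plain vectorized addition. The vector x is made-up data standing in for SPYTS[, "Close"], and the vectorized form shown only covers the k = 2 case from the question.

```r
library(zoo)  # provides rollapply() and rollmean()

set.seed(1)
x <- cumsum(rnorm(1000)) + 100  # hypothetical stand-in for SPYTS[, "Close"]
k <- 2                          # window width used in the question

# 1. General but slowest: rollapply on a plain numeric vector, no zoo() wrapper
r1 <- rollapply(x, k, sum, fill = 0, align = "right")

# 2. Specialized: a rolling sum of k terms is k times the rolling mean
r2 <- k * rollmean(x, k, fill = 0, align = "right")

# 3. Plain vectorized arithmetic (k = 2 only): previous value plus current value
r3 <- c(0, x[-length(x)] + x[-1])

# All three should agree up to floating-point tolerance
all.equal(as.numeric(r1), as.numeric(r2))
all.equal(as.numeric(r1), as.numeric(r3))
```

Wrapping each of the three assignments in system.time() is one way to compare them on your own data; the gap is likely to depend on the zoo version installed and on the function being applied.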
For example, running 3 replications of each of rollapply, 2 * rollmean and a simple vectorized addition shows this: