For each level of a factor, I need to extract values aggregated over all subsets of a data.frame except the current one. For example, several subjects do a reaction time task over several days, and for each subject and day I need the mean reaction time of all the other subjects on that day, i.e. excluding the subject for whom the mean is computed. Currently, I do it like this:
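library(lme4)  # provides the sleepstudy data set
library(plyr)  # provides ddply() and summarise()
# For each Subject/Days group, average Reaction over the other subjects on the same day
ddply(sleepstudy, .(Subject, Days), summarise,
      avg_rt = mean(sleepstudy[sleepstudy$Subject != Subject &
                               sleepstudy$Days == Days, "Reaction"]),
      .progress = "text")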
It works fine for small data sets, but for large ones it can be very slow. Is there a way to do it faster?
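One way to avoid subsetting the whole data.frame once per group is to compute the per-day totals a single time and then subtract each row's own value. The following is only a minimal sketch of that idea, not a benchmarked solution. It assumes exactly one row per Subject/Days combination (which holds for sleepstudy); the names day_sum, day_n and res are just illustrative.

```r
library(lme4)  # provides the sleepstudy data set

# Per-day sum and row count of Reaction, repeated so they align with the rows
day_sum <- ave(sleepstudy$Reaction, sleepstudy$Days, FUN = sum)
day_n   <- ave(sleepstudy$Reaction, sleepstudy$Days, FUN = length)

# Leave-one-out mean: drop the row's own value from the day total.
# Assumption: one row per Subject/Days pair, so "own subject" == "own row".
res <- data.frame(
  Subject = sleepstudy$Subject,
  Days    = sleepstudy$Days,
  avg_rt  = (day_sum - sleepstudy$Reaction) / (day_n - 1)
)

head(res)
```

If a subject can have several rows per day, the same trick should still work by subtracting the per-Subject/Days sum and count instead of the single row value.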
For really large datasets the speed gain should be more pronounced. I just couldn't compare with larger datasets since ddply is so slow.