I'm trying to speed up a function that computes the median value for every day in a time period, separately for each group. Is there a faster way than calling sapply inside a for loop?
all <- list()  # initialise the result; otherwise `all` is base::all and all[[z]] <- ... fails
for (z in unique(as.factor(df$group))) {
  all[[z]] <- sapply(period, function(x) median(df[x == df$date & df$group == z, 'y']))
}
Sample data:
date <- as.Date("2011-11-01") +
  floor(runif(1000,
              max = as.integer(
                as.Date("2012-12-31") -
                as.Date("2011-11-01"))))  # floor() keeps whole days so x == df$date can match
df <- data.frame(date = date, y = rnorm(1000), group = factor(rep(letters[1:4], each = 250)))
period <- seq(min(df$date), max(df$date), by = "day")  # must come after df is created
Here is a solution using the base R function tapply.
Update: Judging by your comment above, you need one column for each group? That's also a one-liner:
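A minimal sketch of what the tapply approach could look like, assuming whole-day dates and the column names from the question (date, y, group). This is an illustration, not necessarily the exact call from the answer; the names med and med_full are made up for the example, and sample() is used in place of runif() only to get whole days.

```r
# Sample data mirroring the question, with whole-day dates so equal days match.
set.seed(1)
n_days <- as.integer(as.Date("2012-12-31") - as.Date("2011-11-01"))
df <- data.frame(
  date  = as.Date("2011-11-01") + sample(0:n_days, 1000, replace = TRUE),
  y     = rnorm(1000),
  group = factor(rep(letters[1:4], each = 250))
)

# Median of y for every (date, group) pair in a single call.
# Rows are the dates present in the data, columns are the groups;
# combinations with no observations come back as NA.
med <- with(df, tapply(y, list(date, group), median))

# Optional: expand to every calendar day in the range, like `period` in the question.
period <- seq(min(df$date), max(df$date), by = "day")
med_full <- med[match(as.character(period), rownames(med)), , drop = FALSE]
rownames(med_full) <- as.character(period)
```

Here med is a matrix with one row per observed date and one column per group, which replaces both the for loop and the inner sapply. If a long table is more convenient, as.data.frame(as.table(med)) converts it to one row per date and group.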