I have a data.frame called mydata and a vector ids containing the indices of the columns in the data.frame that I would like to convert to factors. The following code solves the problem:
for (i in ids) mydata[, i] <- as.factor(mydata[, i])
I wanted to clean this code up by using apply instead of an explicit for loop:
mydata[, ids] <- apply(mydata[, ids], 2, as.factor)
However, the last statement gives me a data.frame where the types are character instead of factors. I fail to see the distinction between these two lines of code. Why do they not produce the same result?
Kind regards,
Michael
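The result of apply is a vector, array, or list of values (see ?apply), not a data.frame. As ?apply notes, the result is coerced by as.vector before its dimensions are set, so the factors returned by as.factor end up as a character matrix. For your problem, you should use lapply instead and assign its result, a list of factors, back to the selected columns.
Notice that this is one place where lapply will be much faster than a for loop. In general a loop and lapply will have similar performance, but the [<-.data.frame operation is very slow. By using lapply one avoids calling [<-.data.frame in each iteration and replaces it with a single assignment, which is much faster.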
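To make the difference concrete, here is a minimal sketch comparing the two approaches. The example data is hypothetical (only the names mydata and ids come from the question), and the list-style indexing mydata[ids] is my own choice rather than something the original post states; it should behave like mydata[, ids] for several columns while also working when ids has length one.

```r
# Hypothetical example data; `mydata` and `ids` mirror the names in the question.
mydata <- data.frame(
  a = c("x", "y", "x"),
  b = c("u", "u", "v"),
  n = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)
ids <- c(1, 2)

# apply() hands back a character matrix, so the assigned columns stay character.
via_apply <- mydata
via_apply[, ids] <- apply(via_apply[, ids], 2, as.factor)
sapply(via_apply, class)

# lapply() returns a list of factors, assigned back in a single step.
via_lapply <- mydata
via_lapply[ids] <- lapply(via_lapply[ids], as.factor)
sapply(via_lapply, class)
```

Comparing the two sapply(..., class) results shows character columns after apply and factor columns after lapply, with the untouched numeric column unchanged in both.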