Suppose I have a large matrix:
M <- matrix(rnorm(1e7),nrow=20)
Further suppose that each column represents a sample. If I would like to apply t.test() to each column, is there a way to do this that is much faster than using apply()?
apply(M, 2, t.test)
It took slightly less than 2 minutes to run the analysis on my computer:
> system.time(invisible(apply(M, 2, t.test)))
   user  system elapsed
113.513   0.663 113.519
If you have a multicore machine, there are some gains from using all the cores, for example with mclapply.
This mini-example shows that things go as we planned. Now scale up:
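A minimal sketch of the mclapply approach, assuming the parallel package that ships with R. The reduced matrix (M_small) and the core count (n_cores) are illustrative choices, not the settings behind the timings discussed here.

```r
library(parallel)

set.seed(1)
# Reduced stand-in for M: 20 observations per column, 1000 columns
M_small <- matrix(rnorm(20 * 1000), nrow = 20)

# mclapply relies on forking, which Windows lacks, so fall back to one core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Run t.test on each column in parallel; the result is a list of "htest" objects
res_par <- mclapply(seq_len(ncol(M_small)),
                    function(j) t.test(M_small[, j]),
                    mc.cores = n_cores)

# Serial reference, to confirm the parallel run gives the same statistics
t_par <- vapply(res_par, function(r) unname(r$statistic), numeric(1))
t_ser <- vapply(seq_len(ncol(M_small)),
                function(j) unname(t.test(M_small[, j])$statistic),
                numeric(1))
stopifnot(isTRUE(all.equal(t_par, t_ser)))
```

To scale up, swap M_small for the full matrix and raise n_cores to whatever the machine offers; parallel::detectCores() reports the available count.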
This is using 8 virtual cores. Your mileage may vary. Not a huge gain, but it comes from very little effort.
EDIT
If you only care about the t-statistic itself, extracting the corresponding field ($statistic) makes things a bit faster, in particular in the multicore case:
Or even faster, compute the t value directly:
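Both variants can be sketched as follows, assuming a one-sample test against a mean of zero (the t.test default). The helper name direct_t and the reduced matrix M_small are hypothetical, chosen only for illustration.

```r
set.seed(1)
# Reduced stand-in for M: 20 observations per column, 1000 columns
M_small <- matrix(rnorm(20 * 1000), nrow = 20)

# Variant 1: keep only the t statistic from each htest object
t_field <- apply(M_small, 2, function(x) t.test(x)$statistic)

# Variant 2 (hypothetical helper): one-sample t statistic for H0: mean = 0,
# i.e. t = mean(x) / (sd(x) / sqrt(n))
direct_t <- function(x) mean(x) * sqrt(length(x)) / sd(x)
t_direct <- apply(M_small, 2, direct_t)

# The two routes agree up to floating-point error
stopifnot(isTRUE(all.equal(unname(t_field), t_direct)))
```

The direct version skips the p-value, confidence interval and other bookkeeping that t.test performs on every call, which is where the saving would come from.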
Then apply it to each column as before.
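A sketch of timing a directly computed t value across all columns, serially and with mclapply. The helper direct_t, the reduced matrix M_small and the core count n_cores are assumptions for illustration; no timings are quoted because they depend on the machine.

```r
library(parallel)

set.seed(1)
# Reduced stand-in for M: 20 observations per column, 5000 columns
M_small <- matrix(rnorm(20 * 5000), nrow = 20)

# Hypothetical helper: one-sample t statistic for H0: mean = 0
direct_t <- function(x) mean(x) * sqrt(length(x)) / sd(x)

# Serial: one call per column
time_serial <- system.time(t_serial <- apply(M_small, 2, direct_t))

# Multicore: same helper via mclapply (forking is unavailable on Windows)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
time_par <- system.time(
  t_par <- unlist(mclapply(seq_len(ncol(M_small)),
                           function(j) direct_t(M_small[, j]),
                           mc.cores = n_cores))
)

print(time_serial)
print(time_par)
```

With a per-column function this cheap, the cost of forking and collecting results may eat into the multicore gain, so it is worth measuring both on the real data.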