if xmpl is a list where each element has an integer age and a list data, where data contains three matrices of equal size, a to c
What is the best way to do
cor( xmpl[[:]]$data[[:]][c('a','b','c')], xmpl[[:]]$age)
where the results would be 3 x length(a) array or list that reflects age correlated with each instance of each element of a (row 1), b (row 2), and c (row 3) across xmpl.
I am reading in matrices that represent the output of different pipelines. There are 3 of these per subject and a whole lot of subjects. Currently, I’ve built a list of subjects that has among other things a list of pipeline matrices.
The structure looks like:
str(exmpl)
$ :List of 4
..$ id : int 5
..$ age : num 10
..$ data :List of 3
.. ..$ a: num [1:10, 1:10] 0.782 1.113 3.988 0.253 4.118 ...
.. ..$ b: num [1:10, 1:10] 5.25 5.31 5.28 5.43 5.13 ...
.. ..$ c: num [1:10, 1:10] 1.19e-05 5.64e-03 7.65e-01 1.65e-03 4.50e-01 ...
..$ otherdata: chr "ignorefornow"
#[...]
I want to correlate every element of a across all subjects with the age of subjects. Then do the same for b and c and put the results into a list.
I think I am approaching this in a way that is awkward for R. I’m interested in what the “R way” of storing and retrieving this data would be.
Data Structure and desired output http://dl.dropbox.com/u/56019781/linked/struct-2012-12-19.svg
library(plyr)
## example structure
xmpl.mat <- function(){ matrix(runif(100),nrow=10) }
xmpl.list <- function(x){ list( id=x, age=2*x, data=list( a=x*xmpl.mat(), b=x+xmpl.mat(), c=xmpl.mat()^x ), otherdata='ignorefornow' ) }
xmpl <- lapply( 1:5, xmpl.list )
## extract
ages <- laply(xmpl,'[[','age')
data <- llply(xmpl,'[[','data')
# to get the cor for one set of matrices is easy enough
# though it would be nice to do: a <- xmpl[[:]]$data$a
x.a <- sapply(data,'[[','a')
x.a.corr <- apply(x.a,1,cor,ages)
# ...
#xmpl.corr <- list(x.a.corr,x.b.corr,x.c.corr)
# and by loop, not R like?
xmpl.corr<-list()
for (i in 1:length(names(data[[1]])) ){
x <- sapply(data,'[[',i)
xmpl.corr[[i]] <- apply(x,1,cor,ages)
}
names(xmpl.corr) <- names(data[[1]])
Final output:
str(xmpl.corr)
List of 3
$ a: num [1:100] 0.712 -0.296 0.739 0.8 0.77 ...
$ b: num [1:100] 0.98 0.997 0.974 0.983 0.992 ...
$ c: num [1:100] -0.914 -0.399 -0.844 -0.339 -0.571 ..
Here’s a solution. It should be short enough.