The “apply” documentation mentions that “Where ‘X’ has named dimnames, it can be a character vector selecting dimension names.” I would like to use apply on a data.frame for only particular columns. Can I use the dimnames feature to do this?
I realize I can subset() X to only include the columns of interest, but I want to understand “named dimnames” better.
Below is some sample code:
> x <- data.frame(cbind(1,1:10))
> apply(x,2,sum)
X1 X2
10 55
> apply(x,c('X2'),sum)
Error in apply(x, c("X2"), sum) : 'X' must have named dimnames
> dimnames(x)
[[1]]
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
[[2]]
[1] "X1" "X2"
> names(x)
[1] "X1" "X2"
> names(dimnames(x))
NULL
If I understand you correctly, you would like to use apply only on certain columns. This is not what named dimnames accomplish. The apply function on a matrix or data.frame always applies to all the rows or all the columns. Named dimnames let you select rows or columns by the name of the dimension instead of the usual 1 and 2.
However, if you have the names of the columns you would like to apply to, you can first select only those columns and then call apply on the result.
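A minimal sketch of both points, using the data from the question. The dimension labels "row" and "col" and the variable names are arbitrary choices for illustration, not anything apply requires:

```r
# Same values as in the question, kept as a matrix so that it
# carries a real "dimnames" attribute whose components can be named.
m <- cbind(1, 1:10)
dimnames(m) <- list(row = as.character(1:10), col = c("X1", "X2"))

# MARGIN given as a dimension name; this selects the second dimension,
# so it is the same call as apply(m, 2, sum).
by_name  <- apply(m, "col", sum)
by_index <- apply(m, 2, sum)

# To restrict apply to particular columns, subset first.
x <- data.frame(cbind(1, 1:10))
wanted <- c("X2")  # hypothetical selection of column names
only_x2 <- apply(x[, wanted, drop = FALSE], 2, sum)
```

Note that the character MARGIN names a dimension ("col"), not a column ("X2"), which is why the call in the question could not have selected a single column even if the dimnames had been named. The drop = FALSE keeps the one-column selection two-dimensional so that apply still has a column margin to work on.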
Named dimnames are a side effect of the fact that dimnames are stored as a list in the “dimnames” attribute of a matrix or array. Each component of the list corresponds to one dimension and can be named. This is probably more useful for multidimensional arrays…
For a data.frame, there is no “dimnames” attribute. A data.frame is essentially a list, so the list’s “names” attribute corresponds to the column names, and an extra “row.names” attribute corresponds to the row names. Because of this, there is no place to store the names of the dimnames (they could have added an extra attribute for that, of course, but they didn’t). When you call the dimnames function on a data.frame, it simply creates a list from the “row.names” and “names” attributes.
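The difference in storage can be checked directly. The following is only a sketch; the variable names and the "row"/"col" labels are made up for illustration:

```r
x <- data.frame(cbind(1, 1:10))

# A data.frame carries "names", "row.names" and "class" attributes,
# but no "dimnames" attribute.
has_dimnames_attr <- "dimnames" %in% names(attributes(x))

# dimnames() on a data.frame is assembled on the fly from those two attributes.
same <- identical(dimnames(x), list(row.names(x), names(x)))

# A matrix does store a "dimnames" list, so its components can be named.
m <- as.matrix(x)
names(dimnames(m)) <- c("row", "col")  # arbitrary labels
col_sums <- apply(m, "col", sum)
```

Converting with as.matrix and naming the dimnames is one way to make the character form of MARGIN usable on data that started out as a data.frame.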