Suppose I have a list with observations:
foo <- list(c("C", "E", "A", "F"),
            c("B", "D", "B", "A", "C"),
            c("B", "C", "C", "F", "A", "F"),
            c("D", "A", "A", "D", "D", "F", "B"))
> foo
[[1]]
[1] "C" "E" "A" "F"
[[2]]
[1] "B" "D" "B" "A" "C"
[[3]]
[1] "B" "C" "C" "F" "A" "F"
[[4]]
[1] "D" "A" "A" "D" "D" "F" "B"
And a vector with each unique element:
vec <- LETTERS[1:6]
> vec
[1] "A" "B" "C" "D" "E" "F"
I want to obtain a data frame with the counts of each element of vec in each element of foo. I can do this with plyr in a very ugly unvectorized way:
> library(plyr)
> ldply(foo, function(x) sapply(vec, function(y) sum(y == x)))
  A B C D E F
1 1 0 1 0 1 1
2 1 2 1 1 0 0
3 1 1 2 0 0 2
4 2 1 0 3 0 1
But that’s obviously slow. How can this be done faster? I know of table(), but I haven’t figured out how to use it here, because some elements of vec have a count of 0 in some elements of foo.
One solution (off the top of my head):
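A minimal sketch of one possibility (not necessarily the exact approach intended above): converting each element of foo to a factor whose levels are vec makes table() report a 0 for values that do not occur. The names counts, res, idx, tab and res2 are only illustrative. The second variant assumes R >= 3.2.0 for lengths() and replaces the per-element loop with a single table() call.

```r
foo <- list(c("C", "E", "A", "F"),
            c("B", "D", "B", "A", "C"),
            c("B", "C", "C", "F", "A", "F"),
            c("D", "A", "A", "D", "D", "F", "B"))
vec <- LETTERS[1:6]

# Variant 1: one table() per list element.
# factor(x, levels = vec) keeps every level of vec, so absent
# values are counted as 0 instead of being dropped.
counts <- t(sapply(foo, function(x) table(factor(x, levels = vec))))
res <- as.data.frame(counts)

# Variant 2: a single two-way table() over the flattened list.
# idx records which element of foo each observation came from.
idx <- rep(seq_along(foo), lengths(foo))
tab <- table(idx, factor(unlist(foo), levels = vec))
res2 <- as.data.frame.matrix(tab)

print(res)
print(res2)
```

Both variants should reproduce the counts from the ldply() call in the question. Which one is faster will depend on the size of foo, so it is worth timing them on the real data, for example with system.time().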