I am trying to run a jack-knife using plyr. I have a large dataset (715 sites over 10 years). I have already calculated the species richness (the count of all species present) in a square for each year, but now I want to recalculate richness after removing one species at a time, and have all the results in one dataset.
Example data:
Site <- c(1,1,1,1,1,1)
Year <- c(96,96,96,97,97,97)
SpID <- c(1,2,3,1,2,3)
Count <- c(1,1,1,1,1,1)
data <- data.frame(Site, Year, SpID, Count)
So overall for Site 1 the species richness is 3 in both years. If I recalculate this without one of the species, it would be 2.
I have tried using the following code:
foo <- function(z) {
  data2 <- subset(data, SpID != (z))
  summaryBy(Count ~ Year + Site,
            data = data2,
            FUN = function(x) { c(l = length(x)) })
}
richall <- ddply(data, .(SpID), foo)
But I’m obviously making a mistake somewhere! Any thoughts?
With your example data and call to ddply, this is what will happen:
1. ddply will find the different values in the SpID column of your dataset (1, 2 and 3).
2. It will create a data.frame for each of these unique values. These data.frames will hold only the rows for which the SpID is equal to that unique value (so: a data.frame with the first and fourth rows, one with the second and fifth, and one with the third and last rows).
3. It will call your function foo with these data.frames, one at a time, as its first argument.
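To see the pieces for yourself, here is a rough sketch that mimics the grouping step with base R's split. It assumes the example data is held in a data.frame that includes the Count column; the name pieces is just for illustration, and this is not literally what ddply runs internally.

```r
# Example data as a data.frame (assumption: Count is included as a column)
data <- data.frame(
  Site  = c(1, 1, 1, 1, 1, 1),
  Year  = c(96, 96, 96, 97, 97, 97),
  SpID  = c(1, 2, 3, 1, 2, 3),
  Count = c(1, 1, 1, 1, 1, 1)
)

# One data.frame per unique SpID, similar to the grouping ddply does
pieces <- split(data, data$SpID)

# Each piece holds only the rows of a single species,
# e.g. the piece for SpID 1 is made of the first and fourth rows
pieces[["1"]]
```

Each element of pieces is what foo would receive as z: a whole data.frame for one species, not a species ID.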
So it is rather obvious now that this will not help in doing a jack-knife: foo receives a data.frame rather than a species ID, and each of those data.frames contains only one species instead of all but one. In fact I don’t see an obvious way of attaining that with plyr. In this particular case you’re probably better off rigging your own with similar logic. You can then recombine your results with e.g. do.call (see ?do.call).
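As a minimal sketch of such a hand-rolled version (not necessarily the best way to do it), one could loop over the unique species IDs with lapply, drop each species in turn, and count the species left per Site and Year. The helper name richness_without and the output columns Richness and Removed are made up for this example, and it uses base R's aggregate instead of summaryBy so that no extra packages are needed.

```r
# Example data as a data.frame (assumption: Count is included as a column)
data <- data.frame(
  Site  = c(1, 1, 1, 1, 1, 1),
  Year  = c(96, 96, 96, 97, 97, 97),
  SpID  = c(1, 2, 3, 1, 2, 3),
  Count = c(1, 1, 1, 1, 1, 1)
)

# Hypothetical helper: richness per Site and Year after dropping species `sp`
richness_without <- function(sp, d) {
  kept <- d[d$SpID != sp, ]
  # Count the distinct species remaining in each Site/Year combination
  res <- aggregate(SpID ~ Site + Year, data = kept,
                   FUN = function(x) length(unique(x)))
  names(res)[names(res) == "SpID"] <- "Richness"
  res$Removed <- sp  # record which species was left out
  res
}

# One result per left-out species, then stack them into a single data.frame
per_species <- lapply(unique(data$SpID), richness_without, d = data)
richall <- do.call(rbind, per_species)
```

With the example data, richall has one row per Site, Year and removed species. Note that a Site/Year combination containing only the removed species would simply be missing from the output rather than showing a richness of 0.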