Consider the following data in long format
library(plyr)
library(reshape2)
x <- seq(0,2*pi,length=20)
ll <- ll2 <- list(a = data.frame(x=x, y=sin(x)),
b = data.frame(x=x, y=cos(x)))
m <- melt(ll, id="x")
m2 <- m[sample(nrow(m)),]
head(m)
# x variable value L1
# 0.0000000 y 0.0000000 a
# 0.3306940 y 0.3246995 a
# 0.6613879 y 0.6142127 a
# 0.9920819 y 0.8371665 a
# 1.3227759 y 0.9694003 a
# 1.6534698 y 0.9965845 a
m$L1 is nicely ordered, but m2$L1 is all over the place. Now, starting with this kind of data, I want to take the difference value[L1 == "b"] - value[L1 == "a"] for each value of x. The following lines illustrate the problem of using diff when m2$L1 is not ordered: the sign can be wrong. Is there a trick I could use to achieve the result of the last two ddply calls, but more elegantly?
res <- ddply(m, "x", summarize, difference = diff(value))
res <- ddply(m2, "x", summarize, difference = diff(value)) # fails, because L1 not ordered
res <- ddply(m2[order(m2$L1, m2$x), ], "x", summarize, difference = diff(value))
res <- ddply(m2, "x", function(d)
data.frame(difference = d$value[d$L1 == "b"] - d$value[d$L1 == "a"]))
plot(res) # visual check of the result
lines(x, cos(x) - sin(x) , col="red")
Doesn’t
dcastdo what you want?