I currently lag panel data using data.table in the following manner:
require(data.table)
x <- data.table(id=1:10, t=rep(1:10, each=10), v=1:100)
setkey(x, id, t) #so that things are in increasing order
x[, lag_v := c(NA, head(v, -1)), by = id] # head(v, -1) also works when a group has only one row
Is there a better way to do this? I found a suggestion online to use a cross-join, which makes sense. However, a cross-join would generate a fairly large data.table for a large dataset, so I am hesitant to use it.
I’m not sure this is much different from your approach, but you can use the fact that x is keyed by id and join on it rather than grouping with by. I have not tested whether this is faster than by, especially since the table is already keyed.
Or, use the fact that t (don’t use function names as variable names!) is the time id and look up the row at t - 1 for each id. But again, using a double join here seems inefficient.
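The code from this answer did not survive, so the following is only a minimal sketch of the join idea described above, not necessarily what was originally posted. It assumes x is keyed by (id, t) as in the question; the column names lag_by and lag_join are made up for illustration.

```r
library(data.table)

# Same toy panel as in the question: 10 ids observed at t = 1..10
x <- data.table(id = 1:10, t = rep(1:10, each = 10), v = 1:100)
setkey(x, id, t)

# Grouped approach from the question, kept for comparison
x[, lag_by := c(NA, head(v, -1)), by = id]

# Join approach: for every row, look up the row with the same id and
# time t - 1 in the keyed table; rows with no match return NA
lagged <- x[.(x$id, x$t - 1L), v]
x[, lag_join := lagged]

# Compare the two columns on this balanced panel
identical(x$lag_by, x$lag_join)
```

One difference worth keeping in mind: the grouped version lags by position within each id, while the join lags by the value of t. If a time period is missing for some id, the join gives NA for the following period, whereas the positional lag silently picks up the previous observed row.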