I have the following data.table in R:
library(data.table)
DT = data.table(x=rep(c("b","a","c"),each=3), y=sample(rnorm(9)), v=1:9)
I want to compute the minimum and the maximum of y within each group of x and add them to DT as two new columns. Here is my attempt:
DT[,c("e","d"):= list(min(y),max(y)), with=FALSE, by = x]
Error in `[.data.table`(DT, , `:=`(c("e", "d"), list(min(y), max(y))), :
'with' must be TRUE when 'by' or 'keyby' is provided
However, if I drop the by argument and write DT[,c("e","d"):= list(min(y),max(y)), with=FALSE], I get this:
   x          y v       e       d
1: a -1.7125000 4 -1.7125 1.30553
2: a  1.0198038 5 -1.7125 1.30553
3: a  1.3055301 6 -1.7125 1.30553
4: b -0.9238759 1 -1.7125 1.30553
5: b  0.3077016 2 -1.7125 1.30553
6: b -1.2580845 3 -1.7125 1.30553
7: c -0.9399120 7 -1.7125 1.30553
8: c -0.1910583 8 -1.7125 1.30553
9: c  0.1239158 9 -1.7125 1.30553
As you can see, this runs, but it does not do the computation by x: e and d hold the overall minimum and maximum of y in every row. I want the same layout, but with e and d computed within each value of x. How can I do this?
":= by group" (new in version 1.8.2) and ":= with multiple new columns" (new in version 1.7.8) are both relatively recent additions to data.table. ":= by group with multiple new columns" just hasn't (yet) been implemented.
So for now, you can either do this (if you want a one-liner):
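A minimal sketch of what such a one-liner could look like, not necessarily the exact code this answer had in mind: aggregate once per group, then join the result back onto DT with merge(). The set.seed() call is an assumption added only to make the example reproducible.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = sample(rnorm(9)), v = 1:9)

# Compute min and max of y once per value of x, then join that
# per-group table back onto DT. This builds a new table rather
# than updating DT by reference, and the result is sorted by x.
DT <- merge(DT, DT[, list(e = min(y), d = max(y)), by = x], by = "x")
```

The cost of this approach is the extra copy of DT that the join creates.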
or this (if you want to minimize extra copying operations):
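A sketch of the second option, assuming a data.table version with ":= by group" (1.8.2 or later): add the columns one at a time, each with its own grouped := call. As before, set.seed() is an assumption added only for reproducibility.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = sample(rnorm(9)), v = 1:9)

# Each call adds a single column by reference, computed within
# each group of x, so DT itself is never copied.
DT[, e := min(y), by = x]
DT[, d := max(y), by = x]
```

Later data.table releases do support multiple new columns together with by, so on a current version DT[, c("e","d") := list(min(y), max(y)), by = x] (without with=FALSE) should work directly.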