data.table objects now have a := operator. What makes this operator different from all other assignment operators? Also, what are its uses, how much faster is it, and when should it be avoided?
The NEWS on the homepage has an example showing 10 minutes reduced to 1 second. Using `:=` is like subassigning to a `data.frame`, but it doesn't copy the entire table each time. Putting `:=` in `j` also allows more idioms, because the assignment can be combined with the other arguments of `DT[...]`.
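The snippets that originally accompanied this paragraph are not preserved here, so the following is only a minimal sketch of the kind of idioms that `:=` in `j` makes possible. The table `DT` and its column names (`group`, `v`, `newcol`, `done`, `total`) are made up for illustration; `address()` is the helper exported by data.table.

```r
library(data.table)

# Hypothetical toy table; the column names are invented for this sketch.
DT <- data.table(group = c("a", "a", "b", "b"), v = 1:4)
addr_before <- address(DT)

DT[, newcol := 42]                  # add a column by reference
DT[group == "a", done := TRUE]      # sub-assign only to rows matched by i; other rows get NA
DT[, total := sum(v), by = group]   # grouped assignment, recycled within each group
DT[, newcol := NULL]                # remove a column by reference

# The same object was modified in place rather than copied.
same_object <- identical(addr_before, address(DT))
```

None of these calls needs a `DT <- ...` reassignment, which is the practical difference from subassigning to a `data.frame`.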
I can't think of any reason to avoid `:=`, other than inside a `for` loop. Since `:=` appears inside `DT[...]`, it comes with the small overhead of the `[.data.table` method, e.g. S3 dispatch and checking for the presence and type of arguments such as `i`, `by`, `nomatch`, etc. So for use inside `for` loops there is a low-overhead, direct version of `:=` called `set`. See `?set` for more details and examples. The disadvantages of `set` are that `i` must be row numbers (no binary search) and that you can't combine it with `by`. By imposing those restrictions, `set` can reduce the overhead dramatically.
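As a rough sketch of the difference, the two loops below perform the same kind of row-by-row update, first with `:=` and then with `set`. The table and values are hypothetical, and no timings are claimed here; `?set` has benchmarks.

```r
library(data.table)

DT <- data.table(a = 1:5, b = 0L)   # hypothetical toy table

# := inside a loop: every iteration goes through the [.data.table method.
for (k in 1:5) DT[k, b := k]

# set(): i must be row numbers and j a column name or number; no by.
for (k in 1:5) set(DT, i = k, j = "b", value = k * 10L)
```

For a handful of iterations the difference is negligible; it only starts to matter when the loop runs many thousands of times.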