I have a data.table that contains some groups. I operate on each group and some groups return numbers, others return NA. For some reason data.table has trouble putting everything back together. Is this a bug or am I misunderstanding? Here is an example:
library(data.table)
dtb <- data.table(a=1:10)
f <- function(x) {if (x==9) {return(NA)} else { return(x)}}
dtb[,f(a),by=a]
Error in `[.data.table`(dtb, , f(a), by = a) :
columns of j don't evaluate to consistent types for each group: result for group 9 has column 1 type 'logical' but expecting type 'integer'
My understanding was that NA is compatible with numbers in R since clearly we can have a data.table that has NA values. I realize I can return NULL and that will work fine but the issue is with NA.
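From `?NA`: `NA` is a logical constant of length 1, and the typed constants `NA_integer_`, `NA_real_`, `NA_complex_` and `NA_character_` exist for the other atomic vector types.
You will have to specify the correct type for your function to work.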
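As a minimal sketch of what "specifying the correct type" could look like for the example in the question: the helper names `f_typed` and `f_coerce` are made up for illustration. The first returns a typed `NA` constant. The second coerces `NA` to the class of its input, assuming `x` is a basic atomic type such as integer.

```r
library(data.table)

dtb <- data.table(a = 1:10)

# Hypothetical helper 1: return an NA that is already an integer,
# so every group yields the same column type.
f_typed <- function(x) {
  if (any(x == 9)) NA_integer_ else x
}

# Hypothetical helper 2: coerce NA to whatever class x has,
# so the function is not tied to integer columns.
f_coerce <- function(x) {
  if (any(x == 9)) as(NA, class(x)) else x
}

res_typed  <- dtb[, f_typed(a),  by = a]
res_coerce <- dtb[, f_coerce(a), by = a]
```

Both calls should return a two-column table (`a` and `V1`) in which `V1` is an integer column with a single `NA` for the group where `a` is 9.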
You can coerce within the function to match the type of `x` (note we need `any` for this to work for situations with more than 1 row in a subset!).

More data.table*ish* approach
It might make more data.table sense to use `set` (or `:=`) to set / replace by reference.
Or `:=` within `[` using a vector scan for `a==9`.
Or `:=` along with a binary search.

Useful to note
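For reference, the three by-reference options mentioned just before this note can be sketched as follows, each with an explicitly typed `NA_integer_`. The table names `dtb_set`, `dtb_scan` and `dtb_key` are illustrative only, and each option starts from a fresh copy so they do not interfere with one another.

```r
library(data.table)

# 1. set(): replace by reference at the row numbers where a is 9.
dtb_set <- data.table(a = 1:10)
set(dtb_set, i = which(dtb_set[["a"]] == 9L), j = "a", value = NA_integer_)

# 2. := inside [ with a logical condition on a
#    (a vector scan; recent data.table versions may optimise this internally).
dtb_scan <- data.table(a = 1:10)
dtb_scan[a == 9L, a := NA_integer_]

# 3. := with a binary search: key the table on a, then join on the value 9.
dtb_key <- data.table(a = 1:10)
setkeyv(dtb_key, "a")
dtb_key[J(9L), a := NA_integer_]
```

In all three cases the column `a` stays an integer column and only the row that held 9 becomes `NA`.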
If you use the `:=` or `set` approaches, you don’t appear to need to specify the `NA` type.
Both the following will work.
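A minimal sketch of two such untyped variants, assuming a recent data.table version in which a plain logical `NA` is coerced to the column's type when assigned by reference (older versions may behave differently). The names `dtb_a` and `dtb_b` are illustrative.

```r
library(data.table)

# := with an untyped (logical) NA on the right-hand side.
dtb_a <- data.table(a = 1:10)
dtb_a[a == 9L, a := NA]

# set() with an untyped (logical) NA as the value.
dtb_b <- data.table(a = 1:10)
set(dtb_b, i = which(dtb_b[["a"]] == 9L), j = "a", value = NA)
```

Under that assumption, the column keeps its integer type after either assignment.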
This gives a very useful error message that lets you know the reason and solution:
Which is quickest

With a reasonably large data set where `a` is replaced in situ:

Replace in situ
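The comparison could be set up roughly like this. It is only a sketch: the table size `n`, the helper `make_dt` and the use of base `system.time` are assumptions, not the original benchmark, and timings will vary with machine and data.table version.

```r
library(data.table)

n <- 1e6L  # assumed size for a "reasonably large" table

# Hypothetical helper: build a fresh table for each approach.
make_dt <- function() data.table(a = seq_len(n))

# set() at the matching row numbers.
dt_set <- make_dt()
t_set <- system.time(
  set(dt_set, i = which(dt_set[["a"]] == 9L), j = "a", value = NA_integer_)
)

# := with a logical condition (vector scan).
dt_scan <- make_dt()
t_scan <- system.time(dt_scan[a == 9L, a := NA_integer_])

# := with a binary search; the cost of keying is not included in the timing.
dt_key <- make_dt()
setkeyv(dt_key, "a")
t_key <- system.time(dt_key[J(9L), a := NA_integer_])

# Elapsed seconds for each approach.
timings <- c(set = t_set[["elapsed"]],
             scan = t_scan[["elapsed"]],
             key = t_key[["elapsed"]])
print(timings)
```

A single run like this is noisy; repeating each timing several times, or using a dedicated benchmarking package, would give a fairer comparison.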
Unsurprisingly, the binary search approach is the fastest.