Editorial Team
Asked: June 11, 2026


I have a data.table that contains some groups. I operate on each group and some groups return numbers, others return NA. For some reason data.table has trouble putting everything back together. Is this a bug or am I misunderstanding? Here is an example:

dtb <- data.table(a=1:10)
f <- function(x) {if (x==9) {return(NA)} else { return(x)}}
dtb[,f(a),by=a]

Error in `[.data.table`(dtb, , f(a), by = a) : 
  columns of j don't evaluate to consistent types for each group: result for group 9 has column 1 type 'logical' but expecting type 'integer'

My understanding was that NA is compatible with numbers in R since clearly we can have a data.table that has NA values. I realize I can return NULL and that will work fine but the issue is with NA.

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 7:13 pm

    From ?NA

    NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.
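
To make the quoted help text concrete, here is a small base R sketch that inspects the types of the NA constants. The variable names (na_types, mixed_type, same_object) are only illustrative and are not part of the original question.

```r
# Types of the NA constants mentioned in ?NA
na_types <- c(
  plain     = typeof(NA),            # the bare constant is logical
  integer   = typeof(NA_integer_),
  real      = typeof(NA_real_),
  character = typeof(NA_character_)
)
print(na_types)

# Inside c(), a logical NA is coerced to the type of its neighbours
mixed_type <- typeof(c(1L, NA))

# But on its own, a bare NA is not the same object as NA_integer_
same_object <- identical(NA, NA_integer_)
```

So a bare NA returned on its own from a group is a logical vector, which is what the error message in the question is complaining about.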

    You will have to specify the correct NA type for your function to work.

    You can coerce within the function to match the type of x. Note that any() is needed for this to work when a subset has more than one row.

    f <- function(x) {if (any(x == 9)) {return(as(NA, class(x)))} else {return(x)}}
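
As a minimal sketch of the grouped call from the question with a typed NA, assuming the data.table package is installed: f_int below is a hypothetical variant of f that hard-codes NA_integer_ because column a is known to be integer. Behaviour with an untyped NA may differ between data.table versions, so this only illustrates the typed case.

```r
library(data.table)

dtb <- data.table(a = 1:10)

# Hypothetical variant of f: returns an integer NA so that every group
# yields the same column type (integer)
f_int <- function(x) {
  if (any(x == 9)) {
    return(NA_integer_)
  }
  x
}

# One result row per group; the unnamed j expression becomes column V1
res <- dtb[, f_int(a), by = a]
print(res)
```

Each group here has a single row, so returning a length-1 NA matches the length of the other groups' results.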
    

    More data.table*ish* approach

    It might make more data.table sense to use set (or :=) to set / replace by reference.

    set(dtb, i = which(dtb[,a]==9), j = 'a', value=NA_integer_)
    

    Or := within [ using a vector scan for a==9

    dtb[a == 9, a := NA_integer_]
    

    Or := along with a binary search

    setkeyv(dtb, 'a')
    dtb[J(9), a := NA_integer_] 
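
A quick way to check what the by-reference approaches do is to look at the column afterwards. This is a small sketch, assuming data.table is installed; col_type and na_rows are illustrative names only.

```r
library(data.table)

dtb <- data.table(a = 1:10)

# Replace by reference using a vector scan and a typed NA
dtb[a == 9, a := NA_integer_]

col_type <- typeof(dtb$a)        # the column type should be unchanged
na_rows  <- which(is.na(dtb$a))  # positions that were set to NA
print(dtb)
```

Because the assignment happens by reference, dtb itself is modified and no copy is returned for reassignment.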
    

    Useful to note

    If you use := with a vector scan, or set, you don’t appear to need to specify the NA type.

    Both of the following will work:

    dtb <- data.table(a=1:10)
    setkeyv(dtb,'a')
    dtb[a==9,a := NA]
    
    dtb <- data.table(a=1:10)
    setkeyv(dtb,'a')
    set(dtb, which(dtb[,a] == 9), 'a', NA)
    

    By contrast, the keyed lookup with an untyped NA (the DTc[J(9), a := NA] call named in the message below) fails, with a very useful error message that tells you the reason and the solution:

    Error in `[.data.table`(DTc, J(9), `:=`(a, NA)) :
      Type of RHS (‘logical’) must match LHS (‘integer’). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)


    Which is quickest?

    Timings on a reasonably large data set, where a is replaced in situ:

    library(data.table)
    
    set.seed(1)
    n <- 1e+07
    DT <- data.table(a = sample(15, n, T))
    setkeyv(DT, "a")
    DTa <- copy(DT)
    DTb <- copy(DT)
    DTc <- copy(DT)
    DTd <- copy(DT)
    DTe <- copy(DT)
    
    f <- function(x) {
        if (any(x == 9)) {
            return(as(NA, class(x)))
        } else {
            return(x)
        }
    }
    
    system.time({DT[a == 9, `:=`(a, NA_integer_)]})
    ##    user  system elapsed 
    ##    0.95    0.24    1.20 
    system.time({DTa[a == 9, `:=`(a, NA)]})
    ##    user  system elapsed 
    ##    0.74    0.17    1.00 
    system.time({DTb[J(9), `:=`(a, NA_integer_)]})
    ##    user  system elapsed 
    ##    0.02    0.00    0.02 
    system.time({set(DTc, which(DTc[, a] == 9), j = "a", value = NA)})
    ##    user  system elapsed 
    ##    0.49    0.22    0.67 
    system.time({set(DTd, which(DTd[, a] == 9), j = "a", value = NA_integer_)})
    ##    user  system elapsed 
    ##    0.54    0.06    0.58 
    system.time({DTe[, `:=`(a, f(a)), by = a]})
    ##    user  system elapsed 
    ##    0.53    0.12    0.66 
    # They are all the same!
    all(identical(DT, DTa), identical(DT, DTb), identical(DT, DTc), identical(DT, 
        DTd), identical(DT, DTe))
    ## [1] TRUE
    

    Unsurprisingly, the binary search approach is the fastest.
