I have a dataframe for which I’ve calculated and added a difftime column:
name amount 1st_date 2nd_date days_out
JEAN 318.5 1971-02-16 1972-11-27 650 days
GREGORY 1518.5 <NA> <NA> NA days
JOHN 318.5 <NA> <NA> NA days
EDWARD 318.5 <NA> <NA> NA days
WALTER 518.5 1971-07-06 1975-03-14 1347 days
BARRY 1518.5 1971-11-09 1972-02-09 92 days
LARRY 518.5 1971-09-08 1972-02-09 154 days
HARRY 318.5 1971-09-16 1972-02-09 146 days
GARRY 1018.5 1971-10-26 1972-02-09 106 days
I want to break it out and take subtotals where days_out is 0-60, 61-90, 91-120, 121-180.
For some reason I can’t even reliably write bracket notation. I would expect
members[members$days_out<=120, ] to show just Barry and Garry, but I get a whole lot of lines like:
NA.1095 <NA> NA <NA> <NA> NA days
NA.1096 <NA> NA <NA> <NA> NA days
NA.1097 <NA> NA <NA> <NA> NA days
Those don’t exist in the original data. There’s no one without a name. What am I doing wrong here?
This is standard behavior for < and other relational operators: when asked to evaluate whether NA is less than (or greater than, or equal to, or …) some other number, they return NA, rather than TRUE or FALSE.
Here’s an example that should make clear what is going on and point to a simple fix.
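A minimal sketch of that kind of example, assuming a small made-up data frame (toy and its values are illustrative, not the asker's members data):

```r
# Hypothetical toy data standing in for `members`
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  days_out = as.difftime(c(92, NA, 650, 106), units = "days")
)

# Comparing NA with a number yields NA, not TRUE or FALSE
cmp <- toy$days_out <= 120
cmp
# TRUE NA FALSE TRUE

# An NA in a logical row index produces a row made up entirely of NAs,
# so this returns rows A and D plus one all-NA row
with_na_rows <- toy[toy$days_out <= 120, ]

# Fix 1: which() returns only the positions that are TRUE, dropping the NAs
fixed_which <- toy[which(toy$days_out <= 120), ]

# Fix 2: rule out NA explicitly before comparing
fixed_isna <- toy[!is.na(toy$days_out) & toy$days_out <= 120, ]
```

Both fixes should return only the rows for A and D here. which() is the shorter of the two; the explicit !is.na() test makes the intent more visible.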
To see why all of those rows indexed by NAs have row.names like NA.1095, NA.1096, and so on, try this:
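One possible demonstration, again assuming a made-up data frame (small is a hypothetical name):

```r
# Hypothetical three-row data frame
small <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Index the rows with nothing but NAs: each NA yields one all-NA row
na_rows <- small[c(NA, NA, NA), ]

# Row names must be unique, so R labels the first such row "NA" and
# appends a running suffix to the rest
rownames(na_rows)
# "NA" "NA.1" "NA.2"
```

With well over a thousand NA values in the index, the suffixes would run into the thousands, which is consistent with labels such as NA.1095.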