I have started using data.table. It is indeed very fast and has quite nice syntax, but I am having trouble with dates. I like to use lubridate. Many of my data sets contain dates, or dates and times, and I have used lubridate to manipulate them. Lubridate stores the instant as a POSIX class. I have seen answers here that create new variables just to get, for instance, the year (e.g. 2005). I do not like that. Sometimes I will be analyzing by year, other times by quarter, by month, or by durations. I would like to do something simple such as this:
mydatatable[,length(medical.record.number),by=year(date.of.service)]
That should give me the number of patient encounters in a given year, but the by argument is not working:
Error in names(byval) = as.character(bysuborig) :
'names' attribute [2] must be the same length as the vector [1]
Can you please point me to vignettes where data.table is used with dates, and where manipulations and categorizations of those dates are done on the fly?
This uses one of the examples in the help(IDateTime) page. It shows that you can change the syntax for the by= argument to a character value in the form " = ", or (after @Matthew Dowle's comment below) you can try the functional form that you were using, although I have not been able to get it to work myself. I did get the preferred form, by=list(wday=wday(idate)), to work. Note that the key creation assumes an IDateTime class, since there is no idate or itime variable. Those are attributes of the class.
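Applied to the question's own column names, the by=list(name=expression) form might look like the sketch below. The toy data is made up for illustration, and the column is assumed to be POSIXct. year() and quarter() here are the data.table helpers documented on the IDateTime help page. .N is used in place of length(medical.record.number) to count rows per group.

```r
library(data.table)

# Hypothetical toy data: column names mirror the question, values are invented.
mydatatable <- data.table(
  medical.record.number = c(101L, 102L, 103L, 104L, 105L),
  date.of.service = as.POSIXct(
    c("2005-01-15", "2005-07-04", "2006-02-20", "2006-03-01", "2006-11-30"),
    tz = "UTC"
  )
)

# Encounters per year, grouped on the fly without adding a year column.
# .N is the number of rows in each group.
per_year <- mydatatable[, .N, by = list(yr = year(date.of.service))]

# The same pattern with two grouping expressions: year and quarter.
per_quarter <- mydatatable[, .N, by = list(yr  = year(date.of.service),
                                           qtr = quarter(date.of.service))]

print(per_year)
print(per_quarter)
```

Naming each grouping expression inside list() gives the result column an explicit name, and the same call can be switched to month() or wday() when a different granularity is needed.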