I have a medium-sized data set; here is an example taken from it:
2011.2012
9/7
11/5
12/15
1/5
2/5
I’d like to convert this data into a time series format.
After converting the values from factors to characters, I used the as.Date function, but I ran into a problem.
The results assume the missing year is the current year. My goal is to assign the dates before 1/1 to 2011 and those on or after 1/1 to 2012. The data range from September 2011 to April 2012.
I've tried using origin and start, but to no avail. Here is my code:
date1 <- as.character(2011.2012)
date1 <- as.Date(date1, format="%m/%d")
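The behaviour described above can be reproduced with a small sketch. The vector `x` below is an assumed stand-in for the 2011.2012 column, built from the example values and stored as a factor; it is not the asker's actual object. When the format string has no year, `as.Date` fills in the current one.

```r
# Example values from the question, stored as a factor (assumed input type).
x <- factor(c("9/7", "11/5", "12/15", "1/5", "2/5"))

# Convert factor -> character -> Date; the format has no year component.
d <- as.Date(as.character(x), format = "%m/%d")

# Each parsed date carries the current year, not 2011 or 2012.
all(format(d, "%Y") == format(Sys.Date(), "%Y"))
```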
Here is what I came up with. I do not know whether this code will always work, but it seems to work with the example data set I used. The code seems to handle more than two years and any day of the year.
The code cannot handle a year for which there are no data, but if a year is missing from the data set then such a gap probably could not be identified regardless.
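As an illustration of the general idea, here is a minimal sketch, not necessarily the code referred to above. It assumes the dates are listed in chronological order and that the first year is known. The names `assign_years` and `start_year` are hypothetical. A drop in the month number between consecutive rows is treated as the start of a new year.

```r
# Hypothetical helper: attach years to "m/d" strings given in chronological order.
assign_years <- function(md, start_year) {
  parts <- strsplit(md, "/", fixed = TRUE)
  month <- as.integer(vapply(parts, `[`, character(1), 1))
  day   <- as.integer(vapply(parts, `[`, character(1), 2))
  # A decrease in month between consecutive rows marks a new year.
  rollover <- c(FALSE, diff(month) < 0)
  year <- start_year + cumsum(rollover)
  as.Date(sprintf("%d-%02d-%02d", year, month, day))
}

md <- c("9/7", "11/5", "12/15", "1/5", "2/5")
dates <- assign_years(md, 2011L)
format(dates)
# "2011-09-07" "2011-11-05" "2011-12-15" "2012-01-05" "2012-02-05"
```

Because the year counter is a running sum of rollovers, the same sketch covers data spanning more than two years, as long as every year contributes at least one month-number decrease.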
Note also that this approach will fail with the following two dates: “1/30” and “3/1”, if the two dates are from two consecutive years. That is because there is such a long gap between the two dates that there is no way for the computer to realize the two dates do not come from the same year.
In other words, if there are very long gaps between two consecutive dates, any approach is likely to fail without additional information. If there is, for example, at least one date from every quarter or half year, then I think both posted answers will work because the computer will be able to identify a decrease in consecutive months as indicating a new year.
Maybe both approaches will work if the longest gap between two consecutive dates is at most 11 months. Maybe a gap of 363 days would be okay if the code were modified to also check the day of the month for each of two consecutive dates.
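The day-of-month idea can be sketched as follows, under the same assumptions as before (chronological order, known first year). The function name `assign_years_md` and the combined key `month * 100 + day` are illustrative choices, not taken from either posted answer. A new year is inferred whenever the month/day key decreases, so a gap of just under a year within the same month is detected, while the “1/30” then “3/1” case still cannot be resolved.

```r
# Hypothetical variant: compare month AND day between consecutive rows.
assign_years_md <- function(md, start_year) {
  parts <- strsplit(md, "/", fixed = TRUE)
  month <- as.integer(vapply(parts, `[`, character(1), 1))
  day   <- as.integer(vapply(parts, `[`, character(1), 2))
  # Single sortable key per date, e.g. 3/5 -> 305.
  key <- month * 100L + day
  # A decrease in the key between consecutive rows marks a new year.
  rollover <- c(FALSE, diff(key) < 0)
  year <- start_year + cumsum(rollover)
  as.Date(sprintf("%d-%02d-%02d", year, month, day))
}

# Same month, earlier day: a rollover is detected.
near_year <- format(assign_years_md(c("3/5", "3/1"), 2011L))
# "2011-03-05" "2012-03-01"

# Later month/day: no rollover is detected, even if 3/1 is really from 2012.
ambiguous <- format(assign_years_md(c("1/30", "3/1"), 2011L))
# "2011-01-30" "2011-03-01"
```

Two identical consecutive dates remain ambiguous in this sketch; it treats them as the same year.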