I have two data.frames in R, each indexed by date. One is coarser than the other and I would like to compare the data only along the coarser timescale.
To be more concrete let’s say one data.frame has time points DF1[a,b,c,...,x,y,z] and the other only has DF2[f,p,t], where p=="July 19, 1917". I wish to compare DF1[f,p,t] to DF2[f,p,t].
This isn’t real syntax, but in pseudocode I want: for each $i { DF_combined <- df1[$i] . df2[$i] if exists(df1[$i]); }. In other words, make a new data.frame that contains only the observation days the two share.
I hope the question is clear. I’ve been looking at other SO answers for a couple of hours and haven’t found one that covers what I’m trying to do yet. Thanks in advance.
Here’s the solution to my problem, from start to finish.
Problem: Given records from my broker (not evenly spaced in time), put the time series of my net worth next to a time series of the S&P, for comparison in R.
Answer:
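As an illustration with made-up numbers, here is a minimal sketch of how a small `xts` series prints. It assumes the `xts` package is installed; the object `toy` and its values are hypothetical stand-ins for the real index data, not the original download.

```r
library(xts)  # attaching xts also attaches zoo

# Made-up closing values standing in for the real index series (assumption)
toy <- xts(c(1080, 1067, 1063),
           order.by = as.Date(c("2009-10-23", "2009-10-26", "2009-10-27")))
colnames(toy) <- "Close"

print(toy)   # the dates show up as the row index, with no column name above them
class(toy)   # "xts" "zoo"
```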
Notice that there is no header over the dates. That’s because time-series data types embed the time value as an ordering index. (`class(GSPC)` gives `[1] "xts" "zoo"`, where `zoo` is a data type totally ordered by an index, and `xts` is a time-series data type that tolerates more than the restrictive native `ts` data type does.)

In the date format there is a big difference between `%Y` (1987) and `%y` (’87), as well as between `%m` (months) and `%M` (minutes). My broker wrote 10/23/2009.

So did I do it right?
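One way to check a format string against dates written like the broker’s is sketched below in base R. The vector `raw` is made up for illustration, and the names `parsed` and `wrong` are hypothetical.

```r
# Made-up date strings in the broker's month/day/four-digit-year style
raw <- c("10/23/2009", "1/5/2010")

# %m = month, %d = day, %Y = four-digit year
parsed <- as.Date(raw, format = "%m/%d/%Y")
format(parsed)   # "2009-10-23" "2010-01-05"

# With %y (two-digit year) only the leading "20" is read as the year,
# so the same string silently lands in 2020 instead of 2009.
wrong <- as.Date("10/23/2009", format = "%m/%d/%y")
format(wrong, "%Y")
```

Printing a few parsed values next to the raw strings is usually enough to catch a `%y`/`%Y` or `%m`/`%M` mix-up, because the parse rarely fails outright.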
Yessss.
Finally, @Joshua Ulrich’s advice does the kind of merge I want:
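A minimal sketch of such a merge, assuming the `xts` package and two made-up series. Here `daily` and `mine` are illustrative names and values, not the original objects.

```r
library(xts)  # attaching xts also attaches zoo, which provides index() and coredata()

# Finer series: one made-up value per weekday (assumption)
daily <- xts(c(100, 101, 102, 103, 104),
             order.by = as.Date("2009-10-19") + 0:4)

# Coarser series: a few irregular made-up observations; the last date
# (a Saturday) does not occur in `daily`
mine <- xts(c(10, 12, 13),
            order.by = as.Date(c("2009-10-20", "2009-10-23", "2009-10-24")))

# join = "right" keeps exactly the dates of the second (right-hand) argument
both <- merge(daily, mine, join = "right")
both
```

A date that exists only in the coarser series stays in the result, with `NA` in the other column. Passing `join = "inner"` instead would keep only the dates present in both.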
The right join compares the dates only at the coarser scale (my data is always coarser than Yahoo’s).

Last of all, to plot the results:
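A rough sketch of one way to draw the merged series, again with made-up data and hypothetical names (`daily`, `mine`, `both`, and the column labels are assumptions). It uses `plot.zoo()` from the `zoo` package, which draws one panel per column by default, so the two series keep their own y-axis scales.

```r
library(xts)  # attaching xts also attaches zoo, which provides plot.zoo()

# Made-up stand-ins for the index series and the broker records (assumption)
daily <- xts(c(100, 101, 102, 103, 104),
             order.by = as.Date("2009-10-19") + 0:4)
mine  <- xts(c(10, 12, 13),
             order.by = as.Date(c("2009-10-19", "2009-10-21", "2009-10-23")))

both <- merge(daily, mine, join = "right")
colnames(both) <- c("index", "net_worth")

# One stacked panel per column, sharing the date axis
plot.zoo(both, main = "Net worth vs. index", xlab = "Date")
```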
Many thanks to all the people who wrote all this open source software — and especially to those who wrote vignettes!