I am new to R and unsure how to filter data in a data frame.
I have created a data frame with two columns: a monthly date and the corresponding temperature. It has 324 rows.
> head(Nino3.4_1974_2000)
Month_common Nino3.4_degree_1974_2000_plain
1 1974-01-15 -1.93025
2 1974-02-15 -1.73535
3 1974-03-15 -1.20040
4 1974-04-15 -1.00390
5 1974-05-15 -0.62550
6 1974-06-15 -0.36915
The filter rule is to select temperatures that are greater than or equal to 0.5 degrees, and the condition has to hold for at least 5 consecutive months.
I have eliminated the rows with temperatures below 0.5 degrees (see below).
# Keep only the rows at or above the 0.5 degree threshold (no loop is needed)
el_nino <- Nino3.4_1974_2000[which(Nino3.4_1974_2000$Nino3.4_degree_1974_2000_plain >= 0.5), ]
> head(el_nino)
Month_common Nino3.4_degree_1974_2000_plain
32 1976-08-15 0.5192000
33 1976-09-15 0.8740000
34 1976-10-15 0.8864501
35 1976-11-15 0.8229501
36 1976-12-15 0.7336500
37 1977-01-15 0.9276500
However, I still need to extract the stretches that last at least 5 consecutive months. I hope someone can help me out.
If you can always rely on the spacing being one month, then let’s temporarily discard the time information:
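A minimal sketch of that step, assuming the data frame and column names from the question. The six rows built here are the ones shown in the question's head() output, included only so the snippet runs on its own:

```r
# Stand-in for the question's data frame: the six rows shown by head()
Nino3.4_1974_2000 <- data.frame(
  Month_common = seq(as.Date("1974-01-15"), by = "month", length.out = 6),
  Nino3.4_degree_1974_2000_plain = c(-1.93025, -1.73535, -1.20040,
                                     -1.00390, -0.62550, -0.36915)
)

# Keep only the temperatures; position i in temps still corresponds to row i
temps <- Nino3.4_1974_2000$Nino3.4_degree_1974_2000_plain
```

Because the vector keeps the row order, any index found in temps can later be used to look up the matching date.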
So, since every temperature in that vector is always separated by one month, we just have to look for runs where temps[i] >= 0.5, and the run has to be at least 5 long.
If we do the following:
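One way to write this step, assuming temps is a numeric vector of temperatures. The five values here are hypothetical, chosen only to show the shape of the result:

```r
# Hypothetical example values; in practice temps is the temperature column
temps <- c(0.6, 0.2, 0.1, 0.7, 0.9)

# Logical vector: TRUE where the temperature meets the 0.5 threshold
ofinterest <- temps >= 0.5
print(ofinterest)
```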
we’ll have a vector ofinterest with values TRUE FALSE FALSE TRUE TRUE ... etc., where it’s TRUE when temps[i] was >= 0.5 and FALSE otherwise.
To rephrase your problem, then, we just need to look for occurrences of at least five TRUE in a row.
To do this we can use the function rle. ?rle describes it as computing "the lengths and values of runs of equal values in a vector".
So we use rle, which counts up all the streaks of consecutive TRUE in a row and consecutive FALSE in a row, and look for at least 5 TRUE in a row.
I’ll just make up some data to demonstrate:
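A small sketch of the idea on made-up temperatures. The names runStarts and keep are introduced here for illustration; startMonths holds the index at which each qualifying run begins:

```r
# Made-up temperatures: a 2-month warm spell, then a 6-month one
temps <- c(0.1, 0.6, 0.7, 0.2, 0.3, 0.5, 0.8, 0.9, 1.1, 0.7, 0.6, 0.4)
ofinterest <- temps >= 0.5

# rle() returns the length and the value of each run of equal values
runs <- rle(ofinterest)
# runs$values:  FALSE TRUE FALSE TRUE FALSE
# runs$lengths: 1 2 2 6 1

# Index at which each run starts: 1 plus the total length of all earlier runs
runStarts <- cumsum(c(1, head(runs$lengths, -1)))

# Keep the runs that are TRUE and at least 5 long
keep <- runs$values & runs$lengths >= 5
startMonths <- runStarts[keep]
startMonths
```

Only the second warm spell qualifies, so startMonths contains the single index 6.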
Now if you do Nino3.4_1974_2000$Month_common[startMonths] you’ll get all the months in which the El Nino started.
It boils down to just a few lines:
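A compact version, wrapped in a hypothetical helper called findRunStarts (the function name and its threshold and minLength parameters are assumptions for illustration). The data frame is a made-up stand-in that reuses the question's column names:

```r
# Hypothetical helper: first index of every run of at least minLength
# consecutive values that are >= threshold
findRunStarts <- function(temps, threshold = 0.5, minLength = 5) {
  runs <- rle(temps >= threshold)
  runStarts <- cumsum(c(1, head(runs$lengths, -1)))
  runStarts[runs$values & runs$lengths >= minLength]
}

# Made-up stand-in for the question's data frame
Nino3.4_1974_2000 <- data.frame(
  Month_common = seq(as.Date("1974-01-15"), by = "month", length.out = 12),
  Nino3.4_degree_1974_2000_plain = c(-0.3, 0.6, 0.7, 0.2, 0.1, 0.5,
                                     0.8, 0.9, 1.1, 0.7, 0.6, 0.4)
)

startMonths <- findRunStarts(Nino3.4_1974_2000$Nino3.4_degree_1974_2000_plain)

# Dates on which each qualifying warm spell begins
Nino3.4_1974_2000$Month_common[startMonths]
```

This sketch assumes there are no missing temperatures and that rows are in date order with no skipped months; otherwise consecutive rows would not be consecutive months.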