In R, I have a vector of integers:
run <- sample.int(9, 1000, replace = TRUE)
run[sample.int(1000, 100)] <- NA
If at least one of the following patterns, c(1, x, 1, y) or c(x, 1, y, 1) where x and y are either whole numbers or NA, is present, I would like to print out the start index of each pattern and update a count variable for each instance of a pattern. What is the most efficient way of doing this?
I was thinking of using the rle function and testing for every 4 consecutive values for a length of 1, and then testing whether they conform to one of the patterns. However, I am having problems with NAs with this approach since each NA is treated separately. Perhaps there is a better way to do this.
Taking your usage of sample.int as implying your vector only contains values from 1:9 and NA, here's a regular expressions approach:
Then you would do the second pattern similarly.
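A minimal sketch of what such a regular-expression approach could look like, assuming every non-NA value is a single digit in 1:9. This is not the code from the original answer: the helper name find_starts and the choice of "N" as a stand-in for NA are hypothetical. Each element is encoded as one character so that string positions line up with vector indices, and a zero-width lookahead lets overlapping matches all be reported.

```r
# Hypothetical helper (an illustration, not the original answer's code):
# start indices of every, possibly overlapping, match of a regex pattern
# over a vector whose non-NA values are assumed to be single digits 1:9.
find_starts <- function(x, pattern) {
  # One character per element, so string position == vector index.
  # "N" stands in for NA.
  s <- paste(ifelse(is.na(x), "N", as.character(x)), collapse = "")
  # Wrapping the pattern in a lookahead makes each match zero-width,
  # so matches that overlap each other are all found.
  m <- gregexpr(paste0("(?=", pattern, ")"), s, perl = TRUE)[[1]]
  as.integer(m[m > 0])  # gregexpr returns -1 when nothing matches
}

set.seed(1)  # only to make the example reproducible
run <- sample.int(9, 1000, replace = TRUE)
run[sample.int(1000, 100)] <- NA

starts_a <- find_starts(run, "1.1.")  # pattern c(1, x, 1, y)
starts_b <- find_starts(run, ".1.1")  # pattern c(x, 1, y, 1)

print(starts_a)
print(starts_b)

# Separate counts per pattern; a run such as 1, 1, 1, 1 matches both
# patterns at the same start index, so summing would count it twice.
count_a <- length(starts_a)
count_b <- length(starts_b)
```

Because "." matches any single character, x and y can be any digit or the NA placeholder. If the vector could hold values with more than one digit, this one-character encoding would no longer work, and a plain index comparison such as which(run[i] == 1 & run[i + 2] == 1) over i <- seq_len(length(run) - 3) would be a safer route.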