I have a very long vector of single characters, e.g. somechars <- c("A","B","C","A"...), whose length is somewhere in the millions.
What is the fastest way to count the total occurrences of, say, "A" and "B" in this vector?
I have tried using grep and lapply, but both take too long to execute.
My current solution is:
tmp<-table(somechars)
sum(tmp["A"],tmp["B"])
But this still takes a while to compute. Is there a faster way to do this? Or is there a package that already does this faster? I've looked into the stringr package, but it uses a simple grep.
I thought that this would be fastest…
And, it is faster than…
But not faster than…
But this depends on how many comparisons you make, which brings me back to my first guess. Once you want to sum more than 2 letters, the %in% version is the fastest.
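For anyone who wants to rerun the comparison on their own machine, here is a minimal sketch of how it could be timed. The candidates are my assumption of what is being compared in this thread: an %in% sum, a single | expression, separate == sums, and the table() baseline from the question. The test vector and the function names (in_count, or_count, eq_count, tab_count) are made up for illustration, and timings will vary with R version and hardware.

```r
# Hypothetical test data: one million single-character strings.
set.seed(42)
somechars <- sample(LETTERS, 1e6, replace = TRUE)

# Candidate 1: membership test with %in%, then count the TRUEs.
in_count <- function(x) sum(x %in% c("A", "B"))

# Candidate 2: two equality tests combined with | before summing.
or_count <- function(x) sum(x == "A" | x == "B")

# Candidate 3: two equality tests summed separately.
eq_count <- function(x) sum(x == "A") + sum(x == "B")

# Baseline: the table() approach from the question.
tab_count <- function(x) {
  tmp <- table(x)
  sum(tmp["A"], tmp["B"])
}

# All candidates must agree before timing them is meaningful.
stopifnot(
  in_count(somechars) == or_count(somechars),
  in_count(somechars) == eq_count(somechars),
  in_count(somechars) == tab_count(somechars)
)

# Time 10 repetitions of each candidate (elapsed seconds).
candidates <- list(`in` = in_count, or = or_count, eq = eq_count, table = tab_count)
timings <- sapply(candidates, function(f) {
  system.time(for (i in 1:10) f(somechars))[["elapsed"]]
})
print(timings)
```

To test more than two letters, extend the vector passed to %in% and add the matching == terms to the other candidates, then compare the timings again.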