I’m super-rusty in both R and regular expressions. I tried reading R’s regex help file but it didn’t help at all!
I have a dataframe with 3 columns:
- vocabulary, i.e., a list of the 500 most common words found in a corpus
- count, the number of times the word appeared, and
- probability, the count divided by the total of all word counts
The list is arranged from most to least common, so not in alphabetical order.
I need to pull out the entire row for every word that starts with a given letter. (I don’t need to loop through the whole alphabet; I just need the results for one letter.)
I’m not just asking about regex but how to write it in R so I get the results in a new dataframe.
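You can use `grep` with a pattern anchored at the start of the word (`^`) and subset the dataframe with the matching row indices. That gives you a new dataframe containing only the rows whose word starts with that letter.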
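The answer's original code did not survive, so here is a minimal sketch of the `grep` approach rather than the original snippet. The dataframe `df`, its sample words and counts, the letter "t", and the result name `t_words` are all made up for illustration; only the column names come from the question.

```r
# Hypothetical stand-in for the 500-word dataframe described in the question
df <- data.frame(
  vocabulary = c("the", "and", "to", "that", "apple", "this"),
  count = c(120, 90, 80, 60, 10, 5),
  stringsAsFactors = FALSE
)
df$probability <- df$count / sum(df$count)

# "^t" matches only words whose first character is "t";
# grep() returns the indices of the matching elements
rows <- grep("^t", df$vocabulary)

# Use those indices to keep the full rows, giving a new dataframe
t_words <- df[rows, ]
print(t_words)
```

If some words might be capitalized, `grep("^t", df$vocabulary, ignore.case = TRUE)` should match both cases. The original ordering (most to least common) is preserved because the rows are taken in index order.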