I have an array of keywords, which can have a variable length. For this example imagine there are 50:
keywords = ['dog','cat','monkey'...'bird']
I have an array of sentences (again of a variable length) I want to loop through, searching for each of the keywords.
sentences = [ ['My dog ate cat food'], ['I went to the store.'], ... ]
If the sentence contains any of the keywords, then I’m moving it to a new “matched” array. So in Ruby, my code looks something like this:
sentences.each do |sentence|
  keywords.each do |keyword|
    if sentence.match(/\b#{keyword}\b/i)
      matched << sentence
    end
  end
end
This takes quite a while and seems really inefficient–especially if I have a large keyword list and a large sentence list. I’m the first to admit my Ruby development isn’t that great yet–is there an easier, more efficient way to do this?
I’m using MongoDB to store the keywords and sentences. If there is a better method using the database, I’d love to explore it.
I’ve not used MongoDB before, but you can optimize your Ruby code a bit. Since you only care whether any keyword matches the sentence, I would push the logic into the Ruby regexp engine:
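A minimal sketch of that approach (not necessarily the exact code originally posted), assuming each sentence is a plain string rather than a one-element array, and using `Regexp.union` to build the alternation. The sample data and the `pattern` variable name are illustrative only:

```ruby
# Illustrative sample data; the real lists would come from your own storage.
keywords  = ['dog', 'cat', 'monkey', 'bird']
sentences = ['My dog ate cat food', 'I went to the store.', 'A blackbird sang.']

# Regexp.union escapes each keyword and joins them with "|".
# Wrapping the result in \b(?:...)\b keeps the whole-word, case-insensitive
# behaviour of the original per-keyword regexp.
pattern = /\b(?:#{Regexp.union(keywords).source})\b/i

# One pass over the sentences; each sentence is tested against a single regexp.
matched = sentences.select { |sentence| sentence.match?(pattern) }
```

Because `select` keeps or drops each sentence once, a sentence that contains several keywords appears in `matched` only once, whereas the nested loop appends it once per matching keyword.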
That builds a single regexp that combines all of your keywords, so you loop over the sentences only once instead of once per keyword.