For a project I’m working on, I display tweets I receive from the Twitter Streaming API. Before displaying a tweet, I need to check each word against a list of blacklisted words.
Currently, I have all the blacklisted words in a MongoDB collection.
The obvious approach that comes to mind is to explode the tweet into words and then, for each word, check whether the blacklist collection contains it.
However, this would mean roughly 20 database calls for every tweet I show.
Is there a better way to go about this?
I’d fetch all the blacklisted words from the database once, store them in a variable as a single string (separated with |), and use preg_match() to see if any of them appear in the tweet.
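Something along these lines might work as a starting point. It is only a sketch: it assumes the blacklisted words have already been read from the MongoDB collection into a plain PHP array (that query isn't shown), and the function names buildBlacklistPattern() and isBlacklisted() are made up for illustration. Each word goes through preg_quote() so that characters like + or . in a word can't break the pattern, and the lookarounds keep a short word from matching inside a longer one.

```php
<?php
// Hypothetical helper: builds one case-insensitive pattern from the blacklist.
// $words is assumed to be an array of strings already loaded from the
// MongoDB collection (the query itself is not shown here).
function buildBlacklistPattern(array $words): ?string
{
    // Drop empty entries; an empty alternative would match every tweet.
    $words = array_filter(array_map('trim', $words), 'strlen');
    if (!$words) {
        return null;
    }
    // Escape regex metacharacters and the "/" delimiter in each word.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    // (?<!\w) and (?!\w) stop "scam" from matching inside "scampi".
    // "i" ignores case, "u" treats pattern and subject as UTF-8.
    return '/(?<!\w)(?:' . implode('|', $escaped) . ')(?!\w)/iu';
}

// Hypothetical helper: true if the tweet contains any blacklisted word.
function isBlacklisted(string $tweet, ?string $pattern): bool
{
    return $pattern !== null && preg_match($pattern, $tweet) === 1;
}

// Example words standing in for the real collection contents.
$pattern = buildBlacklistPattern(['spam', 'scam', 'c++']);

var_dump(isBlacklisted('This is not a SCAM, honest', $pattern)); // bool(true)
var_dump(isBlacklisted('Scampi for dinner', $pattern));          // bool(false)
```

Build the pattern once when the stream consumer starts (and again whenever the blacklist changes) and reuse it for every tweet, so the database is hit once rather than once per word.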