I need to save all tweets from the Twitter Streaming API to a database in real time, filtering them by a certain list of words, of course. I’ve achieved this using tweetstream, defining the list words like this before calling FilterStream():
words = ["word1","two words","anotherWord"]
What I’d like to do is to be able to add, change, or remove any of those values without stopping the script. To do so, I created a plain text file containing the words I want to filter by, one per line. Using this code I get the list words just fine:
with open('words.txt') as f:
    words = f.read().splitlines()
Those lines work when the script starts, but I need the list to be reloaded every time the script checks the stream. Any ideas?
You could read an updated word list in one thread and process tweets in another one, using a Queue for communication. Example:
Thread that reads tweets:
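A minimal sketch of such a reader in Python 3, where the Queue class lives in the queue module. The names read_tweets, open_stream and handle_tweet are made up for illustration: open_stream(words) stands for whatever wraps the FilterStream() call from the question and returns an iterable of tweets, and handle_tweet(tweet) stands for the database insert. Using None as a "please stop" message is also an assumption of this sketch.

```python
import queue


def read_tweets(word_queue, open_stream, handle_tweet):
    """Consume tweets, reconnecting whenever a new word list arrives.

    open_stream(words) is a hypothetical factory returning an iterable of
    tweets filtered by `words`; handle_tweet(tweet) stores one tweet.
    A word list of None (assumed convention) tells the reader to stop.
    """
    words = word_queue.get()  # block until the first word list is available
    while words is not None:
        # If the stream ends without a new word list, the while loop simply
        # opens a new connection with the same words.
        for tweet in open_stream(words):  # NOTE: blocks until a tweet arrives
            handle_tweet(tweet)
            try:
                words = word_queue.get_nowait()
            except queue.Empty:
                continue  # no update: keep using the current connection
            break  # word list changed (or stop requested): leave this stream
```

Because the check happens only after a tweet has been handled, a changed list takes effect on the next matching tweet, not immediately. That is usually acceptable for a busy stream; a quiet one would need a timeout on the connection instead.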
Thread that reads words:
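One way this could look, as a sketch rather than a definitive implementation: poll the file at a fixed interval and put the list on the queue only when it differs from the last one sent. The names load_words and watch_words, the stop_event parameter (a threading.Event) and the one-second default interval are all assumptions made for this example.

```python
def load_words(path):
    """Return the non-empty lines of `path`, stripped of whitespace."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def watch_words(path, word_queue, stop_event, interval=1.0):
    """Poll `path` and publish the word list whenever it changes.

    stop_event is a threading.Event; setting it ends the loop.
    """
    current = None
    while not stop_event.is_set():
        try:
            latest = load_words(path)
        except OSError:
            latest = current  # file unreadable right now: keep the old list
        if latest != current:
            word_queue.put(latest)
            current = latest
        stop_event.wait(interval)  # sleep, but wake up early when stopped
```

Skipping blank lines avoids the empty string that split("\n") produces when the file ends with a newline. If the file is rewritten in place, a poll can catch it half-written, so replacing it atomically (write a temporary file, then rename) is the safer way to edit it.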
The main script could look like:
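A hedged sketch of the wiring, written so that it stands on its own: the two thread functions are passed in as arguments. The names run, reader and watcher are hypothetical; reader(word_queue) is expected to consume word lists until it receives None, and watcher(word_queue, stop_event) is expected to publish them until the event is set. Both conventions are assumptions of this example.

```python
import queue
import threading


def run(reader, watcher):
    """Run `reader` in a background thread and `watcher` in the main thread."""
    word_queue = queue.Queue()
    stop_event = threading.Event()

    # Daemon thread: a reader blocked on a quiet stream will not keep the
    # process alive after the main thread exits.
    worker = threading.Thread(target=reader, args=(word_queue,), daemon=True)
    worker.start()

    try:
        watcher(word_queue, stop_event)  # returns only when stopped
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the watcher; fall through to shutdown
    finally:
        stop_event.set()
        word_queue.put(None)  # assumed sentinel asking the reader to finish

    worker.join(timeout=5)  # give the reader a moment to exit cleanly
    return worker
```

Extra arguments, such as the path to 'words.txt' or the database handle, can be bound with functools.partial before the functions are passed in.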
Instead of polling, you could use inotify or similar to monitor changes to the 'words.txt' file.