I’m using TweetStream (https://github.com/joshmarshall/TweetStream), a Tornado-based Twitter streaming module, to monitor the streaming API.
I would like to know how I can restart the fetch process when I want to change the tracked words.
My current attempt (not really a solution) produces errors:
stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)
stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)

def check_words():
    global words
    with open('words.txt') as file:
        newwords = file.read()
        if words != newwords:
            words = newwords
            try:
                print newwords
                stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)
            except:
                pass
    file.close()

interval_ms = 1000*10
scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
scheduler.start()
main_io_loop.start()
Here is the error I’m getting:
ERROR:root:Uncaught exception, closing connection.
Traceback (most recent call last):
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper
    callback(*args)
  File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until
    self._set_read_callback(callback)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback
    assert not self._read_callback, "Already reading"
AssertionError: Already reading
ERROR:root:Exception in callback <tornado.stack_context._StackContextWrapper object at 0x2415cb0>
Traceback (most recent call last):
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/ioloop.py", line 421, in _run_callback
    callback()
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper
    callback(*args)
  File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until
    self._set_read_callback(callback)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback
    assert not self._read_callback, "Already reading"
AssertionError: Already reading
I achieved better results (though not the best) by creating a new stream and starting the IOLoop again inside check_words:
stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)
stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)

def check_words():
    global words, stream
    with open('words.txt') as file:
        newwords = file.read()
        if words != newwords:
            words = newwords
            print newwords
            try:
                stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)
                stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)
                interval_ms = 1000*10
                scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
                scheduler.start()
                main_io_loop.start()
            except:
                pass
    file.close()

interval_ms = 1000*10
scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
scheduler.start()
main_io_loop.start()
As a Twitter employee said here, the recommended approach is what I am already doing, only less aggressively: reconnect once in a while, and only if your query terms have changed. Otherwise, just keep the connection open. It is also important to monitor any errors Twitter sends you, or you might get banned.
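The "reconnect only when the terms changed, and not too often" rule can be sketched roughly as below. This is only an illustration: normalize_words and ReconnectPolicy are hypothetical helpers, not part of TweetStream or Tornado, the 300-second min_interval is an arbitrary placeholder rather than a figure from Twitter, and the code assumes words.txt separates terms with commas and/or newlines.

```python
import time


def normalize_words(raw):
    """Return a canonical comma-separated track string.

    Assumption: terms in the file are separated by commas and/or newlines.
    Normalizing first means a trailing newline or a reordered list does
    not look like a change.
    """
    parts = [p.strip() for p in raw.replace("\n", ",").split(",")]
    return ",".join(sorted(set(p for p in parts if p)))


class ReconnectPolicy(object):
    """Hypothetical helper: decide whether a new stream should be opened."""

    def __init__(self, words, min_interval=300, clock=time.time):
        self.words = normalize_words(words)
        self.min_interval = min_interval  # seconds; placeholder value
        self.clock = clock                # injectable so it can be tested
        self.last_connect = clock()

    def should_reconnect(self, raw_words):
        new_words = normalize_words(raw_words)
        if new_words == self.words:
            return False  # terms unchanged: keep the connection open
        if self.clock() - self.last_connect < self.min_interval:
            return False  # terms changed, but the last connect was too recent
        # Accept the new terms and remember when we decided to reconnect.
        self.words = new_words
        self.last_connect = self.clock()
        return True
```

Inside check_words, one would call policy.should_reconnect(file.read()) and only when it returns True create a new TweetStream and call fetch with the track query built from policy.words. A change that arrives too early is not lost: the policy keeps the old terms, so the next periodic check sees the difference again.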