I’m using the twitter4j library to access the public twitter stream. I’m trying to make a project involving geotagged tweets, and I need to collect a large number of them for testing.
Right now I am getting the unfiltered stream from twitter and only saving tweets with geotags. This is slow, though, because the VAST majority of tweets don’t have geotags. I want the twitter stream to send me only tweets with geotags.
I have tried using the method mentioned in this question, where you filter with a bounding box of size 360° by 180°, but that’s not working for me. I’m not getting any errors when using that filter, but 99% of the tweets I receive still have no geotags. Here is how I’m doing it:
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setDebugEnabled(true)
.setOAuthConsumerKey("censored")
.setOAuthConsumerSecret("censored")
.setOAuthAccessToken("censored")
.setOAuthAccessTokenSecret("censored");
TwitterStream twitterStream = new TwitterStreamFactory(cb.build()).getInstance();
StatusListener listener = new MyStatusListener();
twitterStream.addListener(listener);
//add location filter for what I hope is the whole planet. Just trying to limit
//results to only things that are geotagged
FilterQuery locationFilter = new FilterQuery();
double[][] locations = {{-180.0d,-90.0d},{180.0d,90.0d}};
locationFilter.locations(locations);
twitterStream.filter(locationFilter);
twitterStream.sample();
Any suggestions about why I’m still getting tweets with no geotags?
Edit: I just reread the twitter4j javadoc on adding filters to a twitter stream, and it says “The default access level allows up to 200 track keywords, 400 follow userids and 10 1-degree location boxes.” So bounding boxes may only be 1 degree wide? That’s different from the original information I came across. Is that my problem? My filter request is too big so it’s being ignored? I’m not getting any errors when trying to use it.
You are starting the filter stream and then immediately overwriting it with the sample stream, so you end up reading the unfiltered sample again.
Remove the last line:
twitterStream.sample();
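If you also want to sanity-check the bounding box itself before handing it to the filter, something like the following might help. This is only a minimal sketch in plain Java (no twitter4j calls); the class name `BoundingBoxCheck` and its methods are made up for illustration, and it assumes the same layout as the array in the question: `{{swLongitude, swLatitude}, {neLongitude, neLatitude}}`.

```java
// Hypothetical helper, not part of twitter4j. Assumes a box is laid out as
// {{swLongitude, swLatitude}, {neLongitude, neLatitude}}, as in the question.
final class BoundingBoxCheck {

    private BoundingBoxCheck() {
    }

    /** True if the box has two corners, stays within valid coordinates, and lists the south-west corner first. */
    static boolean isValid(double[][] box) {
        if (box == null || box.length != 2) {
            return false;
        }
        if (box[0] == null || box[1] == null || box[0].length != 2 || box[1].length != 2) {
            return false;
        }
        double swLon = box[0][0];
        double swLat = box[0][1];
        double neLon = box[1][0];
        double neLat = box[1][1];
        boolean inRange = swLon >= -180.0 && neLon <= 180.0 && swLat >= -90.0 && neLat <= 90.0;
        boolean ordered = swLon < neLon && swLat < neLat;
        return inRange && ordered;
    }

    /** Width of the box in degrees of longitude. */
    static double widthDegrees(double[][] box) {
        return box[1][0] - box[0][0];
    }

    /** Height of the box in degrees of latitude. */
    static double heightDegrees(double[][] box) {
        return box[1][1] - box[0][1];
    }

    /** True if the point (longitude, latitude) lies inside the box or on its edge. */
    static boolean contains(double[][] box, double lon, double lat) {
        return lon >= box[0][0] && lon <= box[1][0]
                && lat >= box[0][1] && lat <= box[1][1];
    }
}
```

For the box in the question, `widthDegrees` returns 360.0 and `heightDegrees` returns 180.0, and `isValid` returns false if the two corners are accidentally swapped. The `contains` check could also be used in your listener as a client-side safety net, given coordinates you have already extracted from a tweet.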