I’m looking for a large dataset of tweets that have geolocation data (from the U.S.).
Is there such a dataset available anywhere? I looked on infochimps, but didn’t see anything.
If not, what’s the best way to generate this dataset myself? Should I just run the Twitter Streaming API on my local machine (or maybe on AWS?), and then filter and save all geo-tagged tweets?
The Streaming API would probably be your best bet. Just use the location filter (the locations parameter) to set the geographic area you want to capture data from.
Here’s a slightly related question: Requesting just geotagged statuses from the Twitter API
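
For what it's worth, here is a rough Python sketch of the "filter and save" step. It assumes tweets arrive as one JSON object per line and that exact locations are in the `coordinates` field as a GeoJSON point in [longitude, latitude] order. The bounding box values and all function names are made up for illustration, and the network connection is left out since it depends on your client library. The second check is there because, as far as I know, the location filter can also match on a tweet's place, so some matched tweets carry no exact coordinates.

```python
import json

# Rough bounding box for the contiguous U.S. (an assumption; adjust as needed).
# Order: west longitude, south latitude, east longitude, north latitude.
US_BBOX = (-125.0, 24.0, -66.0, 50.0)


def format_locations(bbox):
    """Build a 'west,south,east,north' string for the locations parameter."""
    return ",".join(str(v) for v in bbox)


def in_bbox(tweet, bbox):
    """Return True if the tweet has a Point coordinate inside bbox."""
    coords = tweet.get("coordinates")
    if not coords or coords.get("type") != "Point":
        return False
    lon, lat = coords["coordinates"]  # GeoJSON order is [lon, lat]
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def keep_geotagged(lines, bbox=US_BBOX):
    """Yield parsed tweets from an iterable of JSON lines that fall inside bbox."""
    for line in lines:
        line = line.strip()
        if not line:  # skip blank keep-alive lines
            continue
        tweet = json.loads(line)
        if in_bbox(tweet, bbox):
            yield tweet
```

You would feed `keep_geotagged` the lines coming off the stream and append each result to a file, one JSON object per line, so the collection can be restarted without losing what you already have.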