I have an app where I pull in tweets with a certain hashtag. When I find the hashtag, the app automatically creates a user if they don’t exist. When the user logs in via Twitter, I want to be able to present them with their friends who are also using the app. The problem is that for Twitter users with a ton of friends, each response returns at most 100, so I’d have to hit the API 10 times to get the friends of someone with 1000 friends.
Also, when pulling the friends info, should I just cache the friends in an array and move to a matched array so I don’t have to hit the API again?
Given that most Twitter apps have a per-hour limit on API calls, you really should cache pretty much everything. Before pulling down any information, check the cache to see whether you already have the data.
If you are worried about how up-to-date the data is, put a timestamp in the cache. When you read something from the cache, check whether its age exceeds some defined limit (depending on how fresh your data needs to be and how many requests you can afford to send to the server); if it does, go and refresh the data.
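As a rough illustration of the timestamp idea, here is a minimal Python sketch. The class name `TimestampedCache`, the `max_age_seconds` parameter, and the `fetch` callback are all made up for this example; `fetch` stands in for whatever code actually calls the Twitter API, which is not shown here.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TimestampedCache:
    """In-memory cache that refetches an entry once it is older than max_age_seconds.

    Hypothetical helper for illustration; not part of any Twitter library.
    """

    def __init__(self, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, fetch: Callable[[Hashable], Any]) -> Any:
        """Return the cached value for key, calling fetch(key) only if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            stored_at, value = entry
            if now - stored_at <= self._max_age:
                return value  # still fresh: no API call needed
        # Missing or too old: spend one API call and record when we did it.
        value = fetch(key)
        self._entries[key] = (now, value)
        return value
```

You would call something like `cache.get(user_id, fetch_friend_ids)`, where `fetch_friend_ids` is your own function that pages through the API. A longer `max_age_seconds` saves more calls at the cost of staler friend lists, so the right value depends on your rate limit and how fresh the data needs to be.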
This is a little like writing a good web crawler (which Jeff Atwood seems to suggest has only been done by Google). It is easy to write something that will attempt to pull down everything from the internet at once but it is more difficult to write something that will do it in a sustainable, manageable way.
Twitter have been sensible in forcing people to think through these issues by placing a “per-hour access count” on their API.