I’m looking for a way to parse a URL that returns results in the Atom format, for example: http://search.twitter.com/search.atom?q=Stackoverflow&:)&since:2011-05-24&rpp=100&page=1
So far, I have tried fetching it with file_get_contents() and saving the result to a text file, but only about 21 KB is written per run (each time I re-run the script, it appends another 21 KB to the end of the existing file).
I need to be able to find the number of times the string <published> occurs in the document (in order to find how many tweets are published on the page). Is there a function I can use either to search and count in the content of the URL directly, or to save the content of the URL (the entirety of it, around 120 KB) to a local file and then search and count in that file?
All I can think of here is using SimpleXML to parse it, using XPath to find just the published tags, and then counting the results of that XPath query. This is probably the way I’d do it, but then again you could always use preg_match_all, which returns the number of times your regex matches in the string (preg_match itself stops at the first match and only returns 1 or 0).
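Here is a rough sketch of both ideas, in case it helps. The inline $atom string is a made-up sample standing in for the downloaded feed, and the function names countPublishedXpath and countPublishedRegex are just illustrative. It assumes the feed uses the standard Atom namespace (http://www.w3.org/2005/Atom), which has to be registered under a prefix before XPath will match anything. For the live feed you would presumably pass the result of file_get_contents($url) instead of the sample string.

```php
<?php
// Hypothetical sample feed standing in for the downloaded Atom document.
$atom = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sample</title>
  <entry><id>1</id><published>2011-05-24T10:00:00Z</published></entry>
  <entry><id>2</id><published>2011-05-24T11:00:00Z</published></entry>
  <entry><id>3</id><published>2011-05-24T12:00:00Z</published></entry>
</feed>
XML;

// Count <published> elements by parsing the XML and querying it with XPath.
function countPublishedXpath(string $atom): int
{
    $feed = simplexml_load_string($atom);
    if ($feed === false) {
        throw new RuntimeException('Could not parse the feed as XML');
    }
    // Atom elements live in a default namespace, so bind it to a prefix first.
    $feed->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    $nodes = $feed->xpath('//atom:published');
    return is_array($nodes) ? count($nodes) : 0;
}

// Count occurrences of the literal opening tag with a regex.
function countPublishedRegex(string $atom): int
{
    return (int) preg_match_all('/<published>/', $atom);
}

echo countPublishedXpath($atom), "\n";          // 3
echo countPublishedRegex($atom), "\n";          // 3
// For a fixed string like this, substr_count does the same job without a regex.
echo substr_count($atom, '<published>'), "\n";  // 3
```

The XPath version only counts real elements, so it will not be fooled if the text "<published>" ever appears escaped inside a tweet body. The string-based versions are simpler but count raw text.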