Suppose I want to write a program to read movie info from IMDb, or music info from last.fm, or weather info from weather.com etc., just reading the webpage and parsing it is quiet tedious. Often websites have an xml feed (such as last.fm) set up exactly for this.
Is there a particular link/standard that websites follow for this feed? Such as robot.txt, is there a similar standard for information feeds, or does each website have its own standard?
This is the kind of problem RSS or Atom feeds were designed for, so look for a link for an RSS feed if there is one. They’re both designed to be simple to parse too. That’s normally on sites that have regularly updated content though, like news or blogs. If you’re lucky, they’ll provide many different RSS feeds for different aspects of the site (the way Stackoverflow does for questions, for instance)
Otherwise, the site may have an API you can use to get the data (like Facebook, Twitter, Google services etc). Failing that, you’ll have to resort to screen-scraping and the possible copyright and legal implications that are involved with that.