I’m trying to determine whether a given feed is Atom based or RSS based.
Here’s my code:
public boolean isRSS(String URL) throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder();
    // parse(String) treats its argument as a URI and fetches the document
    Document doc = builder.parse(URL);
    return doc.getDocumentElement().getNodeName().equalsIgnoreCase("rss");
}
Is there a better way to do it? Would it be better to use a SAX parser instead?
Sniffing the content is one method. Note, however, that Atom uses namespaces, and your code creates a parser that is not namespace-aware (DocumentBuilderFactory needs setNamespaceAware(true) for that).
Note also that you should not compare with equalsIgnoreCase(), since XML element names are case-sensitive.
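As a rough illustration of the sniffing approach, here is a minimal sketch that uses a namespace-aware DOM parser and an exact, case-sensitive match on the root element. The class name FeedSniffer, the method detect, and the "atom"/"rss"/"unknown" return values are made up for this example. It assumes Atom 1.0 (root element feed in the http://www.w3.org/2005/Atom namespace) and RSS 0.9x/2.0 (root element rss with no namespace); RSS 1.0, whose root is rdf:RDF, is not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class FeedSniffer {
    // Namespace URI of Atom 1.0.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Hypothetical helper: returns "atom", "rss", or "unknown" from the root element. */
    static String detect(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // factories are not namespace-aware by default
        DocumentBuilder builder = factory.newDocumentBuilder();
        Element root = builder.parse(in).getDocumentElement();

        String ns = root.getNamespaceURI(); // null when the element has no namespace
        String name = root.getLocalName();  // element name without any prefix

        // Exact comparisons: XML names are case-sensitive.
        if (ATOM_NS.equals(ns) && "feed".equals(name)) {
            return "atom";
        }
        if (ns == null && "rss".equals(name)) {
            return "rss";
        }
        return "unknown";
    }

    public static void main(String[] args) throws Exception {
        String atom = "<feed xmlns=\"http://www.w3.org/2005/Atom\"/>";
        String rss = "<rss version=\"2.0\"><channel/></rss>";
        System.out.println(detect(new ByteArrayInputStream(atom.getBytes(StandardCharsets.UTF_8)))); // atom
        System.out.println(detect(new ByteArrayInputStream(rss.getBytes(StandardCharsets.UTF_8))));  // rss
    }
}
```

This still builds the whole DOM tree just to look at the root element. A streaming parser (SAX or StAX) could stop after the first start tag, which may matter for large feeds.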
Another method is to check the Content-Type header, if it is available in the response to an HTTP GET request. The Content-Type for Atom would be application/atom+xml, and for RSS application/rss+xml. I suspect, though, that not all RSS feeds can be trusted to set this header correctly.
A third option is to look at the URL suffix, e.g. .atom and .rss.
The last two methods are easily configurable if you are using Spring or JAX-RS.
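If you are not using a framework, the header and suffix checks are simple enough to write by hand. The following is only a sketch: the class FeedTypeHints and its two methods are hypothetical names, and the "atom"/"rss"/"unknown" return values are an arbitrary choice for the example. It strips parameters such as charset from the header value and compares the media type case-insensitively, since media type names are not case-sensitive.

```java
import java.net.URI;
import java.util.Locale;

class FeedTypeHints {
    /** Hypothetical helper: classifies a Content-Type header value. */
    static String fromContentType(String contentType) {
        if (contentType == null) {
            return "unknown"; // header missing
        }
        // Drop parameters such as "; charset=utf-8" and normalize case.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        switch (mediaType) {
            case "application/atom+xml":
                return "atom";
            case "application/rss+xml":
                return "rss";
            default:
                return "unknown"; // e.g. text/xml or application/xml
        }
    }

    /** Hypothetical helper: classifies by the suffix of the URL path. */
    static String fromUrlSuffix(String url) {
        String path = URI.create(url).getPath(); // excludes the query string
        if (path == null) {
            return "unknown";
        }
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".atom")) {
            return "atom";
        }
        if (lower.endsWith(".rss")) {
            return "rss";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("application/atom+xml; charset=utf-8")); // atom
        System.out.println(fromContentType("text/xml"));                            // unknown
        System.out.println(fromUrlSuffix("https://example.org/feed.rss"));          // rss
    }
}
```

With plain java.net, the header value can be read with URLConnection.getContentType() after opening the connection. Since both hints can be missing or wrong, it is probably safest to treat "unknown" as a signal to fall back to sniffing the root element.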