This piece of Java code prints the title, link and publication date of every item from the NYT’s World RSS. But for the NYT’s Science RSS it doesn’t print the link field. What is happening here?
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse( direccion );
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("/rss/channel/item");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); i++) {
    Node node = nl.item(i);
    Node nodoTitulo = (Node) xpath.evaluate("title", node, XPathConstants.NODE);
    System.out.println(nodoTitulo.getTextContent());
    Node nodoLink = (Node) xpath.evaluate("link", node, XPathConstants.NODE);
    System.out.println(nodoLink.getTextContent());
    Node nodoFecha = (Node) xpath.evaluate("pubDate", node, XPathConstants.NODE);
    System.out.println(nodoFecha.getTextContent());
    System.out.println();
}
It’s a namespace issue. In the Science RSS, you have
In the world RSS, you have
Your code is picking up the <atom:link> node first. Add:
factory.setNamespaceAware(true);
after you create the factory and before you create the builder, and you should then get the link.
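To see the effect in isolation, here is a minimal sketch that runs the same XPath against a small made-up feed, once with a namespace-aware factory and once without. The class name `NamespaceAwareLink`, the helper `firstLinkText`, and the sample XML are assumptions for illustration only; the real NYT feeds contain more elements and may be laid out differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NamespaceAwareLink {

    // Hypothetical sample: an item where a prefixed <atom:link> precedes the plain <link>.
    static final String SAMPLE =
        "<rss xmlns:atom=\"http://www.w3.org/2005/Atom\" version=\"2.0\"><channel><item>"
        + "<title>Example</title>"
        + "<atom:link rel=\"standout\" href=\"http://example.com/standout\"/>"
        + "<link>http://example.com/story</link>"
        + "</item></channel></rss>";

    // Returns the text of the first node matched by the unprefixed "link" step,
    // or null if nothing matches.
    static String firstLinkText(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Must be set before newDocumentBuilder(); the default is false.
            factory.setNamespaceAware(namespaceAware);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            Node link = (Node) xpath.evaluate("/rss/channel/item/link", doc, XPathConstants.NODE);
            return link == null ? null : link.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the sample feed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:     " + firstLinkText(SAMPLE, true));
        System.out.println("not namespace-aware: " + firstLinkText(SAMPLE, false));
    }
}
```

With the namespace-aware factory, `<atom:link>` belongs to the Atom namespace, so the unprefixed `link` step can only match the plain `<link>` element. Running `main` lets you compare that against the default, non-namespace-aware behaviour on your own JDK.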
And if you’re really interested, you can have a read of this for some more info