So I’m trying to learn some XML parsing here, and I’m getting the hang of it, but for whatever reason I seem to have to tack on “text()” at the end of each query; otherwise I get null values returned to me. I don’t actually understand the function of this “text()” ending, but I know it’s not necessary, and I’m wondering why I can’t omit it. Please help! Here is my code:
import org.w3c.dom.*;
import javax.xml.xpath.*;
import javax.xml.parsers.*;
import java.io.IOException;
import org.xml.sax.SAXException;

public class ParseClass
{
    public static void main(String[] args)
        throws ParserConfigurationException, SAXException,
               IOException, XPathExpressionException
    {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse("C:\\Users\\Brandon\\Job\\XPath\\XPath_Sample_Stuff\\catalog.xml");
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xpath.compile("/catalog/book[author='Thurman, Paula']/title/text()");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        for (int i = 0; i < nodes.getLength(); i++)
        {
            System.out.println(nodes.item(i).getNodeValue());
        }
    }
}
PS. In case you didn’t notice, I’m using XPath and DOM for my parsing.
You’re calling getNodeValue on your result, and as the docs show (see the table), it is null for a node of type Element. When you use text(), the returned set now contains nodes of type Text, so you get the results you wanted (i.e. the contents of the title element instead of the element itself).

I’d also suggest seeing this for more info on the usage of text() in XPath.

And if you want to extract the text from your element directly, you could use getTextContent instead of getNodeValue:
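The code that originally followed here is not preserved, so what follows is only a minimal sketch of the idea, not the answerer's exact snippet. It assumes a small inline XML string (SAMPLE_XML, a hypothetical stand-in for catalog.xml, whose real contents are not shown in the question) and a hypothetical helper named selectTitles. The query drops the trailing text(), so the result set holds Element nodes; getNodeValue returns null for those, while getTextContent returns the text inside the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextContentDemo {
    // Hypothetical sample data; the real catalog.xml is not shown in the question.
    public static final String SAMPLE_XML =
          "<catalog>"
        + "<book><author>Thurman, Paula</author><title>Splish Splash</title></book>"
        + "<book><author>Someone, Else</author><title>Other Book</title></book>"
        + "</catalog>";

    // Hypothetical helper: runs the question's query without the trailing text(),
    // so the returned NodeList contains <title> Element nodes.
    public static NodeList selectTitles(String xml) throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(
            "/catalog/book[author='Thurman, Paula']/title",
            doc,
            XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList nodes = selectTitles(SAMPLE_XML);
        for (int i = 0; i < nodes.getLength(); i++) {
            Node title = nodes.item(i);
            // An Element has no node value of its own, so this prints "nodeValue: null".
            System.out.println("nodeValue: " + title.getNodeValue());
            // getTextContent concatenates the text of the element's descendants,
            // so this prints "textContent: Splish Splash".
            System.out.println("textContent: " + title.getTextContent());
        }
    }
}
```

In the question's own loop, the equivalent change would be to remove /text() from the expression and print nodes.item(i).getTextContent() instead of nodes.item(i).getNodeValue().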