How to extract the text of such an element via XPath: <document> some text


Asked: May 25, 2026


How to extract the text of such an element via XPath:

<document>
  some text
     <subelement>subelement text</subelement>
  postscript
</document>

The XPath expression:

/document

returns the text of the document element together with the text of all its descendant nodes:

some text         subelement text    postscript

While the XPath expression:

/document/text()

returns just the first text node:

some text

that is, “postscript” is missing.

Question
Is there a way to get the text of all text nodes that are direct children of <document>?

Postscript
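A minimal, self-contained example, in case you want to test it yourself:

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XPathTextDemo {
        public static void main(String[] args) throws Exception {
            DocumentBuilder dbuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

            String xml = "<?xml version='1.0' encoding='UTF-8'?>" +
                         "<document>"
                         + "some text into document"
                         + "    <subelement>"
                         + "        some text into SUBelement"
                         + "    </subelement>"
                         + "POSTSCRIPT"
                         + "</document>";

            // parse() has no overload that takes a Reader, so wrap it in an InputSource
            Document doc = dbuilder.parse(new InputSource(new StringReader(xml)));

            // usual way to get an XPath evaluator
            XPath xp = XPathFactory.newInstance().newXPath();

            // prints the text of <document> and of all its descendants
            System.out.println(xp.evaluate("/document", doc));

            // prints only the first text node; POSTSCRIPT is missing
            System.out.println(xp.evaluate("/document/text()", doc));
        }
    }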


1 Answer

Editorial Team · Answer 1 · 2026-05-25T18:51:34+00:00

While the XPath expression:
/document/text()
returns just the first text node:
some text into document
that is, “postscript” is missing.

The XPath expression above does select all text node children of /document, but the two-argument XPath.evaluate() method converts its result to a string.
That conversion follows the XPath 1.0 string() rules, the same ones <xsl:value-of> uses in XSLT 1.0: a node-set becomes the string value of its first node in document order, so only the first text node is printed.
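
One way to check that the expression really selects both text nodes, and that only the string conversion drops the second one, is to ask XPath for the node count. The sketch below is only an illustration: the class name FirstNodeOnly and the helper evalAsString are made up for this answer, and the XML is a shortened version of the document from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class FirstNodeOnly {

    // Hypothetical helper: parses xml and evaluates expr with the
    // two-argument evaluate(), which always returns a String.
    static String evalAsString(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xp = XPathFactory.newInstance().newXPath();
        return xp.evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<document>some text<subelement>subelement text</subelement>postscript</document>";

        // How many text nodes are direct children of <document>?
        System.out.println(evalAsString(xml, "count(/document/text())"));

        // String conversion of that node-set keeps only the first node.
        System.out.println(evalAsString(xml, "/document/text()"));
    }
}
```

With the XML above, the first line printed should be 2 and the second should be "some text", which matches the behaviour described in the question.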

To print the value of every text node child, pass XPathConstants.NODESET as the third argument to XPath.evaluate(). The call then returns the selected text nodes as a NodeList (the declared return type is Object, so a cast is needed), which you can loop over and print one by one. Or you could try passing the NodeList directly to println() and see what it prints. 🙂
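
A minimal sketch of that loop, assuming the same kind of document as in the question. The class name DirectTextNodes and the helper selectValues are invented for this example. The trim() call in main is only there to tidy the output and is not required.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DirectTextNodes {

    // Hypothetical helper: returns the node value of every node selected by expr.
    static List<String> selectValues(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xp = XPathFactory.newInstance().newXPath();

        // The third argument asks for the whole node-set instead of a string.
        NodeList nodes = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<document>some text into document"
                + "    <subelement>some text into SUBelement</subelement>"
                + "POSTSCRIPT</document>";

        // Print each direct text child of <document> on its own line.
        for (String text : selectValues(xml, "/document/text()")) {
            System.out.println("[" + text.trim() + "]");
        }
    }
}
```

If you need a single string instead, you could append the values to a StringBuilder inside the loop.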
