I have an XML document with the structure below. I’m writing a transformation where I’d like to output the text from node B but ignore the element C and its text node “title”. Essentially, I’d like to extract the text “text goes here” and output it in a new element with all the whitespace normalised. Can anybody help? Below is what I’ve tried so far.
Input doc:
<A>
  <B>
    <C>title</C>
    text goes here
  </B>
</A>
Required output doc:
<d>text goes here</d>
Solution A:
<xsl:template match="B">
  <d>
    <xsl:copy-of select="./text()"/>
  </d>
</xsl:template>
Problem: the whitespace between elements is preserved so I get something like this:
<d>
text goes here
</d>
I also tried using a value-of statement (<xsl:value-of select="./text()"/>) in the template in solution A but this didn’t return any text at all. Is there something wrong with the statement?
I should mention that I have overridden the default text handling template using the following: <xsl:template match="text()" />
Thanks
The reason <xsl:value-of select="./text()"/> returned “nothing” is that ./text() returns a node set consisting of all the immediate child text nodes of the current node. In XSLT 1.0, the value-of a node set is the string value of its first node, which in this case is the whitespace-only text node between the opening <B> and the opening <C> tags. The same applies to the next-most-obvious attempt, normalize-space(text()), because that again converts the node set to a string (the value of its first node) and then normalizes space in that string.
Instead, you need to normalize each child text node individually, for example by using xsl:for-each over text() and outputting normalize-space(.) for each node.
One thing to note about this, though, is that if B has text on both sides of another child element (a subtitle, say), then the output will have no space between the bits either side of the subtitle. If this is a problem, you can use a trick: iterate over only those text node children that contain non-whitespace characters, and add a space before all but the first of them.
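The approach described above might be sketched as follows. This is a minimal stylesheet, assuming XSLT 1.0 and the element names from the question (A, B, C and the output element d). The xsl:output settings are an assumption added to keep the result compact, not something taken from the original post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: no XML declaration, so the output is just the <d> element -->
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- From the question: override the built-in text template so stray text is dropped -->
  <xsl:template match="text()"/>

  <xsl:template match="B">
    <d>
      <!-- Keep only the child text nodes that contain non-whitespace characters;
           the predicate is true when normalize-space(.) is a non-empty string -->
      <xsl:for-each select="text()[normalize-space(.)]">
        <!-- Add a single space before every kept node except the first -->
        <xsl:if test="position() != 1">
          <xsl:text> </xsl:text>
        </xsl:if>
        <!-- Normalize each text node on its own rather than the whole node set -->
        <xsl:value-of select="normalize-space(.)"/>
      </xsl:for-each>
    </d>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the input document from the question, this should produce <d>text goes here</d>. If the separating space is not needed, the simpler form is the same template with select="text()" and without the xsl:if.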