I have the following xml document:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<data>
<child1> Well, some spaces and nbsps  </child1>
<child2>  some more   or whatever </child2>
<child3> a nice text</child3>
<child4>how to get rid of all the nasty spaces  ? </child4>
</data>
</root>
I have to remove all non-breakable spaces, concatenate the text and normalize it.
My xpath query (it works fine for concatenation and normalization – I have inserted the replacement with ‘x’ only for test purposes):
normalize-space(replace(string-join(//data/*,' '),' ','x'))
My problem: I can’t find the " "-whitespace to replace it.
Looking forward to your answers,
The string value of an element node is defined to be the concatenation of all its descendant text nodes, so in an XSLT transformation you can apply the replacement and the normalization directly to the data element. That would do what you require, assuming your document only contains one data element; if there is more than one data element then the expression will only extract and normalize the text of the first data element in the document.
If you are using the XPath expression somewhere other than in an XSLT file then you will need to represent the non-break space character differently. Inside a stylesheet this works because the XML parser converts a character reference such as &#160; into a non-break space character when reading the .xsl file, so the XPath expression parser sees the character, not the reference. In Java, for example, I could write \u00A0 inside the string that holds the expression, because \u00A0 is the way to represent the nbsp character in a Java string literal. If you are using another language you need to find the right way to represent this character in that language, or if you're using XPath 2.0 you could use the codepoints-to-string function, for example codepoints-to-string(160).
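A rough Java sketch of that idea, under a few assumptions that are not from the answer itself: the class name NbspDemo and the helper extractText are made up for illustration, and the XPath 1.0 translate() function is used here to turn each non-break space into an ordinary space before normalize-space() collapses the runs.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are illustrative only.
class NbspDemo {

    // Returns the normalized text of the first <data> element in the given XML.
    static String extractText(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The Unicode escape in this literal becomes a real non-break space
        // in the Java string, so the XPath parser sees the character itself.
        // Assumption: translate() maps each non-break space to a plain space.
        String expr = "normalize-space(translate(//data, '\u00A0', ' '))";
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><data>"
                + "<child1> Well,\u00A0some spaces\u00A0 </child1>"
                + "<child2>\u00A0some more </child2>"
                + "</data></root>";
        // prints: Well, some spaces some more
        System.out.println(extractText(xml));
    }
}
```

Note that the XPath engine bundled with the JDK implements XPath 1.0, so replace(), string-join() and codepoints-to-string() are not available there; those would need an XPath 2.0 processor.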