I’m using Web-Harvest to scrape a website and generate an XML file with the data.
I’m getting ugly nodes like <name> </name>. Using normalize-space() didn’t help, so I opened the file in a hex viewer and found that the space corresponds to the bytes ‘c2a0’. I looked around for a solution, but nothing I found helped…
To sum up, what I want is to remove that weird space (using XQuery or XPath 1.0/2.0), so that I get an empty node <name/>.
PS: the encoding used is ‘iso-8859-1’.
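You can use translate to remove certain characters. The UTF-8 byte sequence c2a0 encodes the character U+00A0 (no-break space), which normalize-space() does not treat as whitespace; that is why it had no effect. Hexadecimal 0xA0 is 160 in decimal, so you can use codepoints-to-string(160) to get a string containing that space. Together: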
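A minimal sketch of how the two functions might be combined, assuming an XQuery 1.0 / XPath 2.0 processor. The variable $name is hypothetical and simply stands in for whatever expression yields the scraped text:

```xquery
(: Sketch only: $name is a hypothetical stand-in for the scraped value. :)
(: Here it holds a single no-break space (U+00A0, code point 160).      :)
let $name := codepoints-to-string(160)

(: translate() deletes every character of its second argument that has  :)
(: no counterpart in the third, so an empty third argument removes all  :)
(: no-break spaces. normalize-space() then trims ordinary whitespace.   :)
let $clean := normalize-space(translate($name, codepoints-to-string(160), ''))

return <name>{$clean}</name>
```

With an empty $clean the constructed element should have no text content, so it would normally serialize as <name/>. codepoints-to-string is not available in XPath 1.0; there, if the expression is embedded in an XML file, writing the character as the reference &#160; inside the translate call, e.g. translate(., '&#160;', ''), is one possible alternative.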