I have an XML file that is full of whitespace.
<test> <level> <sub name="xyz">test</sub> </level> </test>
I need to remove the whitespace between tags, but not the whitespace inside a tag that separates the element name from its attribute, since removing that would turn the tag into <subname>. I could scan through the whitespace, removing it when the next character is < and keeping it when a > comes first. Is it possible to do this with a regular expression in Java?
If it’s really that simple, this should be quite enough:
I’ve used Perl syntax, but I guess it’s quite easy to convert it into any language you want.
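Since the question asks about Java, here is a minimal sketch of one way to express the idea there. It assumes the goal is simply to drop any whitespace run that sits between a `>` and the following `<`; the class and method names are made up for illustration, and this is not necessarily the exact pattern from the answer above.

```java
import java.util.regex.Pattern;

public class WhitespaceStripper {
    // Assumption: only whitespace runs between a '>' and the next '<' are removed.
    // Whitespace inside a tag (e.g. between "sub" and "name") is never preceded
    // by '>' and followed by '<', so it is left alone.
    private static final Pattern BETWEEN_TAGS = Pattern.compile(">\\s+<");

    static String stripBetweenTags(String xml) {
        return BETWEEN_TAGS.matcher(xml).replaceAll("><");
    }

    public static void main(String[] args) {
        String xml = "<test> <level> <sub name=\"xyz\">test</sub> </level> </test>";
        // prints <test><level><sub name="xyz">test</sub></level></test>
        System.out.println(stripBetweenTags(xml));
    }
}
```

`\s` also matches newlines and tabs, so indented, multi-line input is handled the same way.";
String out = WhitespaceStripper.stripBetweenTags(in);
if (!out.equals("<test><level><sub name=\"xyz\">test</sub></level></test>")) throw new AssertionError(out);
if (!WhitespaceStripper.stripBetweenTags("<a>\n\t<b x=\"1\">t</b>\n</a>").equals("<a><b x=\"1\">t</b></a>")) throw new AssertionError("multiline");
</test>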
Be aware, though, that there are several caveats (as always in cases like that).
For example, you won’t meet a literal < symbol inside XML element content, but it can happily live within CDATA sections, and that regex ignores this nuance.
UPDATE: the regex might be made even more concise with the ‘look-ahead’ feature:
… but not all languages support that (Perl does, though).
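Java's `java.util.regex` supports look-ahead as well, so a similar variant can be sketched there. This is an illustration under the same assumption as before (remove whitespace that follows a `>` and precedes a `<`), with a hypothetical class name; it is not a reconstruction of the Perl pattern mentioned above.

```java
public class LookaheadStripper {
    static String stripBetweenTags(String xml) {
        // (?=<) only checks that '<' comes next without consuming it,
        // so the replacement does not have to put the '<' back.
        return xml.replaceAll(">\\s+(?=<)", ">");
    }

    public static void main(String[] args) {
        String xml = "<test> <level> <sub name=\"xyz\">test</sub> </level> </test>";
        // prints <test><level><sub name="xyz">test</sub></level></test>
        System.out.println(stripBetweenTags(xml));
    }
}
```

The same caveat applies: text such as `> <` inside a CDATA section or a comment would also be collapsed, so an XML parser is the safer choice when that matters.";
String out = LookaheadStripper.stripBetweenTags(in);
if (!out.equals("<test><level><sub name=\"xyz\">test</sub></level></test>")) throw new AssertionError(out);
if (!LookaheadStripper.stripBetweenTags("<a>text here</a>").equals("<a>text here</a>")) throw new AssertionError("text content changed");
</test>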