I have code that creates an XML document that is difficult to read in a basic text editor. I tried using transformer.setOutputProperty(OutputKeys.INDENT, "yes") which is much better but now when I read the XML back in I have all these annoying text nodes that weren’t there before. All these text nodes contain a newline character “\n”. Is there any way to exclude them when I read the XML back in without having to write code to parse and remove them on my own? Some sort of filter maybe?
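A minimal sketch of the round trip described above, assuming the JDK's built-in JAXP parser and transformer. The class name, method names, and the sample elements (root, a, b) are made up for illustration and are not from the original program. It counts the whitespace-only text nodes under the root element after writing the document with and without indentation and parsing it again.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original program.
class IndentRoundTrip {

    // Counts whitespace-only text nodes directly under the given element.
    static int countWhitespaceChildren(Element parent) {
        int count = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Builds <root><a/><b/></root>, serializes it, parses the result again,
    // and returns the number of whitespace-only text nodes under <root>.
    static int whitespaceChildrenAfterRoundTrip(boolean indent) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            Document doc = dbf.newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            root.appendChild(doc.createElement("a"));
            root.appendChild(doc.createElement("b"));

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));

            Document reread = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(out.toString())));
            return countWhitespaceChildren(reread.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("indent=no:  " + whitespaceChildrenAfterRoundTrip(false));
        System.out.println("indent=yes: " + whitespaceChildrenAfterRoundTrip(true));
    }
}
```

Without indentation the re-read root has only element children; with indentation the line breaks the serializer inserted come back as extra text nodes.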
EDIT
I checked into Daniel’s suggestion to setIgnoringElementContentWhitespace(true) but came across two problems:
- I have to put the DocumentBuilderFactory into validating mode
- Validating mode requires a DTD, and I don’t have one: the program I am creating allows the user to create new tags on the fly…
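The two points above can be sketched as follows, assuming the JDK's default JAXP parser. The class name, the sample XML, and the inline DTD are invented for illustration. The same indented body is parsed twice with setIgnoringElementContentWhitespace(true): once without a DTD, and once with an internal DTD and validation switched on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML and DTD below are made-up samples.
class IgnoreWhitespaceDemo {

    static final String BODY = "<root>\n  <a/>\n  <b/>\n</root>";

    // Internal DTD subset declaring that <root> has element-only content.
    static final String DTD =
            "<!DOCTYPE root [<!ELEMENT root (a,b)>"
            + "<!ELEMENT a EMPTY><!ELEMENT b EMPTY>]>\n";

    // Parses the XML with element-content whitespace ignored and returns
    // how many text nodes remain directly under the root element.
    static int textChildrenOfRoot(String xml, boolean validating) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(validating);
            dbf.setIgnoringElementContentWhitespace(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i).getNodeType() == Node.TEXT_NODE) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No DTD: the parser cannot tell which whitespace is ignorable.
        System.out.println("no DTD:   " + textChildrenOfRoot(BODY, false));
        // DTD plus validation: whitespace in element content is dropped.
        System.out.println("with DTD: " + textChildrenOfRoot(DTD + BODY, true));
    }
}
```

The flag only has an effect when the parser has a content model to consult, which is why it does not help for documents whose tags are invented at run time.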
So to complicate things a bit more, is there a way to do this without a DTD? Or is there a simple way to create the DTD when I am saving the XML file?
AFAIK, most XML parsers have an option to skip empty (whitespace-only) text nodes, since they occur all the time. Xerces does, at least. The feature is called
http://apache.org/xml/features/dom/include-ignorable-whitespace
and it can be disabled (it’s enabled by default, if I read it right). Description: