The XML specification states that this has to be the behaviour for handling any “external parsed entity”. But does this apply to CDATA sections inside an element as well? Why?
Is there any way to keep \r unconverted by adding one or two conditions to the parser code, instead of changing \r to
This is indeed the case. Why? It simplifies life for the applications that process the data read from the XML file: they do not need to worry about which format the newlines are in, which improves application compatibility. (Consider how simple text editors cope with files exchanged between Linux and Windows: they almost always display them incorrectly, on Windows most often as a single line.)
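To make the behaviour concrete, here is a minimal C++ sketch of the end-of-line handling that the XML 1.0 specification (section 2.11) asks of a parser: every "\r\n" pair and every lone "\r" is turned into a single "\n" before the text reaches the application. The function name normalize_newlines is made up for illustration and is not taken from tinyxml or any other parser.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of tinyxml): normalizes line endings as
// described in XML 1.0 section 2.11. "\r\n" and a lone "\r" both become "\n".
std::string normalize_newlines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            out += '\n';
            // Skip the '\n' of a "\r\n" pair so it is not emitted twice.
            if (i + 1 < in.size() && in[i + 1] == '\n') {
                ++i;
            }
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Because this runs on the raw input before any markup is interpreted, it affects CDATA sections in the same way as ordinary element content.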
Of course, if for some reason you need \r left unconverted, it is simple to take any existing XML parser implementation and modify it. In tinyxml, you need to modify the TiXmlBase::ReadText() function, or you can grab an older version of the library, as it used to leave whitespace intact.
On the other hand, from a design point of view, it would be much cleaner to run the parser output through a character replacement function that replaces every “\n” with “\r\n”.
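The replacement step described above could be sketched roughly as follows. The name to_crlf is hypothetical, and the sketch assumes the text comes straight from a conforming parser, so it contains no "\r" characters of its own.

```cpp
#include <string>

// Hypothetical helper: converts every "\n" in parser output to "\r\n".
// Assumes the input is already normalized (contains no '\r'), which is
// what a conforming XML parser hands to the application.
std::string to_crlf(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '\n') {
            out += '\r';  // insert the carriage return before each line feed
        }
        out += c;
    }
    return out;
}
```

You would call it on each text value you read from the parser, for example on the string returned for an element's text, and leave the parser itself untouched.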
Of course, the best option would be to use the output as is. Right now I can’t imagine any scenario where keeping the \r would be needed.