Many of the C# XML serialization examples here include code like this:
xml = xml.Substring(xml.IndexOf(Convert.ToChar(60)));
xml = xml.Substring(0, (xml.LastIndexOf(Convert.ToChar(62)) + 1));
I understand this is discarding any (nonprintable/invalid) characters around < and >, but why do these characters exist in the first place?
Assume UTF-16 (Encoding.Unicode) with an XmlTextWriter.
The particular UTF encoding matters less here than how the XmlTextWriter is constructed. An XmlTextWriter writes to a Stream or a TextWriter; if your xml variable instead reaches the XML API through a StringReader, then the problem most likely lies in how the XML was originally read from disk.
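Text files often begin with an encoding preamble called a BOM (Byte Order Mark): the bytes EF BB BF for UTF-8, or FF FE for little-endian UTF-16. If the file is decoded with the wrong encoding, or the preamble is decoded along with the content instead of being skipped, one or more 'weird' characters appear before the content of the file.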
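To see where such a character can come from, here is a minimal sketch that assumes the common pattern from those serialization examples: an XmlTextWriter over a MemoryStream, with the raw bytes decoded back to a string afterwards. The class and method names (BomDemo, WriteAndDecode) are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Hypothetical helper: writes a tiny document through an XmlTextWriter
    // into memory, then decodes ALL the raw bytes, preamble included.
    public static string WriteAndDecode()
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new XmlTextWriter(stream, Encoding.Unicode))
            {
                writer.WriteStartDocument();
                writer.WriteElementString("root", "value");
                writer.WriteEndDocument();
            }
            // MemoryStream.ToArray still works after the stream is closed.
            // GetString does not skip the preamble; it decodes it as U+FEFF.
            return Encoding.Unicode.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        string xml = WriteAndDecode();
        // Compare the first character with U+FEFF and locate the first '<'.
        Console.WriteLine(xml[0] == '\uFEFF');
        Console.WriteLine(xml.IndexOf('<'));
    }
}
```

In this setup the string should start with U+FEFF rather than '<', which is exactly the character the Substring/IndexOf pair in the question ends up cutting off.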
I expect the code you have was a poor man’s attempt at removing the BOM from an incorrectly read text file.
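A less fragile alternative to trimming around < and > is to let the reader consume the BOM while decoding. The following is only a sketch under that assumption: BomSafeRead and Decode are hypothetical names, and it relies on the StreamReader constructor overload that takes a detectEncodingFromByteOrderMarks flag.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomSafeRead
{
    // Hypothetical helper: decodes bytes with a StreamReader, which detects
    // a leading BOM, switches to the matching encoding, and skips the BOM.
    // The fallback encoding is used only when no BOM is present.
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, fallback, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Build UTF-16 LE bytes with a preamble, as a text file would have.
        byte[] preamble = Encoding.Unicode.GetPreamble();
        byte[] body = Encoding.Unicode.GetBytes("<root />");
        byte[] bytes = new byte[preamble.Length + body.Length];
        preamble.CopyTo(bytes, 0);
        body.CopyTo(bytes, preamble.Length);

        // BOM-aware decoding versus decoding every byte as content.
        Console.WriteLine(Decode(bytes, Encoding.UTF8) == "<root />");
        Console.WriteLine(Encoding.Unicode.GetString(bytes)[0] == '\uFEFF');
    }
}
```

If the XML is produced in memory rather than read from a file, another option is to construct the encoding without a preamble, for example new UnicodeEncoding(false, false) or new UTF8Encoding(false), so that no BOM is written in the first place.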