I’m using the XML parsing methods from System.Xml.Linq. I’ve been ignoring this problem for quite a while now but finally figured I should ask why this is.
If you try putting an open angle bracket < inside a string attribute, the parser will throw an exception because it thinks it’s opening a new tag. For example:
<Foo text="This is my <sample> text" />
Why can’t it handle this? Anyone who knows anything about parsers knows that this shouldn’t be a problem. The parser should understand it’s in the middle of an open string, and can treat this character as not special. Instead I have to escape these as &lt; everywhere.
The only answer I can think of is that this is a conscious choice: the designers decided that in this situation it was more likely that someone forgot to close a string than that they wanted this character in the string. Is this hypothesis correct, or is there a real technical reason behind this and I’m the one who doesn’t understand parsers? And is there anything I can do to avoid having to escape these characters?
This is an XML issue: the < character is not valid inside an attribute.
You should escape <, & and " in attributes, as defined in the specification.
Microsoft has implemented a parser that complies with the specification.
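On the last part of the question: you only have to escape by hand when you write the XML text yourself. If you build the element through the System.Xml.Linq object model, the escaping is done for you on output and undone on parsing. Here is a minimal sketch of that, written as top-level statements (so it assumes C# 9 or later); the helper names IsWellFormed and BuildFoo are made up for illustration and are not part of any library.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: true if XElement.Parse accepts the text as well-formed XML.
bool IsWellFormed(string candidate)
{
    try
    {
        XElement.Parse(candidate);
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

// Hypothetical helper: builds the Foo element via the object model,
// so the attribute value is escaped when the element is serialized.
string BuildFoo(string text)
{
    return new XElement("Foo", new XAttribute("text", text)).ToString();
}

string raw = "This is my <sample> text";

// A literal '<' inside an attribute value is rejected by the parser.
Console.WriteLine(IsWellFormed("<Foo text=\"This is my <sample> text\" />"));     // False

// Escaping the '<' as &lt; makes the same document well-formed.
Console.WriteLine(IsWellFormed("<Foo text=\"This is my &lt;sample> text\" />"));  // True

// Built through XAttribute, the value is escaped on output and restored on read.
string generated = BuildFoo(raw);
Console.WriteLine(generated);
string roundTripped = (string)XElement.Parse(generated).Attribute("text");
Console.WriteLine(roundTripped == raw);                                            // True
```

So the raw string never needs manual escaping in your own code as long as it goes in and out through XAttribute; it only has to be escaped in XML text that you type or concatenate yourself.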