I recently created a small C# Windows Forms / LINQ to XML app in VS2010 that does exactly what it’s supposed to do, except for one thing: it adds “[]” to the end of the DOCTYPE declaration, which is apparently causing files to be rejected by a legacy system. Here’s a before and after:
Before
<!DOCTYPE ichicsr SYSTEM "http://www.accessdata.fda.gov/xml/icsr-xml-v2.1.dtd">
After
<!DOCTYPE ichicsr SYSTEM "http://www.accessdata.fda.gov/xml/icsr-xml-v2.1.dtd"[]>
These characters get added when the file is saved within the program using the .Save method. The program allows selection of an .xml file, then “cleans” it by removing certain tags, then saves it. Before the process begins, the files do not have the “[]” in the DOCTYPE. After saving, they do. Does LINQ to XML add these?
Is there any way to keep the program from adding these characters?
Evidently, when XDocument parses an XML document that contains a Document Type Declaration, an empty “internal subset” is automatically inserted if one doesn’t exist. (The internal subset is the part surrounded by [] in the <!DOCTYPE>.) The result is well-formed XML.
However, if your legacy system can’t handle it, you can remove the internal subset from the DTD by setting the XDocumentType.InternalSubset property to null before saving.
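A minimal sketch of that fix, assuming the document is loaded with XDocument.Load and written back with XDocument.Save. The file name "report.xml" and the Program class are placeholders for illustration, not taken from the original app; the null check covers files that have no DOCTYPE at all.

```csharp
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical path; substitute the file the user selected.
        string fileName = "report.xml";

        XDocument document = XDocument.Load(fileName);

        // ... remove the unwanted elements here ...

        // DocumentType is null when the file has no <!DOCTYPE ...> declaration.
        if (document.DocumentType != null)
        {
            // Clearing the internal subset so Save does not emit "[]".
            document.DocumentType.InternalSubset = null;
        }

        document.Save(fileName);
    }
}
```

The assignment has to happen after loading and before saving, since the empty internal subset is set when the document is parsed.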