I am working on an application that has to read XML files whose set of nodes differs from file to file. Only a limited set of nodes is used across all the files, but the combination in which they appear keeps changing. The XML files are generated by another system that I cannot control. I am looking into LINQ to XML and XML serialization, but I guess serialization is not an option, since it needs pre-built classes to create objects.
Example XML data
<Employee>
  <PersonalInfo>
    <FirstName>Vamsi</FirstName>
    <LastName>Krishna</LastName>
  </PersonalInfo>
  <EmploymentInfo>
    <Department>
      <Id>101</Id>
      <Position>SD</Position>
    </Department>
  </EmploymentInfo>
</Employee>
Another Format
<Employee>
  <PersonalInfo>
    <FirstName>Vamsi</FirstName>
    <LastName>Krishna</LastName>
  </PersonalInfo>
</Employee>
Note that the EmploymentInfo node is completely missing in the second example. There are many combinations in which the XML data can be presented to the application. I have to read the XML file, validate it, and insert the data into a SQL Server database from my C# code.
I’d say it depends.
If you just want to communicate with another system in a strongly-typed way, and you can expect the XML schemas not to change very often, you might be OK with XML serialization. Just encapsulate the deserialization in a separate component and write a different version of it for each schema version (yes, you’ll need to be able to determine which schema version is currently in use). I mean, each version would have its own set of classes that are targeted by the serializer.
But if you really cannot infer a system from the schemas used by the external app and need some intelligent parser, you’d better use XPath, LINQ to XML, or some other XML-level API to handle the XML manually.
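To illustrate the LINQ to XML route, here is a minimal sketch that reads your second sample and tolerates the missing EmploymentInfo node. It assumes the element names from your samples and that Id is an integer; the variable names are just for illustration.

```csharp
using System;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Second sample from the question: EmploymentInfo is absent.
        const string xml = @"<Employee>
  <PersonalInfo>
    <FirstName>Vamsi</FirstName>
    <LastName>Krishna</LastName>
  </PersonalInfo>
</Employee>";

        XElement employee = XElement.Parse(xml);

        // Element() returns null when the child is missing,
        // and ?. stops the chain instead of throwing.
        XElement firstNameElement = employee.Element("PersonalInfo")?.Element("FirstName");
        XElement department = employee.Element("EmploymentInfo")?.Element("Department");
        XElement idElement = department?.Element("Id");

        // The explicit conversions from XElement to string and to int?
        // return null for a null element rather than throwing.
        string firstName = (string)firstNameElement;
        int? departmentId = (int?)idElement;

        Console.WriteLine(firstName);                        // Vamsi
        Console.WriteLine(departmentId.HasValue
            ? departmentId.Value.ToString()
            : "(no department)");                            // (no department)
    }
}
```

Nullable values such as `int?` map naturally onto nullable columns when you later insert the data into SQL Server, so optional nodes need no special-casing beyond the null checks.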
BTW, both of your samples are pretty easy for the XmlSerializer. In the second case it will just leave EmploymentInfo null.
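Something like the following sketch shows what I mean. The classes are my guess at a model for your samples (the class and property names simply mirror the element names, and Id is assumed to be an integer), so adjust them to the real schema.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical model classes mirroring the sample XML.
public class Employee
{
    public PersonalInfo PersonalInfo { get; set; }
    public EmploymentInfo EmploymentInfo { get; set; }  // stays null if the element is absent
}

public class PersonalInfo
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class EmploymentInfo
{
    public Department Department { get; set; }
}

public class Department
{
    public int Id { get; set; }
    public string Position { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Second sample from the question, without EmploymentInfo.
        const string xml =
            "<Employee><PersonalInfo><FirstName>Vamsi</FirstName>" +
            "<LastName>Krishna</LastName></PersonalInfo></Employee>";

        var serializer = new XmlSerializer(typeof(Employee));
        using (var reader = new StringReader(xml))
        {
            var employee = (Employee)serializer.Deserialize(reader);
            Console.WriteLine(employee.PersonalInfo.FirstName);    // Vamsi
            Console.WriteLine(employee.EmploymentInfo == null);    // True
        }
    }
}
```

The same classes deserialize the first sample too; in that case EmploymentInfo.Department is populated with Id 101 and Position SD.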