I am attempting to write a local set of wrapper classes around our institution's API (I work at a post-secondary institution). The purpose of these classes is to securely pull transcripts from a remote service, and to abstract away how that service works from our programmers. How the service works is confidential; however, the question I need an answer to is this:
How do I deal with the fact that each transcript response comes in a different XML format depending on which school it comes from? There are over 30.
As an example: Institution A has the GPA of a student in a tag at the top of the document, near the root, as <GPA>4.0</GPA>, whereas another institution might have it in a completely different part of the XML, near the bottom and perhaps 3 children deep, and name the tag <GradePointAverage>4.0</GradePointAverage>.
Any suggestions on how to deal with this lack of standardization?
It sounds like you should aim for one common data model, and then 30 different classes which are able to deserialize from XML to that data model. Depending on exactly how different they are, there may be significant aspects of reuse, and you may even be able to parameterize some differences. Using LINQ to XML makes it reasonably easy to parse any one format.
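As a rough illustration of that shape, here is a minimal sketch using LINQ to XML. The names Transcript, ITranscriptParser, InstitutionAParser and InstitutionBParser are hypothetical, as are the root element name and the assumption that the GPA is the only field of interest; a real data model would carry far more.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Common data model shared by every school (hypothetical, GPA only).
public sealed class Transcript
{
    public decimal Gpa { get; set; }
}

// One implementation of this interface per school format.
public interface ITranscriptParser
{
    Transcript Parse(XDocument document);
}

// Institution A: <GPA> is assumed to be a direct child of the root.
public sealed class InstitutionAParser : ITranscriptParser
{
    public Transcript Parse(XDocument document)
    {
        XElement gpa = document.Root?.Element("GPA")
            ?? throw new FormatException("Missing <GPA> element.");
        return new Transcript
        {
            Gpa = decimal.Parse(gpa.Value, CultureInfo.InvariantCulture)
        };
    }
}

// Institution B: <GradePointAverage> is nested at some depth,
// so this sketch searches all descendants rather than a fixed path.
public sealed class InstitutionBParser : ITranscriptParser
{
    public Transcript Parse(XDocument document)
    {
        XElement gpa = document.Descendants("GradePointAverage").FirstOrDefault()
            ?? throw new FormatException("Missing <GradePointAverage> element.");
        return new Transcript
        {
            Gpa = decimal.Parse(gpa.Value, CultureInfo.InvariantCulture)
        };
    }
}
```

Callers then only ever see Transcript; which parser to use can be chosen by a lookup keyed on the school. If many schools differ only in element names, a single parser that takes the names as constructor arguments may cover several of them.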
I would aim for lots of simple code rather than a small amount of “clever” code: parsing each individual format should be reasonably straightforward, and hopefully easy to test. Yes, it’ll be tedious to write this code, but it should end up being easy to follow, and easy to add more formats if you need to.
You could use XSLT to transform each format into a single one, of course, but personally I’d rather write C# 🙂
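For comparison, the XSLT route could look something like the sketch below, which runs one stylesheet per school through XslCompiledTransform. The stylesheet, the TranscriptNormalizer class and the target <Transcript>/<GPA> layout are all made up for illustration; each school would need its own stylesheet.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TranscriptNormalizer
{
    // Hypothetical stylesheet for one school: copies the value of
    // <GradePointAverage>, wherever it is, into a top-level <GPA>.
    private const string InstitutionBStylesheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/'>
    <Transcript>
      <GPA><xsl:value-of select='//GradePointAverage'/></GPA>
    </Transcript>
  </xsl:template>
</xsl:stylesheet>";

    // Transforms one school's XML into the common (assumed) format.
    public static string Normalize(string sourceXml)
    {
        var transform = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(InstitutionBStylesheet)))
        {
            transform.Load(stylesheetReader);
        }

        using (var input = XmlReader.Create(new StringReader(sourceXml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

The trade-off is that the per-school logic moves out of C# and into stylesheets, which some teams find harder to test and debug.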
This assumes you can create a common data model. If the formats are very different, you may find that you can’t accurately represent the data in each file without falling back to a horrible lowest-common-denominator model. Coming up with a good data model is likely to be as hard as writing each individual parser.