I’m wondering what the best practices are for storing a relational data structure in XML. Particularly, I am wondering about best practices for enforcing node order. For example, say I have three objects: School, Course, and Student, which are defined as follows:
    class School
    {
        List<Course> Courses;
        List<Student> Students;
    }

    class Course
    {
        string Number;
        string Description;
    }

    class Student
    {
        string Name;
        List<Course> EnrolledIn;
    }
I would store such a data structure in XML like so:
    <School>
      <Courses>
        <Course Number='ENGL 101' Description='English I' />
        <Course Number='CHEM 102' Description='General Inorganic Chemistry' />
        <Course Number='MATH 103' Description='Trigonometry' />
      </Courses>
      <Students>
        <Student Name='Jack'>
          <EnrolledIn>
            <Course Number='CHEM 102' />
            <Course Number='MATH 103' />
          </EnrolledIn>
        </Student>
        <Student Name='Jill'>
          <EnrolledIn>
            <Course Number='ENGL 101' />
            <Course Number='MATH 103' />
          </EnrolledIn>
        </Student>
      </Students>
    </School>
With the XML ordered this way, I can parse Courses first. Then, when I parse Students, I can look up each Course listed in EnrolledIn (by its Number) in the School.Courses list. This gives me an object reference to add to the EnrolledIn list in Student. If Students comes before Courses, however, that lookup cannot return an object reference, since School.Courses has not yet been populated.
So what are the best practices for storing relational data in XML?
- Should I enforce that Courses must always come before Students?
- Should I tolerate any ordering and create a stub Course object whenever I encounter one I have not yet seen? (To be expanded when the definition of the Course is eventually reached later.)
- Is there some other way I should be persisting/loading my objects to/from XML? (I am currently implementing Save and Load methods on all my business objects and doing all this manually using System.Xml.XmlDocument and its associated classes.)
I am used to working with relational data out of SQL, but this is my first experience trying to store a non-trivial relational data structure in XML. Any advice you can provide as to how I should proceed would be greatly appreciated.
Don’t think in SQL or relational terms when working with XML. Once the document is loaded, you are not constrained by the order in which its elements appear.
You can, however, use XPath to query any portion of the XML document at any time. If you want the courses first, select ‘//Courses/Course’. If you want the student enrollments next, select ‘//Students/Student/EnrolledIn/Course’.
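As a rough illustration of the two queries above, here is a minimal sketch using System.Xml.XmlDocument. The SchoolLoader class, its Load method, and the byNumber dictionary are hypothetical names introduced for this example, and the business classes are given public fields only so the sketch is self-contained. It assumes every Number referenced under EnrolledIn has a matching definition under Courses.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class Course
{
    public string Number = "";
    public string Description = "";
}

public class Student
{
    public string Name = "";
    public List<Course> EnrolledIn = new List<Course>();
}

public class School
{
    public List<Course> Courses = new List<Course>();
    public List<Student> Students = new List<Student>();
}

// Hypothetical helper, not part of any library.
public static class SchoolLoader
{
    public static School Load(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var school = new School();
        var byNumber = new Dictionary<string, Course>();

        // Pass 1: select the course definitions, wherever <Courses> sits in the file.
        foreach (XmlElement e in doc.SelectNodes("//Courses/Course"))
        {
            var course = new Course
            {
                Number = e.GetAttribute("Number"),
                Description = e.GetAttribute("Description")
            };
            school.Courses.Add(course);
            byNumber[course.Number] = course;
        }

        // Pass 2: select the students and resolve each enrollment by Number.
        foreach (XmlElement s in doc.SelectNodes("//Students/Student"))
        {
            var student = new Student { Name = s.GetAttribute("Name") };
            foreach (XmlElement c in s.SelectNodes("EnrolledIn/Course"))
            {
                // Assumption: the Number is defined; otherwise this throws KeyNotFoundException.
                student.EnrolledIn.Add(byNumber[c.GetAttribute("Number")]);
            }
            school.Students.Add(student);
        }

        return school;
    }
}
```

Because the whole document is in memory before either query runs, calling SchoolLoader.Load on a file where Students precedes Courses should produce the same object graph, with each EnrolledIn entry pointing at the same Course instance held in School.Courses.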
The bottom line: just because XML is stored in a file, don’t get caught thinking all your accesses have to be serial.
I posted a separate question, ‘Can XPath do a foreign key lookup across two subtrees of an XML?’, in order to clarify my position. The solution shows how you can use XPath to make relational queries against XML data.
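For readers who want a concrete picture, the following is a sketch of one way such a lookup can be written. It is not necessarily the solution given in that question. It relies on the XPath 1.0 rule that comparing a value with a node-set is true if any node in the set matches, which lets a single expression act like a join. EnrollmentQuery and DescriptionsFor are hypothetical names, and the sketch assumes the student name contains no single quote.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class EnrollmentQuery
{
    // Returns the Description of every course the named student is enrolled in.
    public static List<string> DescriptionsFor(XmlDocument doc, string studentName)
    {
        // Assumption: studentName contains no single quote, so it can be
        // embedded directly in an XPath string literal.
        string xpath =
            "/School/Courses/Course[@Number = " +
            "/School/Students/Student[@Name='" + studentName + "']" +
            "/EnrolledIn/Course/@Number]";

        var descriptions = new List<string>();

        // The predicate keeps each course definition whose Number equals
        // any Number listed under the student's EnrolledIn element.
        foreach (XmlElement course in doc.SelectNodes(xpath))
        {
            descriptions.Add(course.GetAttribute("Description"));
        }

        return descriptions;
    }
}
```

Against the sample document from the question, DescriptionsFor(doc, "Jack") should select the definitions for CHEM 102 and MATH 103, without building any intermediate lookup table in code.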