I’d like to query XML for metadata about itself, to help determine its structure. I have a 49 MB XML file, and all I need is the list of every attribute and child tag that occurs in it, plus some basic information about each. Can I query this from the XML itself, or do I have to laboriously go through it and find each element and attribute that can exist in it? There is no schema definition available.
Given some random XML like the following:
DECLARE @x xml
SET @x =
'<People>
  <Person age="35">
    <Name>Pete</Name>
    <Phone>
      <Mobile>555-555-1234</Mobile>
      <Home>555-555-0001</Home>
    </Phone>
  </Person>
  <Person age="40" height="70 inches">
    <Name>Paul</Name>
    <Phone>
      <Mobile>555-555-4567</Mobile>
    </Phone>
  </Person>
  <Person age="24">
    <Name>Susan</Name>
    <Phone>
      <Home>555-555-2323</Home>
    </Phone>
  </Person>
</People>'
How would I query this to return something like the following? I don’t need a single recordset (though that would of course be nice). I would be quite content with having to query repeatedly to get different parts. I might have to see there’s a root People tag first, then query People and see the Person tag, then finally see the Name and Phone tags under that, and so on.
People maxcount=1
People.Person maxcount=3 [age maxlen=2 maxcount=3] [height maxlen=9 maxcount=1]
Person.Name textnode maxcount=1 maxlen=5
Person.Phone maxcount=1
Person.Phone.Mobile textnode maxcount=1 maxlen=12
Person.Phone.Home textnode maxcount=1 maxlen=12
This type of profiling is probably best done in ordinary program code rather than in a query. Just because the XML may be stored in a database doesn’t mean the analysis of the XML has to be done there.
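As a rough illustration of that approach, here is a minimal Python sketch, assuming the XML has been exported to a file (or is otherwise readable as bytes). It streams the document with the standard library's `xml.etree.ElementTree.iterparse` and records, for each dotted element path, the largest number of times the element appears under a single parent, the longest text value, and per-attribute counts and lengths. The function name `profile_xml` and the shape of the result dictionary are made up for this example, and namespaced tags would show up in ElementTree's `{uri}name` form.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def profile_xml(source):
    """Profile an XML document given a file path or a binary file-like object.

    Returns {dotted_path: {"max_count", "text_max_len", "attrs"}} where
    "attrs" maps attribute name -> {"count", "max_len"}.
    (Hypothetical helper, not part of any library.)
    """
    stats = {}
    path = []          # tag names from the root down to the current element
    child_counts = []  # one Counter per open element: children seen, by tag

    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            entry = stats.setdefault(
                ".".join(path),
                {"max_count": 0, "text_max_len": 0, "attrs": {}},
            )
            if child_counts:
                # Count this tag among the siblings of the current parent.
                siblings = child_counts[-1]
                siblings[elem.tag] += 1
                entry["max_count"] = max(entry["max_count"], siblings[elem.tag])
            else:
                entry["max_count"] = 1  # the root element occurs once
            # Attributes are already available on the "start" event.
            for name, value in elem.attrib.items():
                attr = entry["attrs"].setdefault(name, {"count": 0, "max_len": 0})
                attr["count"] += 1
                attr["max_len"] = max(attr["max_len"], len(value))
            child_counts.append(Counter())
        else:
            # Text is only guaranteed to be complete on the "end" event.
            entry = stats[".".join(path)]
            text = (elem.text or "").strip()
            entry["text_max_len"] = max(entry["text_max_len"], len(text))
            path.pop()
            child_counts.pop()
            elem.clear()  # drop the element's content to limit memory use

    return stats


if __name__ == "__main__":
    sample = b"""<People>
      <Person age="35"><Name>Pete</Name></Person>
      <Person age="40" height="70 inches"><Name>Paul</Name></Person>
    </People>"""
    for dotted_path, info in profile_xml(io.BytesIO(sample)).items():
        print(dotted_path, info)
```

For the real file, `profile_xml("people.xml")` (with a placeholder file name) would work the same way. The sketch only looks at the text before an element's first child, so mixed content is not fully measured, and it keeps emptied elements attached to their parents, which should be fine at 49 MB but may need tightening for much larger inputs.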