I am trying to parse some XML that looks similar to this:
<document>
<headings>
Important heading stuff.
</headings>
<startGroup group="1" />
<startItem value="1" />Item one stuff<endItem />
<blockofdata>
<startItem value="2" />Item two stuff<endItem />
<startItem value="3" />Item three stuff<endItem />
</blockofdata>
<startItem value="4" />Item four stuff<endItem />
<endGroup />
<startGroup group="2" />
<startItem value="1" />Item one stuff<endItem />
<startItem value="2" />Item two stuff<endItem />
<startItem value="3" />Item three stuff<endItem />
<endGroup />
</document>
I cannot figure out a LINQ to XML statement to get what I want. I need to flatten the structure. So, assuming the above XML, I would like to get a list of this POCO:
class Items
{
    public int GroupNumber { get; set; } // group attribute of startGroup
    public int ItemNumber { get; set; }  // value attribute of startItem
    public string ItemText { get; set; } // text between startItem and endItem
}
How do you write a LINQ to XML statement that pulls this data into the above class, grabbing the group number for everything between startGroup/endGroup and the text between each startItem/endItem pair? I have burned several hours on this and am about to switch to an XML stream reader and parse it the old-fashioned way.
The key here is to use the ElementsAfterSelf() and NodesAfterSelf() methods to grab the sibling nodes, along with the TakeWhile() predicate to stop enumerating at the appropriate times.

First the helper methods:
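Something along these lines should work as a minimal sketch. The names SiblingHelpers, ElementsUntil and TextUntil are illustrative only and are not part of LINQ to XML; the calls they wrap (ElementsAfterSelf(), NodesAfterSelf(), TakeWhile()) are the standard ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class; the method names are made up for this sketch.
public static class SiblingHelpers
{
    // Sibling elements that follow `start`, stopping before the first
    // sibling element named `endName` (e.g. "endGroup").
    public static IEnumerable<XElement> ElementsUntil(this XElement start, XName endName) =>
        start.ElementsAfterSelf().TakeWhile(e => e.Name != endName);

    // Text of the sibling nodes that follow `start`, stopping at the first
    // sibling element named `endName` (e.g. "endItem").
    public static string TextUntil(this XElement start, XName endName) =>
        string.Concat(
            start.NodesAfterSelf()
                 // Non-element nodes yield a null name, so they never stop the walk.
                 .TakeWhile(n => (n as XElement)?.Name != endName)
                 .OfType<XText>()
                 .Select(t => t.Value)).Trim();
}
```

With these, startGroup.ElementsUntil("endGroup") gives the elements belonging to one group, and startItem.TextUntil("endItem") gives the text of one item.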
And the magic query.
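A sketch of such a query, written so that it stands on its own: it inlines the TakeWhile() calls rather than relying on separate helpers. The Items class mirrors the POCO from the question; Flattener and Flatten are illustrative names. It assumes the startGroup markers are direct children of the root, while startItem markers may be nested inside wrappers such as blockofdata.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Mirrors the POCO from the question.
public class Items
{
    public int GroupNumber { get; set; }
    public int ItemNumber { get; set; }
    public string ItemText { get; set; }
}

// Hypothetical wrapper for the query.
public static class Flattener
{
    public static List<Items> Flatten(XDocument doc) =>
        (from startGroup in doc.Root.Elements("startGroup")
         // Siblings of the group marker, up to but not including <endGroup />.
         from sibling in startGroup.ElementsAfterSelf()
                                   .TakeWhile(e => e.Name != "endGroup")
         // DescendantsAndSelf also picks up items nested in <blockofdata>.
         from startItem in sibling.DescendantsAndSelf("startItem")
         select new Items
         {
             GroupNumber = (int)startGroup.Attribute("group"),
             ItemNumber = (int)startItem.Attribute("value"),
             // Text nodes after the item marker, up to its <endItem />.
             ItemText = string.Concat(
                 startItem.NodesAfterSelf()
                          .TakeWhile(n => (n as XElement)?.Name != "endItem")
                          .OfType<XText>()
                          .Select(t => t.Value)).Trim()
         }).ToList();
}
```

Calling Flattener.Flatten(XDocument.Parse(xml)) on the sample document should give one Items entry per startItem, in document order, each tagged with the group it falls under.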
Now I hope you did not design this XML yourself… this is the kind of stuff that can really push someone over the edge. 😉