I was reading a piece of code from the “XStreamingReader” library (which seems like a really cool solution for executing LINQ queries over XML documents without loading the whole document into memory, as an XDocument object would) and was wondering about the following:
    public IEnumerable<XElement> Elements()
    {
        using (var reader = readerFactory())
        {
            reader.MoveToContent();
            MoveToNextElement(reader);
            while (!reader.EOF)
            {
                yield return XElement.Load(reader.ReadSubtree());
                MoveToNextFollowing(reader);
            }
        }
    }

    public IEnumerable<XElement> Elements(XName name)
    {
        return Elements().Where(x => x.Name == name);
    }
Regarding the second method, Elements(XName): the method first calls Elements() and then uses Where() to filter its results, but I’m intrigued by the order of execution here, since Elements() contains a yield statement.
From what I understand:
– Executing Elements() returns an IEnumerable collection; this collection does not physically contain any items YET.
– Where() is executed on that collection; behind the scenes there’s a loop that iterates through every item, and new items are “loaded” on the fly, since yield is being used.
– All items that match the Where() condition are returned as an IEnumerable collection, and are PHYSICALLY IN that collection.
First, am I correct with the above assumptions?
Second, in case I’m right: what if I wanted to return a “yielded” collection rather than a collection that is physically filled with all the filtered data?
I’m asking because that would defeat the entire purpose of NOT reading an entire “matching” block into memory and instead iterating one matching element at a time…
I assume that when you say items are physically in a collection, you mean there is a structure in memory that contains all the items right now. With Where(), that’s not the case: it uses yield too internally (or something that acts the same as yield). When you try to fetch the first item, Where() iterates the source collection until it finds the first item that matches. So the elements are streamed in both Elements() and Elements(XName), and the whole collection is never in memory at once, only one piece at a time.
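To see the order of execution for yourself, here is a minimal sketch. Note that Numbers() and the Log list are hypothetical stand-ins I made up for illustration (Numbers() plays the role of Elements()); they are not part of XStreamingReader. The log shows that nothing is produced when the query is built, and that only as many source items are produced as are needed to reach the first match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Records the order in which things happen (hypothetical helper).
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for Elements(): yields items one at a time.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 5; i++)
        {
            Log.Add("produced " + i);
            yield return i;
        }
    }

    public static void Main()
    {
        // Nothing in Numbers() runs yet: Where() only wraps the source sequence.
        IEnumerable<int> evens = Numbers().Where(n => n % 2 == 0);
        Log.Add("query built");

        foreach (int n in evens)
        {
            Log.Add("consumed " + n);
            break; // stop after the first match
        }

        Console.WriteLine(string.Join(", ", Log));
        // prints: query built, produced 1, produced 2, consumed 2
    }
}
```

Items 3, 4 and 5 are never produced, because the loop stops after the first match. You would only get everything in memory at once if you explicitly materialized the sequence, for example with ToList() or ToArray().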