Provided items is the result of a LINQ expression:
    var items = from item in ItemsSource.RetrieveItems()
                where ...
Suppose generating each item takes some non-negligible time.
Two modes of operation are possible:
- Using foreach would let us start working with the items at the beginning of the collection much sooner than those at the end become available. However, if we later wanted to process the same collection again, we would have to save a copy:

      var storedItems = new List<Item>();
      foreach (var item in items)
      {
          Process(item);
          storedItems.Add(item);
      }

      // Later
      foreach (var item in storedItems)
      {
          ProcessMore(item);
      }

  Because if we had just written foreach(... in items) a second time, then ItemsSource.RetrieveItems() would get called again.
- We could use .ToList() right up front, but that would force us to wait for the last item to be retrieved before we could start processing the first one.
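To make the trade-off concrete, here is a minimal sketch of why iterating the query twice repeats the work. RetrieveItems below is a hypothetical stand-in for ItemsSource.RetrieveItems(), and the RetrieveCalls counter exists only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Illustration only: counts how many times the source starts producing items.
    public static int RetrieveCalls;

    // Hypothetical stand-in for ItemsSource.RetrieveItems().
    public static IEnumerable<int> RetrieveItems()
    {
        RetrieveCalls++;                 // the iterator body restarts for every new enumeration
        for (var i = 1; i <= 3; i++)
            yield return i;              // imagine each item being slow to produce
    }

    public static void Main()
    {
        var items = from item in RetrieveItems()
                    where item % 2 == 1
                    select item;

        foreach (var item in items) { }  // first pass runs the source
        foreach (var item in items) { }  // second pass runs the source again
        Console.WriteLine(RetrieveCalls); // prints 2

        var stored = items.ToList();     // runs the source a third time, then keeps the results
        foreach (var item in stored) { } // iterates the list, the source is not touched
        Console.WriteLine(RetrieveCalls); // prints 3
    }
}
```

The query object itself holds no results, so every foreach over items pays the generation cost again, while the list only pays it once but all up front.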
Question: Is there an IEnumerable implementation that would iterate the first time like a regular LINQ query result, but would materialize in the process, so that a second foreach would iterate over the stored values?
A fun challenge, so I had to provide my own solution. So fun, in fact, that my solution is now at version 3. Version 2 was a simplification I made based on feedback from Servy. I then realized that my solution had a huge drawback: if the first enumeration of the cached enumerable didn’t complete, no caching would be done. Many LINQ extensions like First and Take will only enumerate enough of the enumerable to get the job done, and I had to update to version 3 to make this work with caching.

The question is about subsequent enumerations of the enumerable, which does not involve concurrent access. Nevertheless, I have decided to make my solution thread safe. It adds some complexity and a bit of overhead but should allow the solution to be used in all scenarios.
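The idea described above could be sketched roughly as follows. This is not the answer's actual code: the names CachedEnumerable, Cached, TryGet and FetchNext are all hypothetical, and the sketch simply assumes one shared source enumerator guarded by a lock, with every reader serving items from a shared list and pulling from the source only when it runs past the end of that list.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: caches items as they are pulled, even across partial enumerations.
public sealed class CachedEnumerable<T> : IEnumerable<T>
{
    private readonly object gate = new object();
    private readonly List<T> cache = new List<T>();
    private readonly IEnumerable<T> source;
    private IEnumerator<T> enumerator;   // created lazily, shared by all readers
    private bool exhausted;

    public CachedEnumerable(IEnumerable<T> source)
    {
        this.source = source ?? throw new ArgumentNullException(nameof(source));
    }

    public IEnumerator<T> GetEnumerator()
    {
        var index = 0;
        while (TryGet(index++, out var item))
            yield return item;           // yield happens outside the lock
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private bool TryGet(int index, out T item)
    {
        lock (gate)
        {
            if (index >= cache.Count && !FetchNext())
            {
                item = default(T);
                return false;
            }
            item = cache[index];
            return true;
        }
    }

    // Caller must hold the lock. Pulls one more item from the source into the cache.
    private bool FetchNext()
    {
        if (exhausted) return false;
        if (enumerator == null) enumerator = source.GetEnumerator();
        if (enumerator.MoveNext())
        {
            cache.Add(enumerator.Current);
            return true;
        }
        exhausted = true;
        enumerator.Dispose();            // only disposed once the source is fully consumed
        return false;
    }
}

public static class CachingExtensions
{
    // Hypothetical extension name.
    public static IEnumerable<T> Cached<T>(this IEnumerable<T> source)
        => new CachedEnumerable<T>(source);
}

public static class Program
{
    public static int Produced;          // illustration only: items generated by the source

    private static IEnumerable<int> Slow()
    {
        for (var i = 1; i <= 5; i++) { Produced++; yield return i; }
    }

    public static void Main()
    {
        var cachedSequence = Slow().Cached();
        var firstTwo = cachedSequence.Take(2).ToList(); // pulls only 2 items from the source
        Console.WriteLine(Produced);                     // prints 2
        var all = cachedSequence.ToList();               // 2 items from the cache, 3 more pulled
        Console.WriteLine(Produced);                     // prints 5
        var sum = cachedSequence.Sum();                  // served entirely from the cache
        Console.WriteLine(Produced);                     // prints 5
    }
}
```

Holding the lock only inside TryGet keeps a slow consumer from blocking other readers while it processes an item, at the cost of taking the lock once per item. In this sketch the source enumerator is disposed only after the source has been read to the end.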
The extension is used like this (sequence is an IEnumerable<T>):

There is a slight leak if only part of the enumerable is enumerated (e.g. cachedSequence.Take(2).ToList()). The enumerator that is used by ToList will be disposed, but the underlying source enumerator is not disposed. This is because the first 2 items are cached and the source enumerator is kept alive in case subsequent items are requested. In that case the source enumerator is only cleaned up when it becomes eligible for garbage collection (which will be at the same time as the possibly large cache).