Is there a significant complexity difference between these two implementations, or does the compiler optimize it anyway?
Usage:
for (int i = 0; i < int.MaxValue; i++)
{
    foreach (var item in GoodItems)
    {
        if (DoSomethingBad(item))
            break; // this is later added.
    }
}
Implementation (1):
public IEnumerable<T> GoodItems
{
    get { return _list.Where(x => x.IsGood); }
}
Implementation (2):
public IEnumerable<T> GoodItems
{
    get { foreach (var item in _list.Where(x => x.IsGood)) yield return item; }
}
Does this mean IEnumerable properties should always be implemented as in (2)? When is one better than the other?
Internally, the first version gets compiled down to something that looks like this:
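As a rough hand-written equivalent (a sketch, not actual compiler output), the getter boils down to building a delegate and making a static call. The `Item` and `DirectContainer` types and the `Add` helper are hypothetical stand-ins for the question's `T` and containing class; the real compiler typically also caches the delegate instead of allocating it on every call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for T.
public class Item
{
    public bool IsGood { get; set; }
}

// Hypothetical containing class for implementation (1).
public class DirectContainer
{
    private readonly List<Item> _list = new List<Item>();

    public void Add(Item item) { _list.Add(item); }

    public IEnumerable<Item> GoodItems
    {
        get
        {
            // The lambda becomes a delegate.
            Func<Item, bool> predicate = x => x.IsGood;
            // The extension-method call becomes a plain static call.
            // The returned object is LINQ's own lazy iterator; no extra wrapper is created.
            return Enumerable.Where(_list, predicate);
        }
    }
}
```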
Whereas the second one will now look something like:
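A simplified hand-written approximation of the iterator class that `yield return` makes the compiler generate (a sketch only: the real generated class has different names, more states, and thread checks). `Item`, `WrappedContainer`, `Add`, and `GoodItemsIterator` are hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public bool IsGood { get; set; }
}

public class WrappedContainer
{
    private readonly List<Item> _list = new List<Item>();

    public void Add(Item item) { _list.Add(item); }

    // Implementation (2): the getter only allocates the wrapper; no query exists yet.
    public IEnumerable<Item> GoodItems
    {
        get { return new GoodItemsIterator(this); }
    }

    // Stand-in for the compiler-generated state machine.
    private sealed class GoodItemsIterator : IEnumerable<Item>, IEnumerator<Item>
    {
        private readonly WrappedContainer _owner;
        private IEnumerator<Item> _inner; // enumerator over the Where(...) query
        private Item _current;
        private int _state;               // 0 = not started, 1 = running, -1 = finished

        public GoodItemsIterator(WrappedContainer owner) { _owner = owner; }

        // Simplification: always hand out a fresh instance per enumeration.
        public IEnumerator<Item> GetEnumerator() { return new GoodItemsIterator(_owner); }
        IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

        public Item Current { get { return _current; } }
        object IEnumerator.Current { get { return _current; } }

        public bool MoveNext()
        {
            if (_state == 0)
            {
                // The inner Where query is only created on the first MoveNext.
                _inner = _owner._list.Where(x => x.IsGood).GetEnumerator();
                _state = 1;
            }
            if (_state == 1)
            {
                // Every element passes through two MoveNext/Current pairs.
                if (_inner.MoveNext()) { _current = _inner.Current; return true; }
                Dispose();
            }
            return false;
        }

        public void Reset() { throw new NotSupportedException(); }

        public void Dispose()
        {
            _state = -1;
            if (_inner != null) { _inner.Dispose(); _inner = null; }
        }
    }
}
```

The extra object and the extra `MoveNext`/`Current` hop per element are the overhead that (2) adds on top of (1).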
The Where clause in LINQ is implemented with deferred execution, so there’s no need to apply the foreach (...) yield return ... pattern. You’re making more work for yourself, and potentially for the runtime.
I don’t know if the second version gets jitted to the same thing as the first. Semantically, the two are distinct in that the first does a single round of deferred execution while the second does two rounds. On those grounds I’d argue that the second would be more complex.
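A small sketch of what deferred execution means for Where, assuming a hypothetical `DeferredDemo` class with a counter that records how often the predicate runs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts predicate invocations so the laziness is observable.
    public static int PredicateCalls;

    public static IEnumerable<int> Evens(List<int> source)
    {
        // No yield needed: Where already returns a lazy sequence.
        return source.Where(x => { PredicateCalls++; return x % 2 == 0; });
    }

    public static void Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        var query = Evens(numbers);

        Console.WriteLine(PredicateCalls);          // prints 0: nothing has been filtered yet

        numbers.Add(6);                             // added after the query was built
        Console.WriteLine(string.Join(",", query)); // prints 2,4,6: the filter runs only now
    }
}
```

Because the filter runs only during enumeration, breaking out of a foreach early also stops the filtering early, with or without the extra yield wrapper.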
The real question you need to ask is: When you’re exposing the IEnumerable, what guarantees are you making? Are you saying that you want to simply provide forward iteration? Or are you stating that your interface provides deferred execution?
In the code below, my intent is simply to provide forward enumeration without random access:
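A minimal sketch of such a property, assuming a hypothetical `ForwardOnly` class backed by a `List<int>`; only the property name `Foo` is taken from this discussion.

```csharp
using System.Collections.Generic;

public class ForwardOnly
{
    // Hypothetical backing store; the data already exists in memory.
    private readonly List<int> _foo = new List<int> { 1, 2, 3 };

    // Typed as IEnumerable<int>, so callers get foreach but no indexer,
    // Add, or Count through this property's static type.
    public IEnumerable<int> Foo
    {
        get { return _foo; }
    }
}
```

Nothing is deferred here. Note that a caller could still cast the result back to `List<int>`, so this narrows the interface rather than protecting the list.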
But here, I want to prevent unnecessary computation. I want my expensive computation to be performed only when a result is requested.
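A minimal sketch of this lazy variant, assuming a hypothetical `LazyResults` class whose `ExpensiveCompute` method stands in for the costly work and whose `ComputeCalls` counter exists only to make the laziness visible:

```csharp
using System.Collections.Generic;

public class LazyResults
{
    private readonly List<int> _inputs = new List<int> { 1, 2, 3 };

    // Number of times the expensive step has actually run.
    public int ComputeCalls { get; private set; }

    // Placeholder for a genuinely expensive computation.
    private int ExpensiveCompute(int x)
    {
        ComputeCalls++;
        return x * x;
    }

    // Same outward shape as a plain IEnumerable<int> property, but each
    // result is computed only when the caller asks for the next element.
    public IEnumerable<int> Foo
    {
        get
        {
            foreach (var x in _inputs)
                yield return ExpensiveCompute(x);
        }
    }
}
```

Reading only the first element, for example with `Foo.First()` from System.Linq, runs `ExpensiveCompute` once rather than three times.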
Even though both versions of Foo look identical on the outside, their internal implementation does different things. That’s the part that you need to watch out for. When you use LINQ, you don’t need to worry about deferring execution since most operators do it for you. In your own code, you may wish to go with the first or second depending on your needs.