This is just something that’s been puzzling me ever since I read about iterators on Jon Skeet’s site.
There’s a simple performance optimisation that Microsoft has implemented with their automatic iterators – the returned IEnumerable can be reused as an IEnumerator, saving an object creation. Now because an IEnumerator necessarily needs to track state, this is only valid the first time it’s iterated.
What I cannot understand is why the design team took the approach they did to ensure thread safety.
Normally when I’m in a similar position I’d use what I consider to be a simple Interlocked.CompareExchange – to ensure that only one thread manages to change the state from “available” to “in process”.
Conceptually it’s very simple, a single atomic operation, no extra fields are required etc.
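To make the idea concrete, here is a minimal hand-written sketch of the Interlocked approach I have in mind. It is not what the compiler emits; the class `InterlockedRange` and its fields are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical iterator yielding 0..count-1; it doubles as its own enumerator
// for whichever caller claims it first.
public sealed class InterlockedRange : IEnumerable<int>, IEnumerator<int>
{
    private const int Available = 0;
    private const int InUse = 1;

    private readonly int _count;
    private int _claimed;     // Available until one GetEnumerator call wins it
    private int _index = -1;

    public InterlockedRange(int count) : this(count, Available) { }

    private InterlockedRange(int count, int claimed)
    {
        _count = count;
        _claimed = claimed;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // Atomically flip Available -> InUse. CompareExchange returns the old
        // value, so exactly one caller (on any thread) observes Available.
        if (Interlocked.CompareExchange(ref _claimed, InUse, Available) == Available)
            return this;

        // Everyone else gets a fresh, already-claimed copy.
        return new InterlockedRange(_count, InUse);
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public int Current => _index;
    object IEnumerator.Current => Current;
    public bool MoveNext() => ++_index < _count;
    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

The first call to `GetEnumerator` returns the object itself and every later call allocates, regardless of which thread makes the call.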
But the design team’s approach? Every IEnumerable keeps a field holding the managed thread ID of the creating thread. When GetEnumerator is called, the current thread ID is checked against this field, and only if it’s the same thread, and it’s the first time it’s called, can the IEnumerable return itself as the IEnumerator. It seems harder to reason about, imo.
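For comparison, the thread ID scheme described above looks roughly like this when written by hand. This is only a sketch of the pattern as I understand it, not the actual generated code; `ThreadCheckedRange`, the state constants, and the field names are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical iterator yielding 0..count-1 that reuses itself as the
// enumerator only for the first call made on the creating thread.
public sealed class ThreadCheckedRange : IEnumerable<int>, IEnumerator<int>
{
    private const int NotStarted = -2;  // assumed marker for "never enumerated"
    private const int Running = 0;

    private readonly int _count;
    private readonly int _initialThreadId;  // the extra field in question
    private int _state;
    private int _index = -1;

    public ThreadCheckedRange(int count) : this(count, NotStarted) { }

    private ThreadCheckedRange(int count, int state)
    {
        _count = count;
        _state = state;
        _initialThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // Plain reads and writes, no atomics: only the creating thread can
        // pass the thread ID check, so the write to _state is never contended.
        if (_state == NotStarted &&
            _initialThreadId == Thread.CurrentThread.ManagedThreadId)
        {
            _state = Running;
            return this;
        }

        // Any other thread, or any later call, gets a fresh instance.
        return new ThreadCheckedRange(_count, Running);
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public int Current => _index;
    object IEnumerator.Current => Current;
    public bool MoveNext() => ++_index < _count;
    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

The difference from the Interlocked version is that a call from a different thread always allocates, even if the creating thread never enumerates the object at all.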
I’m just wondering why this approach was taken. Are Interlocked operations far slower than two calls to System.Threading.Thread.CurrentThread.ManagedThreadId, so much so that it justifies the extra field?
Or is there some other reason behind this, perhaps involving memory models or ARM devices or something I’m not seeing? Maybe the spec imposes specific requirements on the implementation of IEnumerable? Just genuinely puzzled.
I can’t answer definitively, but as to your question:
Yes, interlocked operations are much slower than two calls to get the ManagedThreadId. Interlocked operations aren’t cheap because they require multi-CPU systems to synchronize their caches.
From Understanding the Impact of Low-Lock Techniques in Multithreaded Apps:
In Threading in C#, the overhead of an interlocked operation is listed as 10ns, whereas getting the ManagedThreadId should be a normal non-locked read of static data.
Now this is just my speculation, but if you think about the normal use case, it would be to call the function to retrieve the IEnumerable and immediately iterate over it once. So in the standard use case the object is:
So this design brings in no synchronization overhead and sacrifices 4 bytes, which will probably only be in use for a very short period of time.
Of course to prove this you would have to do performance analysis to determine the relative costs and code analysis to prove what the common case was.