I notice that
Console.WriteLine((object) new string(' ', 0) == (object) new string(' ', 0));
prints True, which indicates that the CLR keeps the empty string around and re-uses the same instance. (It prints False for any number other than 0.)
However, the same is not true for arrays:
Console.WriteLine(new int[0] == new int[0]); // False
Now, if we look at the implementation of Enumerable.Empty<T>(), we find that it caches and re-uses empty arrays:
public static IEnumerable<TResult> Empty<TResult>()
{
    return EmptyEnumerable<TResult>.Instance;
}
[...]
public static IEnumerable<TElement> Instance
{
    get
    {
        if (EmptyEnumerable<TElement>.instance == null)
            EmptyEnumerable<TElement>.instance = new TElement[0];
        return EmptyEnumerable<TElement>.instance;
    }
}
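The difference can be observed directly. Here is a minimal sketch; the class and method names (EmptyCachingDemo, EnumerableEmptyIsShared, NewEmptyArrayIsShared) are made up for illustration and are not part of the framework. On the runtimes I am aware of, the first check should report True and the second False, although the exact type returned by Enumerable.Empty<T>() is an implementation detail that may differ between versions.

```csharp
using System;
using System.Linq;

// Hypothetical helper type, for illustration only.
public static class EmptyCachingDemo
{
    // True when two calls to Enumerable.Empty<T>() hand back the same object.
    public static bool EnumerableEmptyIsShared<T>()
    {
        return ReferenceEquals(Enumerable.Empty<T>(), Enumerable.Empty<T>());
    }

    // True when two separately allocated zero-length arrays are the same object.
    public static bool NewEmptyArrayIsShared<T>()
    {
        return ReferenceEquals(new T[0], new T[0]);
    }

    public static void Main()
    {
        Console.WriteLine(EnumerableEmptyIsShared<int>());
        Console.WriteLine(NewEmptyArrayIsShared<int>());
    }
}
```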
So the framework team felt that keeping an empty array around for every type is worth it. The CLR could, if it wanted to, go a small step further and do this natively, so that it applies not only to calls to Enumerable.Empty<T>() but also to new T[0]. If the optimisation in Enumerable.Empty<T>() is worth it, surely this would be worth even more?
Why does the CLR not do this? Is there something I’m missing?
Strings may use interning, which makes them a different story from all other kinds of objects.
Arrays are essentially just objects. Re-using instances where that is not clear from the syntax or context isn’t without side effects or risks.
If some other code locked on another (or so they thought) empty int[], you might have a deadlock that is very hard to find. Other scenarios include using arrays as the key in a Dictionary, or anywhere else their identity matters. The framework can’t just go around changing the rules.
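To make the identity argument concrete, here is a rough sketch of two pieces of code that rely on every new int[0] being a distinct object. All names (ArrayIdentityDemo, CountDistinctKeys, SecondLockIsFree) are hypothetical. Under today's rules the dictionary ends up with two entries and the second lock can be taken by another thread; if the runtime silently handed out one shared empty array, both results would change.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper type, for illustration only.
public static class ArrayIdentityDemo
{
    // Arrays do not override Equals/GetHashCode, so a Dictionary keyed on
    // arrays compares keys by reference: two empty arrays are two keys.
    public static int CountDistinctKeys()
    {
        var a = new int[0];
        var b = new int[0];
        var map = new Dictionary<int[], string>();
        map[a] = "first";
        map[b] = "second";
        return map.Count;
    }

    // One thread holds a lock on lockA while another tries to take lockB.
    // Because the arrays are distinct objects, the second lock is free.
    public static bool SecondLockIsFree()
    {
        var lockA = new int[0];
        var lockB = new int[0];
        bool acquired = false;
        lock (lockA)
        {
            var worker = new Thread(() =>
            {
                acquired = Monitor.TryEnter(lockB);
                if (acquired) Monitor.Exit(lockB);
            });
            worker.Start();
            worker.Join();
        }
        return acquired;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctKeys());
        Console.WriteLine(SecondLockIsFree());
    }
}
```

With a shared instance, the worker thread in SecondLockIsFree would be contending for the very object the first thread already holds, which is exactly the kind of hidden coupling that turns into a deadlock in real code.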