I was running some other experiments when this strange behaviour caught my eye.
The code is compiled as an x64 release build.
If I enter 1, the third run of the List method takes roughly 30% longer than the first two. The output is:
List costs 9312
List costs 9289
Array costs 12730
List costs 11950
If I enter 2, the third run of the Array method takes more than 50% longer than the first two. The output is:
Array costs 8082
Array costs 8086
List costs 11937
Array costs 12698
You can see the pattern. The complete code follows (just compile and run).
(The code shown is the minimum needed to run the test. The actual code I used to get reliable results is more complicated: I wrapped each method and tested it 100+ times after a proper warm-up.)
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class ListArrayLoop
{
    readonly int[] myArray;
    readonly List<int> myList;
    readonly int totalSessions;

    public ListArrayLoop(int loopRange, int totalSessions)
    {
        myArray = new int[loopRange];
        for (int i = 0; i < myArray.Length; i++)
        {
            myArray[i] = i;
        }
        myList = myArray.ToList();
        this.totalSessions = totalSessions;
    }

    // Sums the array totalSessions times via Enumerable.Sum().
    public void ArraySum()
    {
        var pool = myArray;
        long sum = 0;
        for (int j = 0; j < totalSessions; j++)
        {
            sum += pool.Sum();
        }
    }

    // Sums the list totalSessions times via Enumerable.Sum().
    public void ListSum()
    {
        var pool = myList;
        long sum = 0;
        for (int j = 0; j < totalSessions; j++)
        {
            sum += pool.Sum();
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();
        ListArrayLoop test = new ListArrayLoop(10000, 100000);
        string input = Console.ReadLine();
        if (input == "1")
        {
            // List, List, Array, List
            sw.Start();
            test.ListSum();
            sw.Stop();
            Console.WriteLine("List costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ListSum();
            sw.Stop();
            Console.WriteLine("List costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ArraySum();
            sw.Stop();
            Console.WriteLine("Array costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ListSum();
            sw.Stop();
            Console.WriteLine("List costs {0}", sw.ElapsedMilliseconds);
        }
        else
        {
            // Array, Array, List, Array
            sw.Start();
            test.ArraySum();
            sw.Stop();
            Console.WriteLine("Array costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ArraySum();
            sw.Stop();
            Console.WriteLine("Array costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ListSum();
            sw.Stop();
            Console.WriteLine("List costs {0}", sw.ElapsedMilliseconds);
            sw.Reset();
            sw.Start();
            test.ArraySum();
            sw.Stop();
            Console.WriteLine("Array costs {0}", sw.ElapsedMilliseconds);
        }
        Console.ReadKey();
    }
}
Short answer: It is because the CLR has an optimization for dispatching methods called through an interface type. As long as a particular interface method is always called on instances of the same type (one that implements the interface), the CLR uses a fast dispatching routine (only 3 instructions) that just checks the actual type of the instance and, on a match, jumps directly to the precomputed address of the method. But once the same interface method is called on an instance of another type, the CLR switches to a slower dispatching routine (one that can dispatch methods for any actual instance type).
Long answer:
First, take a look at how the method System.Linq.Enumerable.Sum() is declared (I omitted the validity check of the source parameter because it’s not important in this case):
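A sketch of that declaration, written from memory of the .NET Framework implementation rather than copied from it. The class name SumSketch and the method name SumLikeLinq are hypothetical, chosen so the code does not clash with the real Enumerable.Sum:

```csharp
using System.Collections.Generic;

public static class SumSketch
{
    // Approximation of Enumerable.Sum(IEnumerable<int>), without the null check on source.
    public static int SumLikeLinq(this IEnumerable<int> source)
    {
        int sum = 0;
        checked // overflow raises OverflowException instead of wrapping around
        {
            foreach (int v in source)
            {
                sum += v;
            }
        }
        return sum;
    }
}
```

The parameter type is the important part: the method only knows it has an IEnumerable<int>, not which concrete collection is behind it.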
So any type that implements IEnumerable<int> can call this extension method, including int[] and List<int>. The foreach keyword is just shorthand for getting an enumerator via IEnumerable<T>.GetEnumerator() and iterating through all values. So the method actually does this:
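Roughly, the foreach over an IEnumerable<int> expands to the following. This is a hand-written approximation of what the compiler generates, and the names SumExpanded and SumViaEnumerator are hypothetical:

```csharp
using System.Collections.Generic;

public static class SumExpanded
{
    public static int SumViaEnumerator(IEnumerable<int> source)
    {
        int sum = 0;
        // Interface call 1: the concrete enumerator type depends on the source.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            // Interface calls 2 and 3 (MoveNext and Current) run once per element.
            while (e.MoveNext())
            {
                checked { sum += e.Current; }
            }
        }
        return sum;
    }
}
```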
Now you can clearly see that the method body contains three calls on interface-typed variables: GetEnumerator(), MoveNext(), and Current (Current is actually a property, not a method, but reading a property just calls the corresponding getter method).
GetEnumerator() typically creates a new instance of some auxiliary class that implements IEnumerator<T> and is thus able to return all values one by one. It is important to note that for int[] and List<int>, the enumerator types returned by GetEnumerator() are different. If the source argument is of type int[], GetEnumerator() returns an instance of SZGenericArrayEnumerator<int>; if source is of type List<int>, it returns an instance of List<int>+Enumerator<int>.
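You can check this yourself with a small helper like the one below. The names EnumeratorTypes and EnumeratorTypeOf are hypothetical, and the exact type names printed depend on the runtime version, so none are shown here:

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorTypes
{
    // Returns the runtime type of the enumerator obtained through the interface.
    public static Type EnumeratorTypeOf(IEnumerable<int> source)
    {
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            return e.GetType();
        }
    }
}
```

Calling Console.WriteLine(EnumeratorTypes.EnumeratorTypeOf(new[] { 1, 2, 3 })) and then the same with new List<int> { 1, 2, 3 } prints two different type names.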
The two other members (MoveNext() and Current) are called repeatedly in a tight loop, so their speed is crucial for overall performance. Unfortunately, calling a method on an interface-typed variable (such as IEnumerator<int>) is not as straightforward as an ordinary instance method call. The CLR must dynamically find out the actual type of the object in the variable and then find out which of that object’s methods implements the corresponding interface method.
The CLR tries to avoid doing this time-consuming lookup on every call with a little trick. When a particular method (such as MoveNext()) is called for the first time, the CLR finds the actual type of the instance on which the call is made (for example, SZGenericArrayEnumerator<int> if you called Sum on int[]) and finds the address of the method that implements the interface method for this type (that is, the address of SZGenericArrayEnumerator<int>.MoveNext()). It then uses this information to generate an auxiliary dispatching method, which simply checks whether the actual instance type is the same as on the first call (that is, SZGenericArrayEnumerator<int>) and, if it is, jumps directly to the method address found earlier. So on subsequent calls, no complicated method lookup is made as long as the instance type remains the same. But when the call is made on an enumerator of a different type (such as List<int>+Enumerator<int> when calculating the sum of a List<int>), the CLR no longer uses this fast dispatching method. Instead, another (general-purpose) and much slower dispatching method is used.
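To make the idea concrete, here is a purely conceptual model of such a call site in C#. It is not how the CLR is implemented (the real stubs are small pieces of machine code generated by the runtime); the class MonomorphicCallSite, its members, and the slowLookup delegate are all hypothetical and only mirror the behaviour described above:

```csharp
using System;

public sealed class MonomorphicCallSite
{
    private readonly Func<Type, Func<object, bool>> slowLookup; // stands in for the full lookup
    private Type expectedType;                 // type seen on the first call
    private Func<object, bool> cachedTarget;   // stands in for the precomputed address
    private bool generalPurpose;               // true once a second type has been seen

    public int FastCalls { get; private set; }
    public int SlowCalls { get; private set; }

    public MonomorphicCallSite(Func<Type, Func<object, bool>> slowLookup)
    {
        this.slowLookup = slowLookup;
    }

    public bool Invoke(object instance)
    {
        Type actual = instance.GetType();
        if (!generalPurpose)
        {
            if (expectedType == null)
            {
                // First call: do the full lookup once and remember the result.
                expectedType = actual;
                cachedTarget = slowLookup(actual);
                SlowCalls++;
                return cachedTarget(instance);
            }
            if (actual == expectedType)
            {
                // Fast path: one type check, then a direct jump.
                FastCalls++;
                return cachedTarget(instance);
            }
            // A different type showed up: abandon the fast path for this call site.
            generalPurpose = true;
        }
        SlowCalls++;
        return slowLookup(actual)(instance);
    }
}
```

In this model, once generalPurpose is set, even calls on the original type take the slow path, which corresponds to the slowdown seen in the last measurement of each run.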
So as long as Sum() is called only on arrays, the CLR dispatches the calls to GetEnumerator(), MoveNext(), and Current using the fast method. Once Sum() is called on a list as well, the CLR switches to the slower dispatching method, and performance drops.
If performance is your concern, implement your own separate Sum() extension method for every type on which you want to call Sum(). This ensures that the CLR will use the fast dispatching method. For example:
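A minimal sketch of such per-type methods follows. The class name PerTypeSum is hypothetical, and the sketch assumes, as argued above, that each method's interface calls then only ever see one enumerator type:

```csharp
using System.Collections.Generic;

public static class PerTypeSum
{
    // Only ever called with arrays, so its call sites see a single enumerator type.
    public static int Sum(this int[] source)
    {
        IEnumerable<int> items = source; // still enumerated through the interface
        int sum = 0;
        foreach (int v in items)
        {
            checked { sum += v; }
        }
        return sum;
    }

    // Only ever called with lists, so its call sites see a single enumerator type.
    public static int Sum(this List<int> source)
    {
        IEnumerable<int> items = source; // still enumerated through the interface
        int sum = 0;
        foreach (int v in items)
        {
            checked { sum += v; }
        }
        return sum;
    }
}
```

Because int[] and List<int> are more specific parameter types than IEnumerable<int>, a call such as myArray.Sum() should bind to these overloads rather than to Enumerable.Sum() when both are in scope.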
Or even better, avoid the IEnumerable<T> interface altogether (because it still brings noticeable overhead). For example:
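One way to do that is to index into the collection directly, so no enumerator object and no interface call is involved. The class name DirectSum is hypothetical:

```csharp
using System.Collections.Generic;

public static class DirectSum
{
    public static int Sum(this int[] source)
    {
        int sum = 0;
        for (int i = 0; i < source.Length; i++)
        {
            checked { sum += source[i]; }
        }
        return sum;
    }

    public static int Sum(this List<int> source)
    {
        int sum = 0;
        for (int i = 0; i < source.Count; i++)
        {
            checked { sum += source[i]; } // List<int> indexer, a direct (non-interface) call
        }
        return sum;
    }
}
```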
Here are results from my computer: