I was doing some performance metrics and I ran into something that seems quite odd to me. I time the following two functions:
private static void DoOne()
{
List<int> A = new List<int>();
for (int i = 0; i < 200; i++) A.Add(i);
int s=0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < A.Count; c++) s += A[c];
}
}
private static void DoTwo()
{
List<int> A = new List<int>();
for (int i = 0; i < 200; i++) A.Add(i);
IList<int> L = A;
int s = 0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < L.Count; c++) s += L[c];
}
}
Even when compiling in release mode, the timings results were consistently showing that DoTwo takes ~100 longer then DoOne:
DoOne took 0.06171706 seconds.
DoTwo took 8.841709 seconds.
Given the fact the List directly implements IList I was very surprised by the results. Can anyone clarify this behavior?
The gory details
Responding to questions, here is the full code and an image of the project build preferences:
using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics;
using System.Collections;
namespace TimingTests
{
class Program
{
static void Main(string[] args)
{
Stopwatch SW = new Stopwatch();
SW.Start();
DoOne();
SW.Stop();
Console.WriteLine(" DoOne took {0} seconds.", ((float)SW.ElapsedTicks) / Stopwatch.Frequency);
SW.Reset();
SW.Start();
DoTwo();
SW.Stop();
Console.WriteLine(" DoTwo took {0} seconds.", ((float)SW.ElapsedTicks) / Stopwatch.Frequency);
}
private static void DoOne()
{
List<int> A = new List<int>();
for (int i = 0; i < 200; i++) A.Add(i);
int s=0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < A.Count; c++) s += A[c];
}
}
private static void DoTwo()
{
List<int> A = new List<int>();
for (int i = 0; i < 200; i++) A.Add(i);
IList<int> L = A;
int s = 0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < L.Count; c++) s += L[c];
}
}
}
}
Thanks for all the good answers (especially @kentaromiura). I would have closed the question, though I feel we still miss an important part of the puzzle. Why would accessing a class via an interface it implements be so much slower? The only difference I can see is that accessing a function via an Interface implies using virtual tables while the normally the functions can be called directly. To see whether this is the case I made a couple of changes to the above code. First I introduced two almost identical classes:
public class VC
{
virtual public int f() { return 2; }
virtual public int Count { get { return 200; } }
}
public class C
{
public int f() { return 2; }
public int Count { get { return 200; } }
}
As you can see VC is using virtual functions and C doesn’t. Now to DoOne and DoTwo:
private static void DoOne()
{ C a = new C();
int s=0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < a.Count; c++) s += a.f();
}
}
private static void DoTwo()
{
VC a = new VC();
int s = 0;
for (int j = 0; j < 100000; j++)
{
for (int c = 0; c < a.Count; c++) s += a.f();
}
}
And indeed:
DoOne took 0.01287789 seconds.
DoTwo took 8.982396 seconds.
This is even more scary – virtual function calls 800 times slower?? so a couple of question to the community:
- Can you reproduce? (given the
fact that all had worse performance
before, but not as bad as mine) - Can you explain?
- (this may be the
most important) – can you think of
a way to avoid?
Boaz
First I want to thank all for their answers. It was really essential in the path figuring our what was going on. Special thanks goes to @kentaromiura which found the key needed to get to the bottom of things.
The source of the slow down of using List<T> via an IList<T> interface is the lack of the ability of the JIT complier to inline the Item property get function. The use of virtual tables caused by accessing the list through it’s IList interface prevents that from happening.
As a proof , I have written the following code:
and modified the DoOne and DoTwo classes to the following:
Sure enough the function times are now very similar to before:
Now, if you remove the comments before the MethodImpl in the C class (forcing the JIT not to inline) – the timing becomes:
Voila – the methods take almost the same time. You can stil see that method DoOne is still slightly fast , which is consistent which the extra overhead of a virtual function.