This is rather strange: until today, I had thought we should always put the most likely clause first in a nested if-else chain.
Brief setup:
An array Zoo[] contains 10,000 objects of 5 classes, distributed according to weights, e.g. 4,3,2,1,0 (meaning 4000 Cats, 3000 Dogs, 2000 Chickens, 1000 Rabbits, 0 Owls). The array can either be shuffled or left exactly in order.
Then if-else is used to check each array member.
Results: time (ms)

Weights            43210  01234  22222  43210  01234  22222
Shuffle            Yes    Yes    Yes    No     No     No
Polymorphism       101    100    107    26     26     27
If Else            77     28     59     17     16     17
If Else Reverse    28     77     59     16     17     16
Switch             21     21     21     18     19     18
What caught my eye is that, with weights 43210 and shuffling, If Else Reverse is much faster than If Else. Here If Else examines Cat -> Dog -> Chicken -> Rabbit -> Owl, and the reversed version checks them in the opposite order.
Also, could someone explain why every method improves so much in the non-shuffled version? (I would assume it is due to caching or a better hit rate in memory?)
Update
Weights            27 9 3 1 0   0 1 3 9 27   27 9 3 1 0   0 1 3 9 27
Shuffle            Yes          Yes          No           No
Polymorphism       84           82           27           27
If Else            61           20           17           16
If Else Reverse    20           60           16           17
Switch             21           21           18           18
Code follows:
class Animal : AnimalAction
{
    public virtual int Bart { get; private set; }
    public int Type { get; private set; }
    public Animal(int animalType)
    {
        this.Type = animalType;
    }
}
interface AnimalAction
{
    int Bart { get; }
}
class Cat : Animal
{
    public Cat()
        : base(0)
    {
    }
    public override int Bart
    {
        get
        {
            return 0;
        }
    }
}
class Dog : Animal
{
    public Dog()
        : base(1)
    {
    }
    public override int Bart
    {
        get
        {
            return 1;
        }
    }
}
class Chicken : Animal
{
    public Chicken()
        : base(2)
    {
    }
    public override int Bart
    {
        get
        {
            return 2;
        }
    }
}
class Rabbit : Animal
{
    public Rabbit()
        : base(3)
    {
    }
    public override int Bart
    {
        get
        {
            return 3;
        }
    }
}
class Owl : Animal
{
    public Owl()
        : base(4)
    {
    }
    public override int Bart
    {
        get
        {
            return 4;
        }
    }
}
class SingleDispatch
{
    readonly Animal[] Zoo;
    int totalSession;
    SingleDispatch(int totalSession, int zooSize)
    {
        this.totalSession = totalSession;
        Zoo = new Animal[zooSize];
        int[] weights = new int[5] { 0, 1, 2, 3, 4 };
        int totalWeights = weights.Sum();
        int[] tiers = new int[4];
        int accumulated = 0;
        for (int i = 0; i < 4; i++)
        {
            accumulated += weights[i] * zooSize / totalWeights;
            tiers[i] = accumulated;
        }
        for (int i = 0; i < tiers[0]; i++)
        {
            Animal nextAnimal = new Cat();
            Zoo[i] = nextAnimal;
        }
        for (int i = tiers[0]; i < tiers[1]; i++)
        {
            Animal nextAnimal = new Dog();
            Zoo[i] = nextAnimal;
        }
        for (int i = tiers[1]; i < tiers[2]; i++)
        {
            Animal nextAnimal = new Chicken();
            Zoo[i] = nextAnimal;
        }
        for (int i = tiers[2]; i < tiers[3]; i++)
        {
            Animal nextAnimal = new Rabbit();
            Zoo[i] = nextAnimal;
        }
        for (int i = tiers[3]; i < zooSize; i++)
        {
            Animal nextAnimal = new Owl();
            Zoo[i] = nextAnimal;
        }
        Zoo.FisherYatesShuffle();
    }
    public static void Benchmark()
    {
        List<Tuple<string, double>> result = new List<Tuple<string, double>>();
        SingleDispatch myBenchmark = new SingleDispatch(1000, 10000);
        result.Add(TestContainer.RunTests(10, myBenchmark.SubClassPoly));
        result.Add(TestContainer.RunTests(10, myBenchmark.Ifelse));
        result.Add(TestContainer.RunTests(10, myBenchmark.IfelseReverse));
        result.Add(TestContainer.RunTests(10, myBenchmark.Switch));
        foreach (var item in result)
        {
            Console.WriteLine("{0,-30}{1:N0}", item.Item1, item.Item2);
        }
        Console.WriteLine();
    }
    void SubClassPoly()
    {
        long sum = 0;
        for (int i = 0; i < totalSession; i++)
        {
            foreach (var myAnimal in Zoo)
            {
                sum += myAnimal.Bart;
            }
        }
    }
    void Ifelse()
    {
        long sum = 0;
        for (int i = 0; i < totalSession; i++)
        {
            foreach (var myAnimal in Zoo)
            {
                if (myAnimal.Type == 0)
                {
                    sum += 0;
                }
                else if (myAnimal.Type == 1)
                {
                    sum += 1;
                }
                else if (myAnimal.Type == 2)
                {
                    sum += 2;
                }
                else if (myAnimal.Type == 3)
                {
                    sum += 3;
                }
                else
                {
                    sum += 4;
                }
            }
        }
    }
    void IfelseReverse()
    {
        long sum = 0;
        for (int i = 0; i < totalSession; i++)
        {
            foreach (var myAnimal in Zoo)
            {
                if (myAnimal.Type == 4)
                {
                    sum += 4;
                }
                else if (myAnimal.Type == 3)
                {
                    sum += 3;
                }
                else if (myAnimal.Type == 2)
                {
                    sum += 2;
                }
                else if (myAnimal.Type == 1)
                {
                    sum += 1;
                }
                else
                {
                    sum += 0;
                }
            }
        }
    }
    void Switch()
    {
        long sum = 0;
        for (int i = 0; i < totalSession; i++)
        {
            foreach (var myAnimal in Zoo)
            {
                switch (myAnimal.Type)
                {
                    case 0:
                        sum += 0;
                        break;
                    case 1:
                        sum += 1;
                        break;
                    case 2:
                        sum += 2;
                        break;
                    case 3:
                        sum += 3;
                        break;
                    case 4:
                        sum += 4;
                        break;
                    default:
                        break;
                }
            }
        }
    }
}
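The listing above calls Zoo.FisherYatesShuffle() and TestContainer.RunTests(), neither of which is shown. For anyone who wants to compile it, here is a minimal sketch of what those helpers might look like. Both bodies are assumptions: the fixed seed, the warm-up call, and the choice to report the average time in milliseconds are guesses rather than the code that produced the numbers above. The main listing also needs using System.Linq (for Sum) and using System.Collections.Generic (for List).

```csharp
using System;
using System.Diagnostics;

static class ArrayExtensions
{
    // Fixed seed so repeated runs shuffle identically (an assumption, not from the post).
    static readonly Random Rng = new Random(12345);

    // In-place Fisher-Yates shuffle: walk from the end, swapping each slot
    // with a uniformly chosen slot at or before it.
    public static void FisherYatesShuffle<T>(this T[] array)
    {
        for (int i = array.Length - 1; i > 0; i--)
        {
            int j = Rng.Next(i + 1); // 0 <= j <= i
            T tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
    }
}

static class TestContainer
{
    // Hypothetical stand-in: runs the test `repeat` times and returns the
    // method name together with the average wall-clock time in milliseconds.
    public static Tuple<string, double> RunTests(int repeat, Action test)
    {
        test(); // warm-up call so JIT compilation is not included in the timing
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < repeat; i++)
        {
            test();
        }
        watch.Stop();
        return Tuple.Create(test.Method.Name, watch.Elapsed.TotalMilliseconds / repeat);
    }
}
```

To reproduce the "Shuffle: No" columns, the call to Zoo.FisherYatesShuffle() in the constructor would simply be commented out.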
Branch Prediction. http://igoro.com/archive/fast-and-slow-if-statements-branch-prediction-in-modern-processors/
The non-shuffled case is much easier to understand. Assume we have a very simple predictor that guesses that the next result will be the same as the previous result:
e.g. (C = cat, D = dog, O = owl)
animal:     CCCCC DDDDD OOOOO
prediction: *CCCC CDDDD DOOOO
correct:    NYYYY NYYYY NYYYY
As you can see, the predictions are only wrong when the animal changes. So, with a thousand animals of each type, the predictor is right over 99% of the time.
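To put a number on that, here is a small sketch of the "same as last time" guesser described above. It is a toy model of the idea only, not of how a real CPU predicts, and the class and method names are made up for illustration.

```csharp
using System;

static class LastValuePredictor
{
    // Counts how often "same as the previous element" is a correct guess.
    // The first element has no previous value, so it is counted as a miss.
    public static int CountCorrect(int[] types)
    {
        int correct = 0;
        for (int i = 1; i < types.Length; i++)
        {
            if (types[i] == types[i - 1])
            {
                correct++;
            }
        }
        return correct;
    }

    public static void Main()
    {
        // Three groups of 1000 identical animals, in order (types 0, 1, 2).
        int[] sorted = new int[3000];
        for (int i = 0; i < sorted.Length; i++)
        {
            sorted[i] = i / 1000;
        }
        // Misses: the very first animal plus the two group changes.
        Console.WriteLine("{0} of {1} correct", CountCorrect(sorted), sorted.Length); // 2997 of 3000 correct
    }
}
```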
But the predictor doesn’t really work that way.
What is really happening** is that each if branch is being predicted to be true or false.
Assuming a (40%, 30%, 20%, 10%, 0%) distribution like in your example:
if (animal.Type == MostCommonType) // true less than half the time (40%), 40 out of 100 (40+30+20+10+0)
else if (animal.Type == SecondMostCommonType) // true 50% of the time, 30 out of 60 (30+20+10+0)
else if (animal.Type == ThirdMostCommonType) // true 67% of the time, 20 out of 30 (20+10+0)
else if (animal.Type == FourthMostCommonType) // true 100% of the time, 10 out of 10 (10+0)
40%, 50%, and 67% odds don’t give the predictor much to work with, and the only good prediction (100%) is on the least common type and least common code path.
However, if you reverse the if order:
if (animal.Type == FifthMostCommonType) // false 100% of the time, 0 out of 100 (40+30+20+10+0)
else if (animal.Type == FourthMostCommonType) // false 90% of the time, 10 out of 100 (40+30+20+10)
else if (animal.Type == ThirdMostCommonType) // false 78% of the time, 20 out of 90 (40+30+20)
else if (animal.Type == SecondMostCommonType) // false 57% of the time, 30 out of 70 (40+30)
else if (animal.Type == MostCommonType) // true 100% of the time, 40 out of 40 (40)
Most of the comparisons are highly predictable.
Predicting that the next animal will NOT be the least common animal will be correct more often than any other prediction.
In short, the total cost of the missed branch predictions in the original order is higher than the cost of executing more branches (i.e. if statements) in the reversed order.
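The percentages in the two listings above can be recomputed mechanically: for each test in the chain, divide the weight of the type being tested by the total weight of everything that reaches that test. A short sketch of that arithmetic follows; the class and method names are invented for illustration.

```csharp
using System;

static class BranchOdds
{
    // For an if / else-if chain that tests types in the given order, returns
    // the probability that each test is true, given that the test is reached.
    public static double[] TakenProbabilities(int[] weightsInTestOrder)
    {
        int remaining = 0;
        foreach (int w in weightsInTestOrder)
        {
            remaining += w;
        }
        double[] result = new double[weightsInTestOrder.Length];
        for (int i = 0; i < weightsInTestOrder.Length; i++)
        {
            // A test that is never reached (remaining == 0) is reported as 0.
            result[i] = remaining == 0 ? 0.0 : (double)weightsInTestOrder[i] / remaining;
            remaining -= weightsInTestOrder[i];
        }
        return result;
    }

    static void Print(string label, double[] odds)
    {
        Console.Write(label + ":");
        foreach (double p in odds)
        {
            Console.Write(" " + Math.Round(p * 100) + "%");
        }
        Console.WriteLine();
    }

    public static void Main()
    {
        Print("common first", TakenProbabilities(new[] { 40, 30, 20, 10, 0 }));
        Print("rare first", TakenProbabilities(new[] { 0, 10, 20, 30, 40 }));
    }
}
```

With the common type first, the tests are true about 40%, 50%, 67% and 100% of the time. With the rare type first, they are true about 0%, 10%, 22%, 43% and 100% of the time, which is the same as being false 100%, 90%, 78% and 57% of the time for the first four.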
I hope that clears it up a little. Please let me know if any parts are unclear, I’ll try to clarify.
**Well, not exactly, but much closer to the truth.
Edit:
The branch predictor in newer processors is fairly complex; you can see more detail at http://en.wikipedia.org/wiki/Branch_predictor#Static_prediction
Shuffling confounds the predictor by removing the groups of similar data, so that each guess or prediction is no more likely to be correct than chance.
Imagine a brand new deck of cards.
A friend picks up each card and asks you to guess if it is red or black.
At this point a fairly good algorithm would be to guess whatever the last card was. You would guess right nearly every time (> 90%).
After shuffling the deck, however, this algorithm would only give about 50% accuracy.
In fact, no algorithm will give you significantly better than 50%. (As far as I know, counting the number of reds and blacks left is the only way to get an edge in this situation.)
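If you want to try the card game yourself, here is a rough simulation. The new-deck order (four suits of 13 cards, colours alternating by suit), the seed, and all the names are assumptions made for this sketch.

```csharp
using System;

static class CardGuessing
{
    // Counts correct guesses when each guess is "same colour as the previous card".
    // The first card has no predecessor and is counted as a miss.
    public static int CountCorrect(bool[] isRed)
    {
        int correct = 0;
        for (int i = 1; i < isRed.Length; i++)
        {
            if (isRed[i] == isRed[i - 1])
            {
                correct++;
            }
        }
        return correct;
    }

    // Assumed new-deck order: black suit, red suit, black suit, red suit.
    public static bool[] NewDeck()
    {
        bool[] deck = new bool[52];
        for (int i = 0; i < deck.Length; i++)
        {
            deck[i] = (i / 13) % 2 == 1;
        }
        return deck;
    }

    // In-place Fisher-Yates shuffle.
    public static void Shuffle(bool[] deck, Random rng)
    {
        for (int i = deck.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            bool tmp = deck[i];
            deck[i] = deck[j];
            deck[j] = tmp;
        }
    }

    public static void Main()
    {
        bool[] deck = NewDeck();
        // Misses: the first card plus the three colour changes between suits.
        Console.WriteLine("new deck: {0} of 52", CountCorrect(deck)); // new deck: 48 of 52
        Shuffle(deck, new Random(1));
        Console.WriteLine("shuffled: {0} of 52", CountCorrect(deck));
    }
}
```

The shuffled count varies with the seed, but over many shuffles it should hover around half of the guesses.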
Edit: Re subclassing
I would guess that this is because of CPU cache misses (L1, L2, etc.).
Since each class implements the return value as a constant, i.e. return 0, the return value is part of the function. I suspect that if you reimplemented the class as shown below, you would force a cache miss on every call and see the same (bad) performance, shuffled or not.