This code takes timings for n invocations of two methods that both do nothing (MyMethod1, MyMethod2) but are decorated with different attributes (MyAspect1, MyAspect2).
MyAspect1 contains no logic apart from letting MyMethod1 run (Proceed()).
MyAspect2 does the same for MyMethod2, but also creates a Task, supplying an anonymous delegate that should amount to near-zero CPU work (trimming a string).
I’d expect timings for iterations over MyMethod1 and MyMethod2 to be roughly the same.
The actual timings are below; they seem counterintuitive: MyMethod1 starts off significantly slower (counter to what I’d expect), yet MyMethod2 loses the race significantly from 10000 iterations onwards. Can anyone explain this?
Test results (average of 3 runs per iteration value):

No. of iterations | Method 1 ticks | Method 2 ticks
                5 |          25282 |           6999
              100 |          22128 |           8176
             1000 |          22982 |          23720
            10000 |          25995 |         265621
          1000000 |         994359 |       25125076
---------------------------------------------------
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using NUnit.Framework;
using PostSharp.Aspects;

public class TestClass2
{
    [Test]
    public void MyTest()
    {
        const int iterations = 100;
        var stopWatch = new Stopwatch();

        stopWatch.Start();
        for (int x = 0; x < iterations; x++)
        {
            MyMethod1();
        }
        stopWatch.Stop();
        Console.WriteLine("Method1 calls duration: " + stopWatch.ElapsedTicks);

        stopWatch.Reset();
        stopWatch.Start();
        for (int x = 0; x < iterations; x++)
        {
            MyMethod2();
        }
        stopWatch.Stop();
        Console.WriteLine("Method2 calls duration: " + stopWatch.ElapsedTicks);
    }

    [MyAspect1]
    private void MyMethod1() { }

    [MyAspect2]
    private void MyMethod2() { }
}

[Serializable]
public class MyAspect1 : AsynchronousMetricsAspect
{
    public override void OnInvoke(MethodInterceptionArgs args)
    {
        args.Proceed();
    }
}

// NOTE: this type is the same as MyAspect1, but adds a Task creation
[Serializable]
public class MyAspect2 : AsynchronousMetricsAspect
{
    public override void OnInvoke(MethodInterceptionArgs args)
    {
        args.Proceed();
        Task.Factory.StartNew(() => "bleh".Trim()); // invoke the logging delegate asynchronously
    }
}
Further info: Test is being invoked from ReSharper NUnit VS plugin and the attributes are PostSharp MethodInterceptionAspects.
Edit: Updated figures for when the methods are pre-invoked before the tests, to ensure any JIT compilation has occurred.
Test results (ballpark):

No. of iterations | Method 1 ticks | Method 2 ticks
                5 |             22 |            437
              100 |             37 |           2204
             1000 |            192 |          24476
            10000 |           7931 |         286403
           100000 |         115451 |        2862439
          1000000 |         695950 |       29049021
         10000000 |        8347867 |      275777590
Well, to start with I’d try to determine whether this is anything to do with PostSharp or not. I would try to call the same methods without any AOP being involved at all.
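To make that concrete, here is a minimal sketch of such a baseline: the same loop shape with no PostSharp involved at all, so any remaining gap comes from the Task creation alone. The type and method names here are illustrative, not from the original test.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class BaselineTest
{
    private static void PlainMethod() { }

    private static void PlainMethodWithTask()
    {
        // Same fire-and-forget task as MyAspect2, minus the interception.
        Task.Factory.StartNew(() => "bleh".Trim());
    }

    public static void Run(int iterations)
    {
        // Warm up both methods so JIT compilation is excluded from the timings.
        PlainMethod();
        PlainMethodWithTask();

        var sw = Stopwatch.StartNew();
        for (int x = 0; x < iterations; x++) PlainMethod();
        sw.Stop();
        Console.WriteLine("Plain: " + sw.ElapsedTicks);

        sw.Restart();
        for (int x = 0; x < iterations; x++) PlainMethodWithTask();
        sw.Stop();
        Console.WriteLine("Plain + Task: " + sw.ElapsedTicks);
    }
}
```

If this baseline shows the same shape of results as your figures, PostSharp is essentially exonerated.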
Here are my suspicions though:
The 25000 ticks (a pretty short length of time, mind you) is due to JITting bits of PostSharp infrastructure. They’re therefore a constant cost, which doesn’t have to be paid again by the second method. You can verify this by switching round the order of your tests. However, given the numbers, I suspect there’s another constant cost incurred by setting up task factories etc… which would explain why 100 iterations of method 2 doesn’t take much longer than 5 iterations.
Creating a task is not free, and I wouldn’t expect it to be. Just the fact that you’re creating an object means there’s work going on… and then there’s the scheduling and coordination across tasks and threads. Parallel Extensions is good, but it can’t work miracles. I don’t know how PFX will spread the work out, but it’s possible that some of that extra work being generated (the tasks are short, but they still have to be executed) ends up on the same core as the main thread. You’re creating a lot of tasks very quickly.
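You can separate the two costs being conflated here with a rough probe like the following: one loop that only allocates an object, and one that allocates *and* queues a task, mirroring the fire-and-forget pattern in MyAspect2 (we deliberately don't wait for the tasks, just as the original test doesn't).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class TaskCostProbe
{
    static void Main()
    {
        const int n = 100000;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
        {
            var o = new object(); // allocation only
        }
        sw.Stop();
        Console.WriteLine("Allocation only: " + sw.ElapsedTicks);

        sw.Restart();
        for (int i = 0; i < n; i++)
        {
            // allocation + delegate + queueing + thread-pool scheduling
            Task.Factory.StartNew(() => { });
        }
        sw.Stop();
        Console.WriteLine("StartNew: " + sw.ElapsedTicks);
    }
}
```

The gap between the two loops is the scheduling and coordination cost per task, which is what I suspect dominates your method 2 numbers.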
It’s possible that the difference between the two methods is enough to change whether they can be inlined or not. This is obfuscated somewhat by the use of PostSharp, but I wouldn’t be entirely surprised to find that that’s the case.
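One way to take inlining out of the equation is to pin both methods as non-inlinable and see whether the gap changes. This is a sketch; in practice, PostSharp-intercepted methods are already unlikely to be inlined, since the interception rewrites them.

```csharp
using System.Runtime.CompilerServices;

public class TestClass2
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void MyMethod1() { }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private void MyMethod2() { }
}
```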
The scaling isn’t quite linear in the case of method 2, but it’s not far off. If you take a formula of
6000 + 23 * N as a quick and dirty approximation, you would expect values of 6115, 8300, 29000, 236000 and 23006000 – which aren’t very far off.

I strongly suspect that the huge figure for the first method in the last test is due to garbage collection and the heap growing. You’re creating plenty of objects here, and I suspect it’s doing a full GC and expanding the various generations appropriately – method 2 in the same test then takes advantage of that larger heap.
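For what it's worth, the approximation check above can be reproduced with a few lines against the measured method 2 figures from your first table:

```csharp
using System;

class Approximation
{
    static void Main()
    {
        int[] iterations = { 5, 100, 1000, 10000, 1000000 };
        long[] measured = { 6999, 8176, 23720, 265621, 25125076 };

        for (int i = 0; i < iterations.Length; i++)
        {
            // Constant setup cost plus a per-task cost: 6000 + 23 * N
            long predicted = 6000 + 23L * iterations[i];
            Console.WriteLine(iterations[i] + ": predicted " + predicted
                              + ", measured " + measured[i]);
        }
    }
}
```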
Those are really just guesses, but they provide some kind of explanation for everything you’re seeing 🙂