(By the way, this refers to a 32-bit OS.)
SOME UPDATES:
- This is definitely an alignment issue.
- Sometimes the alignment is (for whatever reason) so bad that access to the double is more than 50x slower than its fastest access.
- Running the code on a 64-bit machine reduces the issue, but I think it was still alternating between two timings (I could get similar results by changing the double to a float on a 32-bit machine).
- Running the code under Mono exhibits no issue. Microsoft, any chance you can copy something from those Novell guys???
Is there a way to memory-align the allocation of classes in C#?
The following demonstrates (I think!) the cost of not having doubles aligned correctly. It does some simple math on a double stored in a class, timing each run. It performs 5 timed runs on the variable before allocating a new one and starting over.
Basically, the results look like you get either a fast, medium or slow memory position (on my ancient processor, these end up around 40, 80 or 120 ms per run).
I have tried playing with StructLayoutAttribute, but have had no joy. Maybe something else is going on?
    using System;
    using System.Diagnostics;

    class Sample
    {
        class Variable { public double Value; }

        static void Main()
        {
            const int COUNT = 10000000;
            while (true)
            {
                var x = new Variable();
                for (int inner = 0; inner < 5; ++inner)
                {
                    // Move the allocation here to allocate more often, which makes
                    // the 50x slowdown more likely to show up.
                    var stopwatch = Stopwatch.StartNew();
                    var total = 0.0;
                    for (int i = 1; i <= COUNT; ++i)
                    {
                        x.Value = i;
                        total += x.Value;
                    }
                    // Sanity check: the sum of 1..COUNT is COUNT * (COUNT + 1) / 2.
                    if (Math.Abs(total - 50000005000000.0) > 1)
                        throw new ApplicationException(total.ToString());
                    Console.Write("{0}, ", stopwatch.ElapsedMilliseconds);
                }
                Console.WriteLine();
            }
        }
    }
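One way to check the alignment theory directly might be to print the address of the double modulo 8 for each new allocation and compare it with the timings. This is only a sketch: `AlignmentProbe` and `OffsetWithin8` are made-up names, the `Variable` class is a standalone copy of the one above, and the code needs to be compiled with unsafe code enabled. The address is only meaningful at the moment it is read, since the GC may move the object afterwards.

```csharp
using System;

// Standalone copy of the Variable class from the sample above.
class Variable { public double Value; }

static class AlignmentProbe
{
    // Hypothetical helper: returns the address of the Value field modulo 8.
    // 0 means the double is 8-byte aligned; 4 means it is only 4-byte aligned.
    public static unsafe long OffsetWithin8(Variable v)
    {
        // Pin the object so the address stays valid while we read it.
        fixed (double* p = &v.Value)
        {
            return (long)p & 7;
        }
    }
}
```

Calling `AlignmentProbe.OffsetWithin8(x)` right after `new Variable()` and writing the result next to the timings would show whether the slow runs line up with the non-zero offsets.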
I see lots of web pages about alignment of structs for interop, but what about alignment of classes?
(Or are my assumptions wrong, and there is another issue with the above?)
Thanks,
Paul.
Interesting look at the gears that run the machine. I have a bit of trouble explaining why there are multiple distinct values (I got 4) when a double can be aligned in only two ways. I think alignment to the CPU cache line plays a role as well, although that only adds up to 3 possible timings.
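To spell out where the 3 cases come from, here is a small sketch of the arithmetic. It assumes a 64-byte cache line (typical, but not guaranteed for every processor) and that objects are only 4-byte aligned. `AlignmentClass` and `Classify` are names made up for this illustration.

```csharp
using System;

static class AlignmentClass
{
    // Classifies where an 8-byte double starting at 'address' lands.
    // The cacheLine default of 64 bytes is an assumption, not a CLR guarantee.
    public static string Classify(long address, int cacheLine = 64)
    {
        if (address % 8 == 0)
            return "aligned";

        // A misaligned double spills into the next cache line when its
        // last byte lies past the end of the line it starts in.
        bool crossesLine = (address % cacheLine) + sizeof(double) > cacheLine;
        return crossesLine ? "split across cache lines" : "misaligned within one line";
    }
}
```

With 4-byte alignment, an address of 16 is aligned, 4 is misaligned but stays within one line, and 60 is split across two lines. That gives three classes, which still leaves the fourth timing unexplained.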
Well, there is nothing you can do about it: the CLR only promises alignment for 4-byte values, so that atomic updates on 32-bit machines are guaranteed. This is not just an issue with managed code; C/C++ has this problem too. Looks like the chip makers need to solve this one.
If it is critical, you could allocate unmanaged memory with Marshal.AllocCoTaskMem() and use an unsafe pointer that you can align just right. It's the same kind of thing you’d have to do when allocating memory for code that uses SIMD instructions, which require 16-byte alignment. Consider it a desperation move, though.
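A rough sketch of what that could look like, not a tuned implementation. `AlignedDouble`, `AlignUp` and `SumTo` are names invented for this example. The code over-allocates by 8 bytes so that an 8-byte-aligned slot always fits, and it needs to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

static class AlignedDouble
{
    // Hypothetical helper: rounds an address up to the next multiple of
    // 'alignment', which must be a power of two.
    public static IntPtr AlignUp(IntPtr address, int alignment)
    {
        long mask = alignment - 1;
        return new IntPtr((address.ToInt64() + mask) & ~mask);
    }

    // Runs the same loop as the question, but on a double that lives in
    // unmanaged memory at an 8-byte-aligned address.
    public static unsafe double SumTo(int count)
    {
        // Over-allocate so an aligned 8-byte slot is guaranteed to fit.
        IntPtr raw = Marshal.AllocCoTaskMem(sizeof(double) + 8);
        try
        {
            double* value = (double*)AlignUp(raw, 8).ToPointer();
            double total = 0.0;
            for (int i = 1; i <= count; ++i)
            {
                *value = i;
                total += *value;
            }
            return total;
        }
        finally
        {
            // Free the pointer that was allocated, not the aligned one.
            Marshal.FreeCoTaskMem(raw);
        }
    }
}
```

Timing `AlignedDouble.SumTo(10000000)` with a Stopwatch in place of the inner loop from the question should show whether the slow and medium cases go away. For SIMD use, the same approach would apply with an alignment of 16 and a correspondingly larger over-allocation.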