Possible Duplicate:
Threads synchronization. How exactly lock makes access to memory 'correct'?
This question is inspired by this one.
We have the following test class:
using System;
using System.Threading.Tasks;

class Test
{
    private static object ms_Lock = new object();
    private static int ms_Sum = 0;

    public static void Main()
    {
        // Run the same loop on two threads and wait for both to finish.
        Parallel.Invoke(HalfJob, HalfJob);
        Console.WriteLine(ms_Sum);
        Console.ReadLine();
    }

    private static void HalfJob()
    {
        for (int i = 0; i < 50000000; i++) {
            lock (ms_Lock) { } // empty lock
            ms_Sum += 1;       // unsynchronized increment, outside the lock
        }
    }
}
The actual result is very close to the expected value of 100 000 000 (50 000 000 x 2, since 2 loops are running at the same time), with a difference of around 200 to 600 (the error is approx. 0.0004% on my machine, which is very low). No other way of synchronizing that we tried gives this kind of approximation: the error is either much bigger, or the result is 100% correct.
Our current understanding is that this level of precision arises because the program runs in the following way:

Time runs left to right, and the 2 threads are represented by two rows, where:
- a black box represents the process of acquiring, holding and releasing the lock
- a plus represents the addition operation (the schema is drawn to scale for my PC, where the lock takes approximately 20 times longer than the add)
- a white box represents the period spent trying to acquire the lock and then waiting for it to become available
Also, lock provides a full memory fence.
So the question now is: if the above schema represents what is going on, what is the cause of such a big error? (It now looks big, because the schema looks like a very strong synchronization scheme.) We could understand a difference of 1 to 10 at the boundaries, but that is clearly not the only source of the error. We cannot see when writes to ms_Sum could happen at the same time and cause the error.
EDIT: Many people like to jump to quick conclusions. I know what synchronization is, and that the above construct is not a real, or even close to good, way to synchronize threads if we need a correct result. Have some faith in the poster, or maybe read the linked answer first. I don't need a way to synchronize 2 threads to perform additions in parallel. I am exploring this extravagant and yet efficient (compared to any possible approximate alternative) synchronization construct. It does synchronize to some extent, so it's not meaningless as suggested.
This is a very tight loop with not much going on inside it, so ms_Sum += 1 has a reasonable chance of being executed at "just the wrong moment" by the parallel threads.
Why would you ever write code like this in practice?
Why not:
or just:
?
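The two snippets that followed "Why not:" and "or just:" appear to be missing here. A minimal sketch of the two usual fixes, assuming they were meant to be (1) doing the increment while holding the lock and (2) using Interlocked.Increment. The class name SafeSum and the method names are made up for illustration and are not from the original post:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SafeSum // hypothetical name, for illustration only
{
    private static readonly object ms_Lock = new object();
    private static int ms_Sum = 0;

    // Option 1 (assumed): perform the increment while holding the lock.
    private static void HalfJobLocked()
    {
        for (int i = 0; i < 50000000; i++)
        {
            lock (ms_Lock) { ms_Sum += 1; }
        }
    }

    // Option 2 (assumed): one atomic read-modify-write, no lock needed.
    private static void HalfJobInterlocked()
    {
        for (int i = 0; i < 50000000; i++)
        {
            Interlocked.Increment(ref ms_Sum);
        }
    }

    public static void Main()
    {
        Parallel.Invoke(HalfJobLocked, HalfJobLocked);
        Console.WriteLine(ms_Sum); // 100000000

        ms_Sum = 0;
        Parallel.Invoke(HalfJobInterlocked, HalfJobInterlocked);
        Console.WriteLine(ms_Sum); // 100000000
    }
}
```

Either variant makes the load, add and store of the increment indivisible with respect to the other thread, which is exactly what the empty lock does not do.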
— EDIT —
Some comments on why you would see the error despite the memory-barrier aspect of the lock... Imagine the following scenario:
- One thread enters the lock, leaves the lock and then is pre-empted by the OS scheduler.
- The other thread enters and leaves the lock (possibly once, possibly more than once, possibly millions of times).
- Both threads then execute ms_Sum += 1 at the same time, resulting in some increments being lost (because increment = load + add + store).
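To make the lost increment concrete, here is a single-threaded replay of one such interleaving. It is only a sketch: the local variables a and b stand in for the registers of the two threads, and the figure of 1000 iterations is an arbitrary assumption chosen for illustration.

```csharp
using System;

class LostUpdateReplay // hypothetical name, for illustration only
{
    public static void Main()
    {
        int sum = 0; // stands in for ms_Sum

        // Thread A has entered and left the empty lock and is then pre-empted
        // before reaching its increment, so nothing changes yet.

        // Meanwhile thread B runs 1000 complete iterations on its own.
        for (int i = 0; i < 1000; i++)
        {
            sum += 1;
        }

        // A is resumed while B is also past its empty lock, so both threads
        // perform load + add + store with no lock held.
        int a = sum;   // A loads 1000
        int b = sum;   // B loads 1000
        a = a + 1;     // A adds
        b = b + 1;     // B adds
        sum = a;       // A stores 1001
        sum = b;       // B stores 1001, overwriting A's result

        Console.WriteLine(sum); // prints 1001, although 1002 increments ran
    }
}
```

The memory fence only guarantees that each thread sees up-to-date values when it passes through the lock. It does not stop both threads from being between the lock and the store at the same moment.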