On working with thread-safety, I find myself always “double checking” before executing code in a lock block and I wondered if I was doing the right thing. Consider the following three ways of doing the same thing:
Example 1:
private static SomeCollection MyCollection;
private static Object locker;
private void DoSomething(string key)
{
if(MyCollection[key] == null)
{
lock(locker)
{
MyCollection[key] = DoSomethingExpensive();
}
}
DoSomethingWithResult(MyCollection[key]);
}
Example 2:
private static SomeCollection MyCollection;
private static Object locker;
private void DoSomething(string key)
{
lock(locker)
{
if(MyCollection[key] == null)
{
MyCollection[key] = DoSomethingExpensive();
}
}
DoSomethingWithResult(MyCollection[key]);
}
Example 3:
private static SomeCollection MyCollection;
private static Object locker;
private void DoSomething(string key)
{
if(MyCollection[key] == null)
{
lock(locker)
{
if(MyCollection[key] == null)
{
MyCollection[key] = DoSomethingExpensive();
}
}
}
DoSomethingWithResult(MyCollection[key]);
}
I always lean towards Example 3, and here’s why I think I’m doing the right thing
- Thread 1 enters
DoSomething(string) MyCollection[key] == nullso Thread 1 obtains a lock, just as Thread 2 entersMyCollection[key] == nullis still true, so Thread 2 waits to obtain the lock- Thread 1 calculates the value for
MyCollection[key]and adds it to the collection - Thread 1 releases the lock and calls
DoSomethingWithResult(MyCollection[key]); - Thread 2 obtains the lock, by which time
MyCollection[key] != null - Thread 2 does nothing, releases the lock and continues on its merry way
Example 1 would work, but there is a big risk that Thread 2 could redundantly calculate MyCollection[key].
Example 2 would work, but every thread would obtain a lock, even if it didn’t need to – which could be a (admittedly very small) bottleneck. Why hold up threads if you don’t need to?
Am I overthinking this and if so, what is the preferred way of handling these situations?
The first method should not be used. As you realised, it leaks, so more than one thread can end up running the expensive method. The longer that method takes, the bigger is the risk that another thread will also run it. In most cases it’s only a performance problem, but in some cases it might also be a problem that the resulting data is later on replaced by a new set of data.
The second method is the most common way, the third method is used if the data is accessed so frequently that the locking becomes a performance issue.