This question is about a post on condition variables that I skimmed.
The author first gives a buggy example:
/* in thread 1 */
pthread_mutex_lock(mx);
if (state == GOOD) {
    pthread_mutex_unlock(mx); // Here !
    wait_for_event();
    pthread_mutex_lock(mx);
}
pthread_mutex_unlock(mx);

/* in thread 2 */
pthread_mutex_lock(mx);
state = GOOD;
pthread_mutex_unlock(mx);
signal_event(); /* expecting to wake thread 1 up */
and explains as follows:
‘This pseudocode sample carries a bug. What happens if scheduler decides to switch context from thread 1 to thread 2 after pthread_mutex_unlock(mx), but before wait_for_event(). In this case, thread 2 will not wake thread 1 and thread 1 will continue sleeping, possibly forever.’
I know how a condition variable should be used, as the author demonstrates later in the same post.
I can see that in this buggy example, the ‘state == GOOD’ check and ‘wait_for_event()’ are NOT protected as a whole by the mutex. If thread 1 is switched out right after the first ‘pthread_mutex_unlock(mx);’, thread 2 can change ‘state’ to something else (BAD?) and then signal thread 1, which wakes up and proceeds as if ‘state == GOOD’ still held. That is what I think is wrong.
But why does the author say ‘In this case, thread 2 will not wake thread 1 and thread 1 will continue sleeping, possibly forever.’?
Isn’t ‘signal_event();’ still called in thread 2? Is my understanding correct at all?
The bug is caused by the semantics of signal_event() and wait_for_event(). If signal_event() is called when no one is stuck in wait_for_event(), the signal is lost.

Besides a context switch, the same problem occurs if thread 2 runs fast and thread 1 is slow. In that case, the time in thread 1 between
pthread_mutex_unlock(mx);
and
wait_for_event();
could be when thread 2 does all its operations, sending a signal into oblivion (because no one is waiting for it). Then thread 1 waits, and it will never get the signal (unless thread 2 runs again for some reason).
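
For comparison, here is a minimal sketch of how the same handshake might look with a real condition variable. It assumes thread 1 is meant to block until state becomes GOOD. The names event_state, cv, thread1, thread2_body and run_demo are hypothetical and not taken from the post. pthread_cond_wait() releases the mutex and starts waiting as one atomic step, so there is no window between the unlock and the wait in which the signal can be lost, and the while loop re-checks the predicate in case thread 2 has already run or the wakeup was spurious.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum event_state { BAD, GOOD };

static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static enum event_state state = BAD;

/* Thread 1: block until state becomes GOOD. */
static void *thread1(void *arg)
{
    bool *woke_up = arg;

    pthread_mutex_lock(&mx);
    /* pthread_cond_wait() atomically unlocks mx and blocks, then
       re-locks mx before returning. If thread 2 already set GOOD,
       the loop body is skipped and we never wait at all. */
    while (state != GOOD)
        pthread_cond_wait(&cv, &mx);
    assert(state == GOOD); /* holds because we still own mx */
    *woke_up = true;
    pthread_mutex_unlock(&mx);
    return NULL;
}

/* Thread 2: change the state under the mutex, then signal. */
static void thread2_body(void)
{
    pthread_mutex_lock(&mx);
    state = GOOD;
    pthread_mutex_unlock(&mx);
    pthread_cond_signal(&cv);
}

/* Returns 1 if thread 1 finished, -1 if it could not be created.
   The calling thread plays the role of thread 2. */
int run_demo(void)
{
    pthread_t t1;
    bool woke_up = false;

    pthread_mutex_lock(&mx);
    state = BAD;
    pthread_mutex_unlock(&mx);

    if (pthread_create(&t1, NULL, thread1, &woke_up) != 0)
        return -1;
    thread2_body();
    pthread_join(t1, NULL);
    return woke_up ? 1 : 0;
}
```

Calling run_demo() from main() (built with something like cc -pthread) should return whichever thread gets scheduled first: if thread 2 runs before thread 1 takes the lock, thread 1 sees GOOD and skips the wait; otherwise thread 1 is already inside pthread_cond_wait() when the signal arrives.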