I am surprised to see from pstack that this code leads to deadlock! I don’t see a reason for the same.
pthread_mutex_t lock;
_Cilk_for (int i = 0; i < N; ++i) {
int ai = A[i];
if (ai < pivot) {
pthread_mutex_lock(&lock);
A[ia++] = ai;
pthread_mutex_unlock(&lock);
}
else if (ai > pivot) {
pthread_mutex_lock(&lock);
A[ib++] = ai;
pthread_mutex_unlock(&lock);
}
else {
pthread_mutex_lock(&lock);
A[ic++] = ai;
pthread_mutex_unlock(&lock);
}
}
I am just using mutexes to make sure that access to A is atomic and serialized.
- What is wrong with this code to lead to deadlock?
- Is there a better way to implement this?
If that’s code inside a function, then you’re not initialising the mutex correctly. You need to set it to
PTHREAD_MUTEX_INITIALIZER(for a simple, default mutex) or do apthread_mutex_init()on it (for more complex requirements). Without proper initialisation, you don’t know what state the mutex starts in – it may well be in a locked state simply because whatever happened to be on the stack at that position looked like a locked mutex.That’s why it always needs to be initialised somehow, so that there is no doubt of the initial state.
Another potential problem you may have is this:
You probably should protect that access with the same mutex since otherwise you may read it in a “half-state” (when another thread is only part way through updating the variable).
And, I have to say, I’m not sure that threads are being used wisely here. The use of mutexes is likely to swamp a statement like
A[ia++] = aito the point where the vast majority of time will be spent locking and unlocking the mutex. They’re generally more useful where the code being processed during the lock is a little more substantial.You may find a non-threaded variant will blow this one out of the water (but, of course, don’t take my word for it – my primary optimisation mantra is “measure, don’t guess”).