I am seeing unexpected behavior for mutexes/conditions in C POSIX threads depending on whether the mutex and condition variables are declared in the global scope (which works) or in a struct (which sometimes works).
I am programming on a Mac and then running the same code on a Linux machine. I copied the code from this example, which works as expected on both machines: http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Fapis%2Fusers_73.htm
This example has the pthread_mutex_t and pthread_cond_t in the global scope:
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
...
pthread_mutex_lock(&mutex);
...
However, if I change this to store the cond and mutex in a struct, it works on the Mac but does not work on Linux.
Here is an overview of the changes I made:
typedef struct _test_data_t {
    pthread_mutex_t cond;
    pthread_cond_t mutex;
} test_data_t;
...
pthread_mutex_lock(&(test_data->mutex));
...
Here is the output I get on Mac (which works):
Create 5 threads
Thread blocked
Thread blocked
Thread blocked
Thread blocked
Thread blocked
Wake up all waiting threads...
Wait for threads and cleanup
Main completed
Here is the output on Linux (which does not work):
Create 5 threads
Thread blocked
// Hangs here forever, other threads can't lock mutex
Does anyone know why this might be happening? I will admit that I am not a C expert so I don’t know what could have happened in the switch from using a global variable to a struct variable.
Thanks in advance for your help.
Here is the code (with some error checking stripped out for brevity):
typedef struct _test_data_t {
    int conditionMet;
    pthread_mutex_t cond;
    pthread_cond_t mutex;
} test_data_t;

void *threadfunc(void *parm)
{
    int rc;
    test_data_t *test_data = (test_data_t *) parm;
    rc = pthread_mutex_lock((pthread_mutex_t *)&(test_data->mutex));
    while (!test_data->conditionMet) {
        printf("Thread blocked\n");
        rc = pthread_cond_wait(&test_data->cond, &test_data->mutex);
    }
    rc = pthread_mutex_unlock(&test_data->mutex);
    return NULL;
}

void runThreadTest() {
    int NTHREADS = 5;
    int rc=0;
    int i;
    // Initialize mutex/condition.
    test_data_t test_data;
    test_data.conditionMet = 0;
    rc = pthread_mutex_init(&test_data.mutex, NULL);
    rc = pthread_cond_init(&test_data.cond, NULL);
    // Create threads.
    pthread_t threadid[NTHREADS];
    printf("Create %d threads\n", NTHREADS);
    for(i=0; i<NTHREADS; ++i) {
        rc = pthread_create(&threadid[i], NULL, threadfunc, &test_data);
    }
    sleep(5);
    rc = pthread_mutex_lock(&test_data.mutex);
    /* The condition has occurred. Set the flag and wake up any waiting threads */
    test_data.conditionMet = 1;
    printf("Wake up all waiting threads...\n");
    rc = pthread_cond_broadcast(&test_data.cond);
    rc = pthread_mutex_unlock(&test_data.mutex);
    printf("Wait for threads and cleanup\n");
    for (i=0; i<NTHREADS; ++i) {
        rc = pthread_join(threadid[i], NULL);
    }
    pthread_cond_destroy(&test_data.cond);
    pthread_mutex_destroy(&test_data.mutex);
    printf("Main completed\n");
}
The problem is that the member named mutex is a pthread_cond_t and the member named cond is a pthread_mutex_t. Casts that should be unnecessary may be hiding that fact.
This line in the thread function should not need a cast:
rc = pthread_mutex_lock((pthread_mutex_t *)&(test_data->mutex));
However, you have several calls that don’t have the cast (so the compiler should be complaining loudly on those). I’m beginning to think this might be a typo that’s a red herring.
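For reference, here is a minimal sketch of what the struct-based version might look like with the member types matched to their names and no casts. It assumes a C11 compiler (for _Static_assert and _Generic). The woken counter, the run_thread_test name, the compile-time checks, and the assert-based error handling are hypothetical additions for this sketch and are not part of the original code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 5

typedef struct {
    int conditionMet;       /* predicate, only touched while holding mutex */
    int woken;              /* hypothetical counter added for this sketch */
    pthread_mutex_t mutex;  /* the member named mutex is a pthread_mutex_t */
    pthread_cond_t cond;    /* the member named cond is a pthread_cond_t */
} test_data_t;

/* Assumed C11: refuse to compile if the two member types get swapped. */
_Static_assert(_Generic(&((test_data_t *)0)->mutex,
                        pthread_mutex_t *: 1, default: 0),
               "mutex must be a pthread_mutex_t");
_Static_assert(_Generic(&((test_data_t *)0)->cond,
                        pthread_cond_t *: 1, default: 0),
               "cond must be a pthread_cond_t");

static void *threadfunc(void *parm)
{
    test_data_t *test_data = parm;  /* void * converts implicitly in C */
    int rc;

    rc = pthread_mutex_lock(&test_data->mutex);  /* no cast required */
    assert(rc == 0);
    while (!test_data->conditionMet) {
        printf("Thread blocked\n");
        rc = pthread_cond_wait(&test_data->cond, &test_data->mutex);
        assert(rc == 0);
    }
    test_data->woken++;  /* still holding the mutex here */
    rc = pthread_mutex_unlock(&test_data->mutex);
    assert(rc == 0);
    return NULL;
}

/* Returns how many threads saw the condition become true. */
int run_thread_test(void)
{
    test_data_t test_data;
    pthread_t threadid[NTHREADS];
    int i, rc;

    test_data.conditionMet = 0;
    test_data.woken = 0;
    rc = pthread_mutex_init(&test_data.mutex, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&test_data.cond, NULL);
    assert(rc == 0);

    printf("Create %d threads\n", NTHREADS);
    for (i = 0; i < NTHREADS; ++i) {
        rc = pthread_create(&threadid[i], NULL, threadfunc, &test_data);
        assert(rc == 0);
    }

    /* Set the flag under the mutex, then wake every waiter. */
    rc = pthread_mutex_lock(&test_data.mutex);
    assert(rc == 0);
    test_data.conditionMet = 1;
    printf("Wake up all waiting threads...\n");
    rc = pthread_cond_broadcast(&test_data.cond);
    assert(rc == 0);
    rc = pthread_mutex_unlock(&test_data.mutex);
    assert(rc == 0);

    printf("Wait for threads and cleanup\n");
    for (i = 0; i < NTHREADS; ++i) {
        rc = pthread_join(threadid[i], NULL);
        assert(rc == 0);
    }
    pthread_cond_destroy(&test_data.cond);
    pthread_mutex_destroy(&test_data.mutex);
    printf("Main completed\n");
    return test_data.woken;
}
```

Calling run_thread_test() from main should return 5. The sleep(5) from the original is left out on purpose: because each thread checks conditionMet while holding the mutex, a thread that starts after the broadcast simply skips the wait, so "Thread blocked" may be printed fewer than five times. On most systems this needs to be built with -pthread.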