I have a program that needs to run a function M times per iteration, and those runs can be parallelized. Let’s say I’m limited to running N threads at a time (say, by the number of cores available). I need an algorithm that makes sure I’m always running N threads (so long as the number of runs remaining is >= N), and that algorithm needs to be invariant to the completion order of those threads. Also, the thread scheduling algorithm should not claim significant CPU time.
I have something like the following in mind, but it’s clearly flawed.
#include <iostream>
#include <pthread.h>
#include <cstdlib>

// Busy work: count upwards until a random target is reached.
void *find_num(void* arg)
{
    double num = rand();
    for(double q=0; 1; q++)
        if(num == q)
        {
            std::cout << "\n--";
            return 0;
        }
}

int main ()
{
    srand(0);
    const int N = 2;
    pthread_t threads[N];
    // Start the first N threads.
    for(int q=0; q<N; q++)
        pthread_create(&threads[q], NULL, find_num, NULL);
    int M = 30;
    int launched=N;
    int finished=0;
    while(1)
    {
        for(int w=0; w<N; w++)
        {
            // inefficient if threads[1] finishes before threads[0]
            pthread_join(threads[w], NULL);
            finished++;
            std::cout << "\n" << finished;
            if(finished == M)
                break;
            // Reuse the freed slot for the next run, if any remain.
            if(launched < M)
            {
                pthread_create(&threads[w], NULL, find_num, NULL);
                launched++;
            }
        }
        if(finished == M)
            break;
    }
}
The obvious problem here is that if threads[1] finishes before threads[0] there is wasted CPU time, and I can’t think of how to get around that. Also, I’m assuming that having the main routine waiting on pthread_join() is not a significant drain on CPU time?
I would advise against respawning threads; it’s a rather serious overhead. Instead, create a pool of N threads and submit work to them via a work-queue, which is a rather standard approach. Even if your remaining work is less than N, the extra threads will not do any harm; they’ll just stay blocked on the work-queue.
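For illustration, here is a minimal sketch of the pool/work-queue idea using pthreads, a mutex, and a condition variable. The names `WorkQueue`, `do_task`, `worker`, and `run_pool` are hypothetical and not part of any library, and the sketch handles a single batch of M tasks; a pool that lives across iterations would keep the threads alive and wait on a completion count instead of joining them.

```cpp
#include <pthread.h>
#include <queue>
#include <vector>

// Hypothetical work-queue shared by a fixed pool of worker threads.
struct WorkQueue {
    pthread_mutex_t lock;
    pthread_cond_t  has_work;
    std::queue<int> tasks;     // task ids; a real program would store arguments
    bool            closed;    // true once no more tasks will be submitted
    int             completed; // number of tasks finished so far
};

// Placeholder for the real per-task function (assumption).
static void do_task(int /*task_id*/) { }

static void *worker(void *arg)
{
    WorkQueue *q = static_cast<WorkQueue *>(arg);
    for (;;) {
        pthread_mutex_lock(&q->lock);
        // Idle workers sleep here; they do not spin.
        while (q->tasks.empty() && !q->closed)
            pthread_cond_wait(&q->has_work, &q->lock);
        if (q->tasks.empty()) {            // queue closed and drained
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        int id = q->tasks.front();
        q->tasks.pop();
        pthread_mutex_unlock(&q->lock);

        do_task(id);                       // run outside the lock

        pthread_mutex_lock(&q->lock);
        q->completed++;
        pthread_mutex_unlock(&q->lock);
    }
}

// Runs m tasks on n pooled threads; returns how many tasks completed.
int run_pool(int n, int m)
{
    WorkQueue q;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.has_work, NULL);
    q.closed = false;
    q.completed = 0;

    std::vector<pthread_t> threads(n);
    for (int i = 0; i < n; i++)
        pthread_create(&threads[i], NULL, worker, &q);

    // Submit the whole batch, then wake every worker.
    pthread_mutex_lock(&q.lock);
    for (int i = 0; i < m; i++)
        q.tasks.push(i);
    q.closed = true;
    pthread_cond_broadcast(&q.has_work);
    pthread_mutex_unlock(&q.lock);

    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);

    int done = q.completed;
    pthread_cond_destroy(&q.has_work);
    pthread_mutex_destroy(&q.lock);
    return done;
}
```

Calling `run_pool(2, 30)` would correspond to N = 2 and M = 30 from the question. Whichever worker becomes free first takes the next task, so the completion order does not matter.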
If you insist on your current approach, you can do it like this:
- Do not wait for threads with pthread_join. You don’t need it, since you’re not communicating anything back to the main thread. Create the threads with the detach state PTHREAD_CREATE_DETACHED (set through a thread attribute) and just let them exit.
- In the main thread, wait on a semaphore, which is signaled by each exiting thread. In effect, you wait for any thread termination. If you don’t have <semaphore.h> for any reason, it’s trivial to implement a semaphore with mutexes and conditions.
Anyway, I would again recommend the thread-pool/work-queue approach.
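The detached-thread variant could be sketched roughly as follows. The names `slot_free`, `task`, and `run_detached` are made up for this example, and the task body is a placeholder. It assumes unnamed POSIX semaphores (`sem_init`) are available on the platform; where they are not, the mutex-and-condition replacement mentioned above would be needed.

```cpp
#include <pthread.h>
#include <semaphore.h>
#include <cerrno>

// Posted once by every thread just before it exits.
static sem_t slot_free;

static void *task(void *)
{
    // ... the real work would go here (placeholder) ...
    sem_post(&slot_free);   // tell main that a slot is about to free up
    return NULL;
}

// Runs m tasks with about n running at once; returns how many finished.
// A thread posts slightly before it exits, so a replacement may briefly
// overlap with a thread that is on its way out.
int run_detached(int n, int m)
{
    sem_init(&slot_free, 0, 0);

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    int launched = 0, finished = 0;
    pthread_t tid;

    // Fill the initial n slots (or fewer if m < n).
    while (launched < n && launched < m) {
        pthread_create(&tid, &attr, task, NULL);
        launched++;
    }

    while (finished < m) {
        // Sleeps until ANY thread finishes; retry if interrupted by a signal.
        while (sem_wait(&slot_free) == -1 && errno == EINTR) { }
        finished++;
        if (launched < m) {
            pthread_create(&tid, &attr, task, NULL);
            launched++;
        }
    }

    pthread_attr_destroy(&attr);
    sem_destroy(&slot_free);
    return finished;
}
```

Because the main thread blocks in `sem_wait` rather than joining a specific thread, a slot is refilled as soon as any thread finishes, regardless of order, and the main thread uses no CPU while it waits.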