When we create a thread with pthread_create, should we call pthread_join immediately?
For example, I have the following two versions of a program, and I do not understand why the first one does not work.
In the 1st version, the output is not deterministic.
#include <iostream>
#include <pthread.h>
#include <cstring>
#include <cstdlib>
#define ROW 3
#define COL 3
using namespace std;

typedef struct {
    int row;
    int col;
} para;

// Print a 3x3 matrix stored as a flat array.
void print(double *para)
{
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            cout << *(para + 3*i + j) << "\t";
        }
        cout << endl;
    }
}

double mat[9] = {1,2,3,4,5,6,7,8,9};
double *result = (double *) malloc(9 * sizeof(double));

// Thread function: compute one element of the result.
void *mul(void *arg)
{
    para *temp = (para *) arg;
    int row = temp->row;
    int col = temp->col;
    double sum = 0;
    for (int i = 0; i < 3; i++)
    {
        double a = *(mat + row*3 + i);
        double b = *(mat + i + 3*col);
        sum += a*b;
    }
    *(result + row*3 + col) = sum;
    return NULL;
}

int main()
{
    pthread_t thread[9];
    // Create all nine threads first.
    for (int i = 0; i < 9; i++)
    {
        para M;
        M.row = i/3;
        M.col = i%3;
        pthread_create(&thread[i], NULL, mul, &M);
    }
    // Then join them all.
    for (int i = 0; i < 9; i++)
    {
        pthread_join(thread[i], NULL);
    }
    print(result);
}
With the 2nd version, the output is correct.
#include <iostream>
#include <pthread.h>
#include <cstring>
#include <cstdlib>
#define ROW 3
#define COL 3
using namespace std;

typedef struct {
    int row;
    int col;
} para;

// Print a 3x3 matrix stored as a flat array.
void print(double *para)
{
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            cout << *(para + 3*i + j) << "\t";
        }
        cout << endl;
    }
}

double mat[9] = {1,2,3,4,5,6,7,8,9};
double *result = (double *) malloc(9 * sizeof(double));

// Thread function: compute one element of the result.
void *mul(void *arg)
{
    para *temp = (para *) arg;
    int row = temp->row;
    int col = temp->col;
    double sum = 0;
    for (int i = 0; i < 3; i++)
    {
        double a = *(mat + row*3 + i);
        double b = *(mat + i + 3*col);
        sum += a*b;
    }
    *(result + row*3 + col) = sum;
    return NULL;
}

int main()
{
    pthread_t thread[9];
    for (int i = 0; i < 9; i++)
    {
        para M;
        M.row = i/3;
        M.col = i%3;
        pthread_create(&thread[i], NULL, mul, &M);
        // Join each thread immediately after creating it.
        pthread_join(thread[i], NULL);
    }
    print(result);
}
What is the difference between these two usages, and what is wrong with the first version?
The first version starts nine threads.
Then once all threads have been created it waits for them all to finish before exiting.
Thus you get nine threads running in parallel.
The second version starts nine threads.
But after each thread is started it waits for the thread to exit before continuing.
Thus you get nine threads running serially.
Unfortunately the first version is also broken.
The data object passed to the thread (the 4th argument, &M) is an automatic variable that goes out of scope at the end of each loop iteration, potentially before the thread has read it, so a thread may see the values written for a later iteration.
Fix like this:
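One way to do that, sketched here under the assumption that a plain array is acceptable: give each thread its own para slot in an array that stays alive until every pthread_join has returned. The names args and run_all are made up for this illustration, and the arithmetic in mul is kept exactly as in the question.

```cpp
#include <pthread.h>
#include <cstddef>

struct para {
    int row;
    int col;
};

double mat[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
double result[9];

// Same arithmetic as mul in the question; only the argument handling differs.
void *mul(void *arg)
{
    para *p = static_cast<para *>(arg);
    double sum = 0;
    for (int i = 0; i < 3; i++)
        sum += mat[p->row * 3 + i] * mat[i + 3 * p->col];
    result[p->row * 3 + p->col] = sum;
    return NULL;
}

// Hypothetical helper: start all nine threads, then join them all.
// Returns true if every pthread call succeeded.
bool run_all()
{
    pthread_t thread[9];
    para args[9];      // one slot per thread, alive until after the joins below
    bool ok = true;
    int started = 0;

    for (int i = 0; i < 9; i++) {
        args[i].row = i / 3;
        args[i].col = i % 3;
        // Each thread receives a pointer to its own element, never a shared local.
        if (pthread_create(&thread[i], NULL, mul, &args[i]) != 0) {
            ok = false;
            break;
        }
        started++;
    }

    // Join only the threads that were actually created.
    for (int i = 0; i < started; i++) {
        if (pthread_join(thread[i], NULL) != 0)
            ok = false;
    }
    return ok;
}
```

The threads still run in parallel as in the first version, but no thread can observe another thread's row and col, because args[i] is written once before its thread starts and is not touched again. Calling run_all() from main and then printing result should give the same output on every run.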