The problem: I have a few text files (10), each with one number per line. I need to split them across some threads that I create using the pthread library. These worker threads are to find the largest prime number among the numbers sent to them (and, overall, the largest prime across all of the text files).
My current thoughts on a solution: I am thinking of having two arrays. One array would hold all of the text files, and the other would hold a binary file from which I can read, say, 1000 lines at a time. I would pass each thread a struct that contains the id, file pointer, and file position, and let it crank through its chunk.
A little bit of what I am talking about:
pthread_create(&threads[index], NULL, workerThread, (void *)&threadFields[index]); //Pass a pointer to each worker's struct
Struct:
typedef struct threadFields{
    int *id, *position;
    FILE *Fin;
}tField;
If anyone has any insight or a better solution, it would be greatly appreciated.
EDIT:
Okay so I found a solution to my problem and I believe it is similar to what SaveTheRbtz suggested. Here is what I implemented:
I took the files and merged them into one binary file and kept track of it in the loop (I had to account for how many bytes each entry was; this was hard-coded):
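struct threadFields *info = threadStruct;
int index;
int id = info->id;
unsigned int currentNum = 0;
int Seek = info->StartPos;
unsigned int localLargestPrime = 0;
char *buffer = malloc(50);
int isPrime = 0;
while(Seek < info->EndPos){
    //Process up to 1000 records per batch without running past EndPos
    for(index = 0; index < 1000 && Seek < info->EndPos; index++){
        //Each record is 20 bytes (hard-coded). If fileOut is shared between
        //threads, the fseek/fgets pair needs a mutex around it.
        fseek(fileOut, Seek*sizeof(char)*20, SEEK_SET);
        Seek++;
        if(fgets(buffer, 20, fileOut) == NULL)
            continue;//Skip a record that could not be read
        currentNum = (unsigned int)strtoul(buffer, NULL, 10);//atoi is undefined above INT_MAX
        if(currentNum > localLargestPrime && currentNum > 0){
            isPrime = ChkPrim(currentNum);
            if(isPrime == 1)
                localLargestPrime = currentNum;
        }
    }
}
free(buffer);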
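ChkPrim is not shown above. As a minimal sketch of what such a function could look like, assuming it takes an unsigned int and returns 1 for a prime and 0 otherwise (which is how the loop uses it), here is a plain trial-division version. The body is an illustration, not the code from the actual program.

```c
/* Hypothetical stand-in for ChkPrim: returns 1 if n is prime, 0 otherwise. */
int ChkPrim(unsigned int n)
{
    if (n < 2)
        return 0;                 /* 0 and 1 are not prime */
    if (n < 4)
        return 1;                 /* 2 and 3 are prime */
    if (n % 2 == 0 || n % 3 == 0)
        return 0;
    /* Remaining candidates have the form 6k-1 or 6k+1. The bound
     * d <= n / d means d*d <= n without risking overflow. */
    for (unsigned int d = 5; d <= n / d; d += 6) {
        if (n % d == 0 || n % (d + 2) == 0)
            return 0;
    }
    return 1;
}
```

Trial division costs roughly sqrt(n) steps per call, which is why the loop above only calls it for numbers that are larger than the current best.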
Can you use ten threads, each of which processes a file specified as an argument? Each thread reads its own file, checks whether each value is larger than the largest prime it has recorded so far and, if so, checks whether the new number is prime. Then, when it's finished, it can return the prime to the coordinator thread. The coordinator thread sits back and waits for the threads to finish, collecting the largest prime from each thread and keeping only the largest. You can probably use 0 as a sentinel value to indicate ‘no primes found (yet)’.
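A rough sketch of the one-thread-per-file idea, in case it helps. The names file_job, file_worker, largest_prime_in_files and is_prime are made up for the illustration, and is_prime is a naive trial-division placeholder for whatever primality test you already have. I'm also assuming one decimal number per line that fits in an unsigned long.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical trial-division test; swap in your own. */
static int is_prime(unsigned long n)
{
    if (n < 2) return 0;
    for (unsigned long d = 2; d <= n / d; d++)
        if (n % d == 0) return 0;
    return 1;
}

/* Per-thread job: path goes in, largest prime comes out (0 = none found). */
struct file_job {
    const char *path;
    unsigned long largest;
};

static void *file_worker(void *arg)
{
    struct file_job *job = arg;
    char line[64];
    FILE *fp = fopen(job->path, "r");

    job->largest = 0;
    if (fp == NULL)
        return NULL;
    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned long n = strtoul(line, NULL, 10);
        /* Only run the expensive test on values that could win. */
        if (n > job->largest && is_prime(n))
            job->largest = n;
    }
    fclose(fp);
    return NULL;
}

/* Coordinator: start one thread per file, join them, keep the maximum. */
unsigned long largest_prime_in_files(const char *const paths[], int count)
{
    pthread_t *tids = malloc(count * sizeof *tids);
    struct file_job *jobs = malloc(count * sizeof *jobs);
    unsigned long best = 0;
    int started = 0;

    if (tids == NULL || jobs == NULL) {
        free(tids);
        free(jobs);
        return 0;
    }
    for (int i = 0; i < count; i++)
        jobs[i].path = paths[i];
    while (started < count &&
           pthread_create(&tids[started], NULL, file_worker, &jobs[started]) == 0)
        started++;
    /* Files whose thread could not be created are processed inline. */
    for (int i = started; i < count; i++)
        file_worker(&jobs[i]);
    for (int i = 0; i < count; i++) {
        if (i < started)
            pthread_join(tids[i], NULL);
        if (jobs[i].largest > best)
            best = jobs[i].largest;
    }
    free(tids);
    free(jobs);
    return best;
}
```

Each thread writes only to its own file_job, and the coordinator reads the results only after pthread_join, so no mutex is needed here.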
I’d have the 11th thread call pthread_exit() immediately. If you want to make coordination problems for yourself, you can, but why make life harder than you have to?
If you absolutely must have 11 threads process 10 files and divvy up the work, then I suppose I would have a set of 10 file streams initially in a queue. The threads would wait on a condition ‘queue not empty’ to get a file stream (mutexes and conditions and all that). When a thread acquires a file stream, it would read one number from the file and push the stream back onto the queue (signalling queue not empty), then process the number. On EOF, a thread would close the file and not push it back onto the queue (so the threads have to detect ‘no file streams left with unread data’). This means that each thread would read about one eleventh of the data, depending on how long the prime calculation takes for the numbers it actually reads. That’s much, much, much trickier to code than a simple one-thread-per-file solution, but it scales (more or less) to an arbitrary number of threads and files. In particular, it could be used to have 7 threads process 10 files, as well as having 17 threads process 10 files.
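For what it's worth, here is a sketch of how the shared-queue variant might be put together. Everything named here (stream_queue, queue_worker, largest_prime_shared, is_prime, and the MAX_STREAMS and MAX_THREADS limits) is invented for the illustration. The `open` counter is one possible way to detect ‘no file streams left with unread data’, and is_prime is again a naive placeholder.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_STREAMS 16   /* assumed upper bound on input files */
#define MAX_THREADS 32   /* assumed upper bound on worker threads */

static int is_prime(unsigned long n)   /* hypothetical placeholder test */
{
    if (n < 2) return 0;
    for (unsigned long d = 2; d <= n / d; d++)
        if (n % d == 0) return 0;
    return 1;
}

struct stream_queue {
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    FILE *items[MAX_STREAMS];   /* ring buffer of idle streams */
    int head, count;            /* count = streams waiting in the queue */
    int open;                   /* streams not yet at EOF, queued or in use */
    unsigned long best;         /* 0 means no prime found (yet) */
};

static void *queue_worker(void *arg)
{
    struct stream_queue *q = arg;
    unsigned long local_best = 0;
    char line[64];

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && q->open > 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->open == 0) {             /* no stream has unread data */
            pthread_mutex_unlock(&q->lock);
            break;
        }
        FILE *fp = q->items[q->head];
        q->head = (q->head + 1) % MAX_STREAMS;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        /* Only this thread holds fp now, so the read needs no lock. */
        int got_line = fgets(line, sizeof line, fp) != NULL;
        pthread_mutex_lock(&q->lock);
        if (got_line) {                 /* push back, signal not-empty */
            q->items[(q->head + q->count) % MAX_STREAMS] = fp;
            q->count++;
            pthread_cond_signal(&q->not_empty);
        } else if (--q->open == 0) {    /* EOF on the last open stream */
            pthread_cond_broadcast(&q->not_empty);
        }
        pthread_mutex_unlock(&q->lock);
        if (!got_line) {
            fclose(fp);
            continue;
        }
        unsigned long n = strtoul(line, NULL, 10);
        if (n > local_best && is_prime(n))
            local_best = n;
    }
    pthread_mutex_lock(&q->lock);       /* publish this thread's result */
    if (local_best > q->best)
        q->best = local_best;
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

unsigned long largest_prime_shared(const char *const paths[], int nfiles, int nthreads)
{
    struct stream_queue q = { .head = 0 };   /* remaining fields start at zero */
    pthread_t tids[MAX_THREADS];
    int started = 0;

    if (nfiles > MAX_STREAMS || nthreads > MAX_THREADS)
        return 0;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    for (int i = 0; i < nfiles; i++) {
        FILE *fp = fopen(paths[i], "r");
        if (fp != NULL) {
            q.items[q.count++] = fp;
            q.open++;
        }
    }
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, queue_worker, &q) == 0)
        started++;
    if (started == 0)
        queue_worker(&q);               /* no thread available: drain inline */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&q.lock);
    pthread_cond_destroy(&q.not_empty);
    return q.best;
}
```

The prime test runs outside the lock, so the threads only contend while popping or pushing a stream. The same function would handle 7 or 17 threads over 10 files, for example `largest_prime_shared(paths, 10, 7)`.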