Using : C++ (MinGW), Qt4.7.4, Vista (OS), intel core2vPro
I need to process 2 huge files in exactly the same way. So i would like to call the processing routine from 2 separate threads for 2 separate files. The GUI thread does nothing heavy; just displays a label and runs an event loop to check for emission of thread termination conditions and quits the main Application accordingly. I expected this to utilize the two cores (intel core2) somewhat equally, but on the contrary i see from Task Manager that one of the cores is highly utilized and the other is not (though not every time i run the code); also the time taken to process the 2 files is much more than the time taken to process one file (i thought it should have been equal or a little more but this is almost equal to processing the 2 files one after another in a non-threaded application). Can i somehow force the threads to use the cores that i specify?
QThread* ptrThread1=new QThread;
QThread* ptrThread2=new QThread;
ProcessTimeConsuming* ptrPTC1=new ProcessTimeConsuming();
ProcessTimeConsuming* ptrPTC2=new ProcessTimeConsuming();
ptrPTC1->moveToThread(ptrThread1);
ptrPTC2->moveToThread(ptrThread2);
//make connections to specify what to do when processing ends, threads terminate etc
//display some label to give an idea that the code is in execution
ptrThread1->start();
ptrThread2->start(); //i want this thread to be executed in the core other than the one used above
ptrQApplication->exec(); //GUI event loop for label display and signal-slot monitoring
Reading in parallel from a single mechanical disk often times (and probably in your case) will not yield any performance gain, since the mechanical head of the disk needs to spin every time to seek the next reading location, effectively making your reads sequential. Worse, if a lot of threads are trying to read, the performance may even degrade with respect to the sequential version, because the disk head is bounced to different locations of the disk and thus needs to spin back where it left off every time.
Generally, you cannot do better than reading the files in a sequence and then processing them in parallel using perhaps a producer-consumer model.