I’m developing a ray tracer in C++ using SDL and pthreads. I’m having trouble making my program use two cores. The threads work, but they don’t drive both cores to 100%. To interface with SDL I write directly to its memory, SDL_Surface.pixels, so I assume that it can’t be SDL locking me.
My thread function looks like this:
    void* renderLines(void* pArg){
        while(true){
            //Synchronize
            pthread_mutex_lock(&frame_mutex);
            pthread_cond_wait(&frame_cond, &frame_mutex);
            pthread_mutex_unlock(&frame_mutex);

            renderLinesArgs* arg = (renderLinesArgs*)pArg;
            for(int y = arg->y1; y < arg->y2; y++){
                for(int x = 0; x < arg->width; x++){
                    Color C = arg->scene->renderPixel(x, y);
                    putPixel(arg->screen, x, y, C);
                }
            }

            sem_post(&frame_rendered);
        }
    }
Note: scene->renderPixel is const, so I assume both threads can read from the same memory. I have two worker threads doing this. In my main loop I make them work using:
    //Signal a new frame
    pthread_mutex_lock(&frame_mutex);
    pthread_cond_broadcast(&frame_cond);
    pthread_mutex_unlock(&frame_mutex);

    //Wait for workers to be done
    sem_wait(&frame_rendered);
    sem_wait(&frame_rendered);

    //Unlock SDL surface and flip it...
Note: I’ve also tried creating and joining the threads instead of synchronizing them. I compile this with ‘-lpthread -D_POSIX_PTHREAD_SEMANTICS -pthread’ and gcc does not complain.
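For reference, the create-and-join variant mentioned above could look roughly like the sketch below. It is not the original code: BandArgs, renderBand and renderFrame are hypothetical names, a plain integer buffer stands in for the SDL surface, and the expression x + y stands in for scene->renderPixel(x, y).

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical stand-in for renderLinesArgs: a row range plus the target buffer.
struct BandArgs {
    int y1, y2, width;
    std::vector<int>* pixels;
};

// Each thread fills only its own rows, so no locking is needed.
void* renderBand(void* pArg) {
    BandArgs* arg = static_cast<BandArgs*>(pArg);
    for (int y = arg->y1; y < arg->y2; y++) {
        for (int x = 0; x < arg->width; x++) {
            // Placeholder for scene->renderPixel(x, y) and putPixel(...).
            (*arg->pixels)[y * arg->width + x] = x + y;
        }
    }
    return nullptr;
}

// Render one frame: create one thread per band of rows, then join them all.
std::vector<int> renderFrame(int width, int height, int workers) {
    std::vector<int> pixels(width * height, 0);
    std::vector<pthread_t> tid(workers);
    std::vector<BandArgs> args(workers);

    for (int i = 0; i < workers; i++) {
        args[i].y1 = i * height / workers;
        args[i].y2 = (i + 1) * height / workers;
        args[i].width = width;
        args[i].pixels = &pixels;
        int rc = pthread_create(&tid[i], nullptr, renderBand, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    // Joining is the only synchronization point in this variant.
    for (int i = 0; i < workers; i++) {
        pthread_join(tid[i], nullptr);
    }
    return pixels;
}
```

Calling renderFrame(width, height, 2) once per frame replaces the condition variable and semaphore entirely, at the cost of creating two threads per frame.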
My problem is best illustrated using a graph of the CPU usage during execution: 
(source: jopsen.dk)
As can be seen from the graph, my program only uses one core at a time, switching between the two every once in a while, but it never drives both to 100%. What in the world have I done wrong? I’m not using any mutexes or semaphores in scene. What can I do to find the bug?
Also, if I put while(true) around scene->renderPixel(), I can push both cores to 100%. So I suspected that this is caused by overhead, but I only synchronize every 0.5 second (e.g. FPS: 0.5), given a complex scene. I realize it might not be easy to tell me what my bug is, but an approach to debugging this would be great too. I haven’t played with pthreads before.
Also, could this be a hardware or kernel issue? My kernel is:
    $ uname -a
    Linux jopsen-laptop 2.6.27-14-generic #1 SMP Fri Mar 13 18:00:20 UTC 2009 i686 GNU/Linux
Note:
This is useless:
If you want to wait for a new frame, do something like:
int new_frame = 0;
First thread:
Other thread:
pthread_cond_wait() actually releases the mutex and deschedules the thread until the condition is signaled. When the condition is signaled, the thread is woken up and the mutex is re-acquired. All of this happens inside the pthread_cond_wait() function.
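A minimal sketch of waiting on a new_frame flag in this way, assuming the flag is treated as a frame counter that each worker compares against the last frame it rendered. The names worker, run_frames, done_cond, finished, quit and rendered_total are hypothetical, and a second condition variable is used here in place of the semaphore from the question. The point is that pthread_cond_wait() sits inside a loop that re-checks the predicate, so a broadcast sent before a worker starts waiting is not lost.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// frame_mutex, frame_cond and new_frame follow the post; the rest are assumptions.
static pthread_mutex_t frame_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  frame_cond  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  done_cond   = PTHREAD_COND_INITIALIZER;
static int  new_frame = 0;       // number of frames requested so far
static int  finished = 0;        // workers done with the current frame
static bool quit = false;        // tells workers to exit
static int  rendered_total = 0;  // bands rendered, for demonstration only

void* worker(void*) {
    int seen = 0;  // last frame number this worker has rendered
    for (;;) {
        pthread_mutex_lock(&frame_mutex);
        // Re-check the predicate after every wakeup; the mutex is held here.
        while (new_frame == seen && !quit)
            pthread_cond_wait(&frame_cond, &frame_mutex);
        if (quit) {
            pthread_mutex_unlock(&frame_mutex);
            return nullptr;
        }
        seen = new_frame;
        pthread_mutex_unlock(&frame_mutex);

        // ... render this worker's rows here, without holding the mutex ...

        pthread_mutex_lock(&frame_mutex);
        ++finished;
        ++rendered_total;
        pthread_cond_signal(&done_cond);
        pthread_mutex_unlock(&frame_mutex);
    }
}

// Start `workers` threads, request `frames` frames, then shut the workers down.
int run_frames(int workers, int frames) {
    new_frame = 0; finished = 0; quit = false; rendered_total = 0;
    std::vector<pthread_t> tid(workers);
    for (int i = 0; i < workers; ++i) {
        int rc = pthread_create(&tid[i], nullptr, worker, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    for (int f = 0; f < frames; ++f) {
        pthread_mutex_lock(&frame_mutex);
        finished = 0;
        ++new_frame;                          // change the predicate first...
        pthread_cond_broadcast(&frame_cond);  // ...then wake the workers
        while (finished < workers)
            pthread_cond_wait(&done_cond, &frame_mutex);
        pthread_mutex_unlock(&frame_mutex);
    }
    pthread_mutex_lock(&frame_mutex);
    quit = true;
    pthread_cond_broadcast(&frame_cond);
    pthread_mutex_unlock(&frame_mutex);
    for (int i = 0; i < workers; ++i)
        pthread_join(tid[i], nullptr);
    return rendered_total;
}
```

With two workers, run_frames(2, 5) renders five frames and each worker handles every frame exactly once, whether or not it was already waiting when the broadcast was sent.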