The Archive Base


Editorial Team
Asked: May 27, 2026

I have implemented a work queue pattern in C (within a Python extension) and I am disappointed with its performance.

I have a simulation with a list of particles (“elements”). I benchmark the time taken to perform all the calculations required for a timestep and record it along with the number of particles involved. I am running the code on a quad-core hyperthreaded i7, so I expected performance to rise (time taken to fall) with the number of threads up to about 8. Instead, the fastest implementation has no worker threads (functions are simply executed instead of being added to the queue), and the code gets slower with each worker thread added: every new thread costs more than the total time of the unthreaded implementation! I’ve had a quick look at my processor usage monitor, and Python never seems to exceed 130% CPU usage, regardless of how many threads are running. The machine has plenty of headroom above that, with overall system usage at about 200%.

Part of my queue implementation (shown below) chooses an item at random from the queue, since each work item’s execution requires a lock on two elements, and similar elements will be near each other in the queue. I therefore want the threads to pick random indices and attack different parts of the queue to minimise mutex clashes.

I’ve read that my initial attempt with rand() will have been slow because rand() isn’t thread-safe (does that sentence make sense? Not sure…)

I’ve tried the implementation with both random() and drand48_r (although, unfortunately, the latter seems to be unavailable on OS X), and neither made any difference to the timings.
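For reference, one way to take shared generator state out of the picture on platforms without drand48_r is to give each worker its own small generator. The sketch below is only an illustration under assumptions: rng_state_t, rng_seed, rng_next and rng_index are hypothetical names that do not appear in the code in this question, and xorshift32 is swapped in for random().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread generator state. Each worker owns one instance,
   so no locking is needed around it. */
typedef struct {
  uint32_t s;
} rng_state_t;

/* Seed the generator. xorshift32 must never hold the state 0,
   so a zero seed is replaced with an arbitrary non-zero constant. */
static void rng_seed(rng_state_t* rng, uint32_t seed) {
  rng->s = seed ? seed : 0x9E3779B9u;
}

/* One xorshift32 step (the 13/17/5 shift variant). */
static uint32_t rng_next(rng_state_t* rng) {
  uint32_t x = rng->s;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  rng->s = x;
  return x;
}

/* Index in [0, count). count must be positive. Modulo bias is ignored,
   which is acceptable when the goal is only to spread threads out. */
static int rng_index(rng_state_t* rng, int count) {
  assert(count > 0);
  return (int)(rng_next(rng) % (uint32_t)count);
}
```

A worker would declare an rng_state_t on its own stack, seed it once (for example from its worker number), and call rng_index(&rng, queue->queue->count) where the loop currently calls random().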

Perhaps someone else can tell me what might be causing the problem? The code (worker function) is below; do shout if you think any of the queue_add functions or constructors would be useful to see too.

void* worker_thread_function(void* untyped_queue) {

  queue_t* queue = (queue_t*)untyped_queue;
  int success = 0;
  int rand_id;
  long int temp;
  work_item_t* work_to_do = NULL;
  int work_items_completed = 0;

  while (1) {
    if (pthread_mutex_lock(queue->mutex)) {

      // error case, try again:
      continue;
    }

    while (!success) {

      if (queue->queue->count == 0) {

        pthread_mutex_unlock(queue->mutex);
        break;
      }

      // choose a random item from the work queue, in order to avoid clashing element mutexes.
      rand_id = random() % queue->queue->count;

      if (!pthread_mutex_trylock(((work_item_t*)queue->queue->items[rand_id])->mutex)) {

        // obtain mutex locks on both elements for the work item.
        work_to_do = (work_item_t*)queue->queue->items[rand_id];

        if (!pthread_mutex_trylock(((element_t*)work_to_do->element_1)->mutex)){ 
          if (!pthread_mutex_trylock(((element_t*)work_to_do->element_2)->mutex)) {

            success = 1;
          } else {

            // only locked element_1 and work item:
            pthread_mutex_unlock(((element_t*)work_to_do->element_1)->mutex);
            pthread_mutex_unlock(work_to_do->mutex);
            work_to_do = NULL;
          }
        } else {

          // couldn't lock element_1, didn't even try 2:
          pthread_mutex_unlock(work_to_do->mutex);
          work_to_do = NULL;
        }
      }
    }

    if (work_to_do == NULL) {
      if (queue->queue->count == 0 && queue->exit_flag) {

        break;
      } else {

        continue;
      }
    }

    queue_remove_work_item(queue, rand_id, NULL, 1);
    pthread_mutex_unlock(work_to_do->mutex);

    pthread_mutex_unlock(queue->mutex);

    // At this point, we have mutex locks for the two elements in question, and a
    // work item no longer visible to any other threads. we have also unlocked the main
    // shared queue, and are free to perform the work on the elements.
    execute_function(
      work_to_do->interaction_function,
      (element_t*)work_to_do->element_1,
      (element_t*)work_to_do->element_2,
      (simulation_parameters_t*)work_to_do->params
    );

    // now finished, we should unlock both the elements:
    pthread_mutex_unlock(((element_t*)work_to_do->element_1)->mutex);
    pthread_mutex_unlock(((element_t*)work_to_do->element_2)->mutex);

    // and release the work_item RAM:
    work_item_destroy((void*)work_to_do);
    work_to_do = NULL;

    work_items_completed++;
    success = 0;
  }
  return NULL;
}
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 5:31 am

    random() doesn’t seem to be your problem, since that code is the same regardless of the number of threads. Since performance goes down as the number of threads goes up, you are most likely getting killed by locking overhead. Do you really need multiple threads? How long does the work function take, and what is your average queue depth? Selecting items randomly seems like a bad idea. At the very least, if the queue count is <= 2 you don’t need to do the rand calculation. Also, instead of randomly selecting a queue index, it would be better to use a separate queue per worker thread and insert in a round-robin fashion. Failing that, do something simple like remembering the last index claimed by the previous thread and just not picking that one.
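The queue-per-worker suggestion could look roughly like the sketch below. It is a minimal illustration under assumptions, not the asker's queue_t: worker_pool_t, worker_queue_t, pool_init, pool_submit and pool_take are hypothetical names, the worker count and capacity are fixed for brevity, and a single producer thread is assumed to do all the inserting.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4      /* assumption: fixed worker count */
#define QUEUE_CAPACITY 256 /* assumption: fixed capacity per worker */

/* Hypothetical per-worker FIFO. Only its owning worker takes from it. */
typedef struct {
  pthread_mutex_t mutex;
  void* items[QUEUE_CAPACITY];
  size_t head;  /* index of the oldest item */
  size_t count; /* number of queued items */
} worker_queue_t;

typedef struct {
  worker_queue_t queues[NUM_WORKERS];
  size_t next; /* round-robin cursor, touched only by the producer thread */
} worker_pool_t;

static void pool_init(worker_pool_t* pool) {
  pool->next = 0;
  for (size_t i = 0; i < NUM_WORKERS; i++) {
    pthread_mutex_init(&pool->queues[i].mutex, NULL);
    pool->queues[i].head = 0;
    pool->queues[i].count = 0;
  }
}

/* Producer side: hand the item to the next worker in turn.
   Returns the worker index used, or -1 if that worker's queue is full
   (in which case the cursor is left where it was). */
static int pool_submit(worker_pool_t* pool, void* item) {
  size_t w = pool->next;
  worker_queue_t* q = &pool->queues[w];
  int result = -1;

  pthread_mutex_lock(&q->mutex);
  if (q->count < QUEUE_CAPACITY) {
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = item;
    q->count++;
    result = (int)w;
  }
  pthread_mutex_unlock(&q->mutex);

  if (result >= 0) {
    pool->next = (w + 1) % NUM_WORKERS;
  }
  return result;
}

/* Worker side: take the oldest item from this worker's own queue,
   or return NULL if the queue is empty. */
static void* pool_take(worker_pool_t* pool, size_t worker) {
  assert(worker < NUM_WORKERS);
  worker_queue_t* q = &pool->queues[worker];
  void* item = NULL;

  pthread_mutex_lock(&q->mutex);
  if (q->count > 0) {
    item = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
  }
  pthread_mutex_unlock(&q->mutex);
  return item;
}
```

Each queue's mutex is then shared only between the producer and one worker, so workers no longer compete for a single queue lock. The per-element locks from the question would still be needed, since work items on different queues may refer to the same element.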
