Editorial Team
Asked: June 6, 2026 (2026-06-06T22:31:21+00:00)

The algorithm is described in this paper: Thread Scheduling for Multiprogrammed Multiprocessors. Briefly, a computation is distributed across processes, and each process has a deque of threads to work on. A process can push (pop) threads to (from) the bottom of its own deque, and other processes can steal work from it by popping threads from the top. Thus, work can be created dynamically by the push operation. The algorithm is the following.

Scheduling Algorithm
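For readers without the paper to hand, the three deque operations can be sketched in C++ roughly as follows. This is a reconstruction under assumptions, not the paper's listing: the names `Deque`, `Result` and `kCapacity`, the fixed capacity without an overflow check, the packing of the age word (tag in the high 32 bits, top in the low 32 bits) and the use of `std::atomic` with its default sequentially consistent ordering are all choices made for this sketch. Line numbers quoted in the question refer to the paper's pseudocode, not to this code; the first two statements of `popTop()` below are the two loads (age, then bot) that the question is about.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical result type for a steal attempt (not from the paper).
enum class Result { Ok, Empty, Abort };

class Deque {
public:
    static constexpr std::uint32_t kCapacity = 1024;  // assumed; no overflow check

    // Owner only: push a task id onto the bottom.
    void pushBottom(int task) {
        std::uint32_t b = bot_.load();
        deq_[b].store(task);
        bot_.store(b + 1);
    }

    // Owner only: pop from the bottom. Returns false if the deque is empty
    // or a thief won the race for the last task.
    bool popBottom(int& out) {
        std::uint32_t b = bot_.load();
        if (b == 0) return false;
        --b;
        bot_.store(b);
        int task = deq_[b].load();
        std::uint64_t oldAge = age_.load();
        if (b > top(oldAge)) { out = task; return true; }  // no thief can reach this slot
        bot_.store(0);                                     // deque is now empty: reset it
        std::uint64_t newAge = pack(tag(oldAge) + 1, 0);   // bump tag so stale thieves fail
        if (b == top(oldAge) &&
            age_.compare_exchange_strong(oldAge, newAge)) {
            out = task;                                    // owner won the last task
            return true;
        }
        age_.store(newAge);                                // a thief took it first
        return false;
    }

    // Any thief: try to take the task at the top.
    Result popTop(int& out) {
        std::uint64_t oldAge = age_.load();                // first load
        std::uint32_t b = bot_.load();                     // second load
        if (b <= top(oldAge)) return Result::Empty;
        int task = deq_[top(oldAge)].load();
        std::uint64_t newAge = pack(tag(oldAge), top(oldAge) + 1);
        if (age_.compare_exchange_strong(oldAge, newAge)) {
            out = task;
            return Result::Ok;
        }
        return Result::Abort;                              // age changed since the snapshot
    }

private:
    static std::uint64_t pack(std::uint32_t tag, std::uint32_t top) {
        return (static_cast<std::uint64_t>(tag) << 32) | top;
    }
    static std::uint32_t tag(std::uint64_t age) { return static_cast<std::uint32_t>(age >> 32); }
    static std::uint32_t top(std::uint64_t age) { return static_cast<std::uint32_t>(age); }

    std::atomic<std::uint64_t> age_{0};
    std::atomic<std::uint32_t> bot_{0};
    std::atomic<int> deq_[kCapacity]{};
};
```

Packing tag and top into one 64-bit word lets a single compare-and-swap cover both fields. The slots are atomics here only so that the sketch has no data race under the C++ memory model; a CUDA version would need its own choice of atomics and fences.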

My question is about the popTop() work-stealing function. I don’t think it works properly in all cases. For example, suppose process A owns queue Q and process B is trying to steal work from Q by calling popTop(). Suppose also that B is preempted after line 2 of popTop(), with localBot = X at that moment. If A then runs and calls popBottom() until the bottom of Q is <= X, then when B resumes it will get a thread that has already been processed by A.

Is my reasoning correct? I need to verify this because I plan to implement the algorithm for work balancing in a CUDA program.

1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 10:31 pm

    The code uses cas() (compare-and-swap) to prevent exactly the situation you describe. If popTop() is preempted after line 2, it stops after having read age and bot. If popBottom() then runs and returns the thread that B was about to steal, it will have updated the fields within age and written the new version back into memory using cas(). When B resumes and calls cas(), the instruction finds that the value B has provided for the old age does not match the value in memory, which means that this call to cas() does not modify memory. So B finds that oldAge != newAge and returns ABORT. If, on the other hand, popBottom() has only taken threads nearer the bottom, then as I read the algorithm age is untouched, and B's cas() succeeds on the top thread, which A has not taken. After an ABORT you would normally try again and hope for better luck next time. The article seems to be saying that calls to yield() are necessary for you to have decent luck, but in any case popTop() should not return a thread that somebody else has grabbed.

    There is of course a Wikipedia article on cas() at http://en.wikipedia.org/wiki/Compare-and-swap.
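To make the cas() argument concrete, here is a minimal C++ sketch that replays the interleaving from the question on a bare age word. It is an illustration under assumptions, not code from the paper: `makeAge` and `staleCasIsRejected` are hypothetical names, the age word is assumed to pack a tag in the high 32 bits and top in the low 32 bits, and the owner's update is reduced to a plain store so the example stays single-threaded and deterministic.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical packing of the age word: high 32 bits = tag, low 32 bits = top.
inline std::uint64_t makeAge(std::uint32_t tag, std::uint32_t top) {
    return (static_cast<std::uint64_t>(tag) << 32) | top;
}

// Replays the scenario: B snapshots age, A empties the deque, B resumes.
// Returns true if B's compare-and-swap is rejected, which is the case in
// which popTop() would return ABORT instead of a thread.
inline bool staleCasIsRejected() {
    std::atomic<std::uint64_t> age(makeAge(0, 0));

    // B (thief): read age, then get preempted before reaching its cas.
    std::uint64_t oldAge = age.load();
    std::uint64_t newAge = makeAge(0, 1);  // B intends to advance top from 0 to 1

    // A (owner): takes the last thread, resets top to 0 and bumps the tag.
    age.store(makeAge(1, 0));

    // B resumes: top is 0 again, but the tag differs, so the cas fails
    // and memory is left as A wrote it.
    bool swapped = age.compare_exchange_strong(oldAge, newAge);
    return !swapped && age.load() == makeAge(1, 0);
}
```

Note that top has the same value before and after A's update; it is the tag that makes B's stale snapshot detectable. A caller that sees ABORT would typically retry the steal, possibly yielding first.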

    I would place parallel code using locks one level of difficulty above serial code, and lock-free parallel code one level of difficulty above locking parallel code. I would not write lock-free parallel code unless I knew for sure that I needed performance, and there was no existing known trustworthy code that I could reuse. I would not trust such code until I had tested it exhaustively, and I would actually prefer to have model-checking if possible as well.
