The Archive Base

Editorial Team
Asked: May 12, 2026 (2026-05-12T18:29:58+00:00)

x86 and other architectures provide special atomic instructions (lock, cmpxchg, etc.) that allow you


x86 and other architectures provide special atomic instructions (the lock prefix, cmpxchg, etc.) that allow you to write ‘lock-free’ data structures. But as more and more cores are added, it seems that the work these instructions have to do behind the scenes will grow (at least to maintain cache coherency). If an atomic add takes ~100 cycles today on a dual-core system, might it take significantly longer on the 80+ core machines of the future? If you’re writing code to last, might it actually be a better idea to use locks, even if they’re slower today?


1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 6:29 pm

    You are right that topology constraints will, one way or another, increase the latency of communication between cores once core counts climb past a couple dozen. I don’t really know how the x86 vendors intend to deal with that sort of scaling.

    But locks are themselves implemented in terms of atomic operations. So you don’t really win by switching to them, unless they are implemented in a more scalable way than what you would attempt with your own hand-rolled atomic operations. I think that, in general, for contention over a single token-like resource, atomic primitives will remain the fastest option regardless of how many cores you have.
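To make the point about locks being built on atomics concrete, here is a minimal sketch in C++. The `SpinLock` class and the `count_with_spinlock` helper are hypothetical names made up for this illustration, not anything from a real library, and production locks (such as `std::mutex`) add backoff and OS-level sleeping on top of the same idea. Note that every `lock()` call still performs at least one atomic read-modify-write on a shared cache line, which is the same kind of traffic a hand-rolled atomic would generate.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock: the entire lock is one atomic flag.
class SpinLock {
public:
    void lock() {
        // exchange() is an atomic read-modify-write; it returns the old value,
        // so we own the lock only when the old value was false.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // While waiting, spin on a plain load so we do not keep issuing
            // read-modify-write operations against the contended line.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Illustrative helper: several threads increment a plain counter under the lock.
long count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;  // not atomic; protected by the spinlock
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

For a single shared counter like this, a direct `std::atomic<long>::fetch_add` would do the same job with one atomic operation per increment instead of a lock/unlock pair, which is the sense in which the lock does not save you any coherency traffic.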

    As Cray discovered a long time ago, there’s no free lunch here. High-level software design that touches potentially contended resources as infrequently as possible will always give the biggest payoff in massively parallel applications. This means doing as much work as possible per lock acquisition, while still holding the lock as briefly as possible. In extreme situations, this can mean pre-calculating your work on the assumption that the lock will be acquired, trying to grab it, and completing as fast as possible on success; on failure, you throw the work away and retry.
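The "pre-calculate, try to grab, retry on failure" pattern described above is commonly expressed as a compare-and-swap loop. The sketch below is one possible reading of it, not the answer author's own code: `optimistic_update` and `run_optimistic_demo` are hypothetical names, and the single `compare_exchange_weak` call plays the role of both "grabbing the lock" and "completing the work".

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: computes the new value from a snapshot first, then
// tries to publish it with a single compare-and-swap.
template <typename F>
long optimistic_update(std::atomic<long>& shared, F compute) {
    long seen = shared.load(std::memory_order_relaxed);
    for (;;) {
        // All the work happens here, before touching the contended resource.
        long proposed = compute(seen);
        // Publish only if nobody changed the value since we read it.
        // On failure, `seen` is refreshed with the current value, the
        // proposed result is discarded, and we recompute.
        if (shared.compare_exchange_weak(seen, proposed,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
            return proposed;
        }
    }
}

// Illustrative driver: each thread repeatedly adds 2 via the optimistic path.
long run_optimistic_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::atomic<long> shared{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&shared, iterations] {
            for (int i = 0; i < iterations; ++i) {
                optimistic_update(shared, [](long v) { return v + 2; });
            }
        });
    }
    for (auto& w : workers) w.join();
    return shared.load();
}
```

The trade-off is that wasted recomputation grows with contention, so this only pays off when the computed step is cheap or conflicts are rare, which ties back to the point about touching shared state as infrequently as possible.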


© 2021 The Archive Base. All Rights Reserved