The Archive Base


Editorial Team
Asked: June 13, 2026 (2026-06-13T18:08:19+00:00)


I have tested the performance of the slim reader/writer lock under Windows 7 using the code from Windows via C/C++.

The result surprised me: the exclusive lock outperforms the shared one. Here are the code and the result.

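#include <windows.h>
#include <process.h>

// Assumed declarations (not shown in the original post): both thread
// functions use a lock and a counter declared outside of them.
SRWLOCK srwLock = SRWLOCK_INIT;
volatile int g_value = 0;

// Acquires and releases the lock in exclusive mode one million times.
unsigned int __stdcall slim_reader_writer_exclusive(void *arg)
{
    //SRWLOCK srwLock;
    //InitializeSRWLock(&srwLock);

    for (int i = 0; i < 1000000; ++i) {
        AcquireSRWLockExclusive(&srwLock);
        g_value = 0;
        ReleaseSRWLockExclusive(&srwLock);
    }
    _endthreadex(0);
    return 0;
}

// Runs the same loop in shared mode. Note that it still writes g_value;
// the read is commented out.
unsigned int __stdcall slim_reader_writer_shared(void *arg)
{

    int b;
    for (int i = 0; i < 1000000; ++i) {
        AcquireSRWLockShared(&srwLock);
        //b = g_value;
        g_value = 0;
        ReleaseSRWLockShared(&srwLock);
    }
    _endthreadex(0);
    return 0;
}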

g_value is a global volatile int variable.

[Image: benchmark results]

Could you kindly explain why this could happen?

1 Answer

    Editorial Team
    Added an answer on June 13, 2026 at 6:08 pm

    This is a pretty common result for small general-purpose locks (like SRWLocks, which are only one pointer in size).

    Key Takeaway: If you have an extremely small guarded section of code, such that the overhead of the lock itself might be dominant, an exclusive lock is better to use than a shared lock.

    Also, Raymond Chen’s argument about the contention on g_value is true as well. If g_value were read instead of written in both cases, you might notice a benefit for the shared lock.

    Details:

    The SRW lock is implemented using a single pointer-sized atomic variable which can take on a number of different states, depending on the values of the low bits. Describing how all of these bits are used is out of scope for this comment (the number of state transitions is pretty high), so I’ll mention only a few states that you may be encountering in your test.

    Initial lock state: (0, ControlBits: 0). An SRW lock starts with all bits set to 0.

    Shared state: (ShareCount: n, ControlBits: 1). When there is no conflicting exclusive acquire and the lock is held shared, the share count is stored directly in the lock variable.

    Exclusive state: (ShareCount: 0, ControlBits: 1). When there is no conflicting shared acquire or exclusive acquire, the lock has a low bit set and nothing else.

    Example contended state: (WaitPtr: ptr, ControlBits: 3). When there is a conflict, the threads that are waiting for the lock form a queue using data allocated on the waiting threads’ stacks. The lock variable stores a pointer to the tail of the queue instead of a share count.
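The states listed above can be sketched as an encoding of a single pointer-sized word. This is only an illustration: the real SRW lock layout is undocumented, and the names `kLockedBit`, `kWaitingBit`, `kShareShift`, `shared_word`, and `classify`, as well as the choice of four control bits, are assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical layout: low bits are control bits, the remaining bits hold
// either a share count or (when contended) a pointer to a wait block.
constexpr std::uintptr_t kLockedBit  = 0x1;  // lock is held, shared or exclusive
constexpr std::uintptr_t kWaitingBit = 0x2;  // waiters are queued
constexpr unsigned kShareShift = 4;          // assumed number of control bits

enum class LockState { Free, Shared, Exclusive, Contended };

// Builds the word for "held shared by n threads, no waiters".
constexpr std::uintptr_t shared_word(std::uintptr_t n) {
    return (n << kShareShift) | kLockedBit;
}

// Maps a lock word to one of the four states described in the text.
inline LockState classify(std::uintptr_t word) {
    if (word == 0) return LockState::Free;
    if (word & kWaitingBit) return LockState::Contended;
    if ((word >> kShareShift) == 0) return LockState::Exclusive;
    return LockState::Shared;
}
```

With this encoding, the exclusive state is the word `1`, a lock shared by three threads is `shared_word(3)`, and any word with both low bits set is treated as contended.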

    In this scheme, trying to acquire an exclusive lock when you don’t know the initial state takes a single atomic read-modify-write on the lock word, which sets the low bit and retrieves the old value (on x86 this can be done with a LOCK BTS instruction). If you succeed (as you always will in the single-threaded case), you can proceed into the locked region with no further operations.
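A minimal sketch of that exclusive fast path in portable C++, assuming the low bit of the word means "held". The function name `try_acquire_exclusive` and the constant `kLockedBit` are hypothetical, and this is not the actual SRW lock code. Whether `fetch_or` is compiled to LOCK BTS or to some other atomic sequence depends on the compiler and target.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uintptr_t kLockedBit = 0x1;  // assumed meaning: lock is held

// One unconditional atomic read-modify-write: set the low bit and look at
// the previous value. No prior load of the lock word is needed.
inline bool try_acquire_exclusive(std::atomic<std::uintptr_t>& word) {
    std::uintptr_t old = word.fetch_or(kLockedBit, std::memory_order_acquire);
    // If the bit was already set, someone else holds the lock and the word
    // is unchanged; a real lock would now enter its slow path (not shown).
    return (old & kLockedBit) == 0;
}
```

In a single-threaded loop the previous value is always 0, so every acquire completes with this one instruction.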

    Trying to acquire a shared lock is a more involved operation: You need to first read the initial value of the lock variable to determine the old share count, increment the share count you read, and then write the updated value back conditionally with the LOCK CMPXCHG instruction. This is a noticeably longer chain of serially-dependent instructions, so it is slower. Also, LOCK CMPXCHG is a bit slower on many processors than the unconditional atomic instructions like LOCK BTS.
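The read, increment, and conditional write described above can be sketched as a compare-and-swap loop. The encoding (share count above four control bits) and the names `try_acquire_shared`, `kLockedBit`, `kWaitingBit`, and `kShareShift` are assumptions for illustration, not the real implementation.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uintptr_t kLockedBit  = 0x1;  // assumed: lock is held
constexpr std::uintptr_t kWaitingBit = 0x2;  // assumed: waiters are queued
constexpr unsigned kShareShift = 4;          // assumed number of control bits
constexpr std::uintptr_t kOneSharer = std::uintptr_t{1} << kShareShift;

inline bool try_acquire_shared(std::atomic<std::uintptr_t>& word) {
    // Step 1: read the current value to learn the old share count.
    std::uintptr_t old = word.load(std::memory_order_relaxed);
    for (;;) {
        std::uintptr_t desired;
        if (old == 0) {
            desired = kOneSharer | kLockedBit;       // first sharer
        } else if (!(old & kWaitingBit) && (old >> kShareShift) != 0) {
            desired = old + kOneSharer;              // step 2: increment
        } else {
            return false;  // held exclusively or contended: slow path not shown
        }
        // Step 3: write back only if nobody changed the word in between.
        if (word.compare_exchange_weak(old, desired,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
            return true;
        }
        // On failure 'old' holds the fresh value, so the loop recomputes.
    }
}
```

Each step depends on the result of the previous one, which is the serial dependency chain the paragraph refers to.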

    It would be possible in theory to speed up the first shared acquire of the lock by assuming that the lock was in its initial state at the beginning and performing the LOCK CMPXCHG first. This would speed up the initial shared acquire of the lock (all of them in your single-threaded case), but it would pretty significantly slow down the cases where the lock is already held shared and a second shared acquire occurs.
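A sketch of that hypothetical optimistic variant, under the same assumed encoding (share count above four control bits). The only change from a read-first loop is that the initial load is replaced by a guess of 0; the name `try_acquire_shared_optimistic` is made up for this example.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uintptr_t kLockedBit  = 0x1;  // assumed: lock is held
constexpr std::uintptr_t kWaitingBit = 0x2;  // assumed: waiters are queued
constexpr unsigned kShareShift = 4;          // assumed number of control bits
constexpr std::uintptr_t kOneSharer = std::uintptr_t{1} << kShareShift;

inline bool try_acquire_shared_optimistic(std::atomic<std::uintptr_t>& word) {
    std::uintptr_t old = 0;  // guess "lock is free" instead of loading first
    for (;;) {
        std::uintptr_t desired;
        if (old == 0) {
            desired = kOneSharer | kLockedBit;
        } else if (!(old & kWaitingBit) && (old >> kShareShift) != 0) {
            desired = old + kOneSharer;
        } else {
            return false;  // exclusive or contended: slow path not shown
        }
        if (word.compare_exchange_weak(old, desired,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
            return true;
        }
        // A wrong guess costs one failed atomic operation before the real
        // value is known; this is the penalty for already-shared locks.
    }
}
```

When the guess is right, the acquire is a single atomic operation; when the lock is already held shared, the first compare-and-swap is wasted and a second one is needed.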

    A similar set of divergent operations occurs when the lock is being released, so the extra cost of managing the shared state is also paid on the ReleaseSRWLockShared side.
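The release side can be sketched the same way, again under an assumed encoding and only for the uncontended case (no queued waiters). The names `release_exclusive` and `release_shared` are hypothetical; the real ReleaseSRWLockExclusive and ReleaseSRWLockShared also have to handle waking waiters.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uintptr_t kLockedBit = 0x1;  // assumed: lock is held
constexpr unsigned kShareShift = 4;         // assumed number of control bits
constexpr std::uintptr_t kOneSharer = std::uintptr_t{1} << kShareShift;

// Uncontended exclusive release: one unconditional atomic operation.
inline void release_exclusive(std::atomic<std::uintptr_t>& word) {
    word.fetch_and(~kLockedBit, std::memory_order_release);
}

// Uncontended shared release: read the count, then conditionally write
// back either the decremented count or 0 if this was the last sharer.
inline void release_shared(std::atomic<std::uintptr_t>& word) {
    std::uintptr_t old = word.load(std::memory_order_relaxed);
    for (;;) {
        std::uintptr_t count = old >> kShareShift;
        std::uintptr_t desired = (count > 1) ? old - kOneSharer : 0;
        if (word.compare_exchange_weak(old, desired,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
            return;
        }
    }
}
```

As on the acquire side, the shared path needs a load plus a compare-and-swap where the exclusive path needs a single unconditional operation.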
