The Archive Base


Editorial Team
Asked: May 14, 2026

So I realize this question sounds stupid (and yes, I am using a dual-core machine), but I have tried two different libraries (Grand Central Dispatch and OpenMP), and when I use clock() to time the code with and without the lines that make it parallel, the speed is the same. (For the record, both were using their own form of parallel for.) They report running on different threads, but perhaps they are running on the same core? Is there any way to check? (Both libraries are for C; I’m uncomfortable at lower layers.) This is super weird. Any ideas?

  1. Editorial Team
    Added an answer on May 14, 2026 at 12:07 am

    EDIT: Added detail for Grand Central Dispatch in response to OP comment.

    While the other answers here are useful in general, the specific answer to your question is that you shouldn’t be using clock() to compare the timing. clock() measures CPU time, which is added up across all the threads of the process. When you split a job between cores, it uses at least as much CPU time as before (usually a bit more, due to threading overhead). Search for clock() on this page to find “If process is multi-threaded, cpu time consumed by all individual threads of process are added.”

    The job is split between threads, so the overall time you have to wait is less, even though the total CPU time is not. You should be measuring wall time (the time on a wall clock) instead. OpenMP provides the routine omp_get_wtime() for this. Take the following routine as an example:

    #include <omp.h>
    #include <time.h>
    #include <math.h>
    #include <stdio.h>
    
    /* Compile with optimization off (for example gcc -O0 -fopenmp, plus -lm)
       so that the unused cos() calls are not optimized away. */
    int main(int argc, char *argv[]) {
        int i, nthreads;
        clock_t clock_timer;   /* CPU time, summed over all threads */
        double wall_timer;     /* elapsed wall-clock time in seconds */
        for (nthreads = 1; nthreads <= 8; nthreads++) {
            clock_timer = clock();
            wall_timer = omp_get_wtime();
            #pragma omp parallel for private(i) num_threads(nthreads)
            for (i = 0; i < 100000000; i++) cos(i);
            printf("%d threads: time on clock() = %.3f, on wall = %.3f\n",
                nthreads,
                (double) (clock() - clock_timer) / CLOCKS_PER_SEC,
                omp_get_wtime() - wall_timer);
        }
        return 0;
    }
    

    The results are:

    1 threads: time on clock() = 0.258, on wall = 0.258
    2 threads: time on clock() = 0.256, on wall = 0.129
    3 threads: time on clock() = 0.255, on wall = 0.086
    4 threads: time on clock() = 0.257, on wall = 0.065
    5 threads: time on clock() = 0.255, on wall = 0.051
    6 threads: time on clock() = 0.257, on wall = 0.044
    7 threads: time on clock() = 0.255, on wall = 0.037
    8 threads: time on clock() = 0.256, on wall = 0.033
    

    You can see that the clock() time barely changes, while the wall time drops with each added thread. I get 0.254 without the pragma, so OpenMP with one thread is slightly slower than no OpenMP at all.

    The improvement won’t always be this good, for example because parts of your calculation aren’t parallel (see Amdahl’s law) or because different threads fight over the same memory.

    EDIT: For Grand Central Dispatch, the GCD reference states that GCD uses gettimeofday for wall time. So I created a new Cocoa app, and in applicationDidFinishLaunching I put:

    // Requires <sys/time.h> (gettimeofday), <math.h> (cos) and <dispatch/dispatch.h>.
    struct timeval t1, t2;
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    for (int iterations = 1; iterations <= 8; iterations++) {
        // Split the 1e8 calls to cos() evenly across the blocks.
        int stride = 1e8/iterations;
        gettimeofday(&t1, 0);
        // dispatch_apply returns only after every block has finished.
        dispatch_apply(iterations, queue, ^(size_t i) {
            for (int j = 0; j < stride; j++) cos(j);
        });
        gettimeofday(&t2, 0);
        NSLog(@"%d iterations: on wall = %.3f\n", iterations,
                    t2.tv_sec+t2.tv_usec/1e6-(t1.tv_sec+t1.tv_usec/1e6));
    }
    

    and I get the following results on the console:

    2010-03-10 17:33:43.022 GCDClock[39741:a0f] 1 iterations: on wall = 0.254
    2010-03-10 17:33:43.151 GCDClock[39741:a0f] 2 iterations: on wall = 0.127
    2010-03-10 17:33:43.236 GCDClock[39741:a0f] 3 iterations: on wall = 0.085
    2010-03-10 17:33:43.301 GCDClock[39741:a0f] 4 iterations: on wall = 0.064
    2010-03-10 17:33:43.352 GCDClock[39741:a0f] 5 iterations: on wall = 0.051
    2010-03-10 17:33:43.395 GCDClock[39741:a0f] 6 iterations: on wall = 0.043
    2010-03-10 17:33:43.433 GCDClock[39741:a0f] 7 iterations: on wall = 0.038
    2010-03-10 17:33:43.468 GCDClock[39741:a0f] 8 iterations: on wall = 0.034
    

    which is about the same as I was getting above.

    This is a very contrived example. In fact, you need to keep the optimization level at -O0, or else the compiler will realize that we don’t keep any of the results and skip the loop altogether. Also, the integer that I’m taking the cos of is different in the two examples, but that doesn’t affect the results much. See the STRIDE on the man page for dispatch_apply for how to do it properly and for why iterations is broadly comparable to num_threads in this case.

    EDIT: I note that Jacob’s answer includes

    I use the omp_get_thread_num()
    function within my parallelized loop
    to print out which core it’s working
    on… This way you can be sure that
    it’s running on both cores.

    which is not correct (it has been partly fixed by an edit). Using omp_get_thread_num() is indeed a good way to ensure that your code is multithreaded, but it doesn’t show “which core it’s working on”, just which thread. For example, the following code:

    #include <omp.h>
    #include <stdio.h>
    
    int main() {
        int i;
        /* Ask for 50 threads, far more than the machine has cores. */
        #pragma omp parallel for private(i) num_threads(50)
        for (i = 0; i < 50; i++) printf("%d\n", omp_get_thread_num());
        return 0;
    }
    

    prints out that it’s using threads 0 to 49, but this doesn’t show which core it’s working on, since I only have eight cores. By looking at the Activity Monitor (the OP mentioned GCD, so must be on a Mac – go Window/CPU Usage), you can see jobs switching between cores, so core != thread.

© 2021 The Archive Base. All Rights Reserved