The Archive Base


Editorial Team
Asked: May 11, 2026 (2026-05-11T09:53:37+00:00)

On my laptop with Intel Pentium dual-core processor T2370 (Acer Extensa) I ran a


On my laptop with an Intel Pentium dual-core processor T2370 (Acer Extensa) I ran a simple multithreading speedup test. I am using Linux. The code is pasted below. While I was expecting a speedup of 2-3 times, I was surprised to see a slowdown by a factor of 2. I tried the same with gcc optimization levels -O0 … -O3, but every time I got the same result. I am using pthreads. I also tried the same with only two threads (instead of the 3 threads in the code), but the performance was similar.

What could be the reason? The faster version took reasonably long (about 20 secs), so it does not seem to be an issue of startup overhead.

NOTE: This code is quite buggy (indeed, it does not make much sense, as the outputs of the serial and parallel versions would differ). The intention was just to get a speedup comparison for the same number of instructions.

    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <pthread.h>

    class Thread{
        private:
            pthread_t thread;
            // pthread entry point; it must return a value
            static void *thread_func(void *d){((Thread *)d)->run(); return NULL;}
        public:
            Thread(){}
            virtual ~Thread(){}

            virtual void run(){}
            int start(){return pthread_create(&thread, NULL, Thread::thread_func, (void*)this);}
            int wait(){return pthread_join(thread, NULL);}
    };

    #include <iostream>

    const int ARR_SIZE = 100000000;
    const int N = 20;
    int arr[ARR_SIZE];

    int main(void)
    {
        // Each thread sweeps one third of the array N times.
        class Thread_a:public Thread{
            public:
                Thread_a(int* a): arr_(a) {}
                void run()
                {
                    for(int n = 0; n<N; n++)
                    // start at 1 so that arr_[i-1] stays inside the array
                    for(int i=1; i<ARR_SIZE/3; i++){ arr_[i] += arr_[i-1];}
                }
            private:
                int* arr_;
        };
        class Thread_b:public Thread{
            public:
                Thread_b(int* a): arr_(a) {}
                void run()
                {
                    for(int n = 0; n<N; n++)
                    for(int i=ARR_SIZE/3; i<2*ARR_SIZE/3; i++){ arr_[i] += arr_[i-1];}
                }
            private:
                int* arr_;
        };

        class Thread_c:public Thread{
            public:
                Thread_c(int* a): arr_(a) {}
                void run()
                {
                    for(int n = 0; n<N; n++)
                    for(int i=2*ARR_SIZE/3; i<ARR_SIZE; i++){ arr_[i] += arr_[i-1];}
                }
            private:
                int* arr_;
        };

        {
            // Parallel version: three threads, timed with clock()
            Thread *a=new Thread_a(arr);
            Thread *b=new Thread_b(arr);
            Thread *c=new Thread_c(arr);

            clock_t start = clock();

            if (a->start() != 0) {
                return 1;
            }

            if (b->start() != 0) {
                return 1;
            }
            if (c->start() != 0) {
                return 1;
            }

            if (a->wait() != 0) {
                return 1;
            }

            if (b->wait() != 0) {
                return 1;
            }

            if (c->wait() != 0) {
                return 1;
            }

            clock_t end = clock();
            double duration = (double)(end - start) / CLOCKS_PER_SEC;

            std::cout << duration << "seconds\n";
            delete a;
            delete b;
            delete c;
        }
        {
            // Serial version: the same sweep over the whole array in one thread
            clock_t start = clock();
            for(int n = 0; n<N; n++)
            for(int i=1; i<ARR_SIZE; i++){ arr[i] += arr[i-1];}
            clock_t end = clock();
            double duration = (double)(end - start) / CLOCKS_PER_SEC;

            std::cout << "serial: " << duration << "seconds\n";
        }

        return 0;
    }

See also: What can make a program run slower when using more threads?

1 Answer

  1. Added an answer on May 11, 2026 at 9:53 am

    The times you are reporting are measured using the clock function:

    The clock() function returns an approximation of processor time used by the program.

        $ time bin/amit_kumar_threads.cpp
        6.62seconds
        serial: 2.7seconds

        real    0m5.247s
        user    0m9.025s
        sys     0m0.304s

    The real time will be less for multiprocessor tasks, but the processor time will typically be greater.

    When you use multiple threads, the work may be done by more than one processor, but the amount of work is the same, and in addition there may be some overhead such as contention for limited resources. clock() measures the total processor time, which will be the work + any contention overhead. So it should never be less than the processor time for doing the work in a single thread.
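To see this effect in isolation, here is a minimal sketch that splits a fixed amount of busy work across a given number of threads and reports what clock() measures for the whole run. The names `spin` and `cpu_seconds_for` are hypothetical helpers invented for this illustration, and the 8-thread cap is an arbitrary assumption; build with something like `g++ -pthread`.

```cpp
#include <cassert>
#include <pthread.h>
#include <time.h>

// Hypothetical busy-loop worker: performs *arg additions on a volatile
// counter so the compiler cannot optimize the loop away.
static void *spin(void *arg) {
    const unsigned long iters = *static_cast<unsigned long *>(arg);
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < iters; ++i) {
        sink = sink + i;
    }
    return NULL;
}

// Split `total_iters` additions evenly across `nthreads` threads and return
// the processor time, in seconds, that clock() reports for the whole run.
double cpu_seconds_for(int nthreads, unsigned long total_iters) {
    assert(nthreads >= 1 && nthreads <= 8);  // arbitrary cap for this sketch
    pthread_t threads[8];
    unsigned long per_thread = total_iters / nthreads;

    clock_t start = clock();
    for (int t = 0; t < nthreads; ++t) {
        int rc = pthread_create(&threads[t], NULL, spin, &per_thread);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; ++t) {
        int rc = pthread_join(threads[t], NULL);
        assert(rc == 0);
        (void)rc;
    }
    clock_t end = clock();

    return double(end - start) / CLOCKS_PER_SEC;
}
```

Calling `cpu_seconds_for(1, n)` and `cpu_seconds_for(2, n)` with the same `n` should give broadly similar values on a multi-core Linux machine, because the total work is unchanged; the two-thread figure would not be expected to halve even if the run finishes sooner on the wall clock.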

    It’s a little hard to tell from the question whether you knew this, and were surprised that the value returned by clock() was twice that for a single thread rather than being only a little more, or you were expecting it to be less.

    Using clock_gettime() instead (you’ll need the realtime library librt, g++ -lrt etc.) gives:

        $ time bin/amit_kumar_threads.cpp
        2.524 seconds
        serial: 2.761 seconds

        real    0m5.326s
        user    0m9.057s
        sys     0m0.344s
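For reference, a wall-clock timer built on clock_gettime() might look roughly like the sketch below. The answer does not say which clock ID was used, so `CLOCK_MONOTONIC` is an assumption here, and `wall_seconds` and `time_wall` are hypothetical helper names.

```cpp
#include <cassert>
#include <time.h>

// Current reading of a monotonic wall clock, in seconds.
// CLOCK_MONOTONIC is an assumption; it is not affected by changes
// to the system date.
double wall_seconds() {
    timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return double(ts.tv_sec) + double(ts.tv_nsec) * 1e-9;
}

// Run `work` once and return the elapsed wall-clock time in seconds.
template <typename F>
double time_wall(F work) {
    double start = wall_seconds();
    work();
    return wall_seconds() - start;
}
```

In the question's program, the two `clock()` calls around the start()/wait() sequence could be replaced with two `wall_seconds()` calls to get elapsed time rather than summed processor time. On older glibc versions this needs `-lrt` at link time.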

    which still is less of a speed-up than one might hope for, but at least the numbers make some sense.

    100000000*20/2.5s = 800 MHz, i.e. about 800 million loop iterations per second. The bus frequency is 1600 MHz, so I suspect that with a read and a write for each iteration (assuming some caching), you’re memory bandwidth limited as tstenner suggests, and the clock() value shows that most of the time some of your processors are waiting for data. (Does anyone know whether clock() time includes such stalls?)
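The arithmetic behind that estimate can be written out as a small sketch. The function names are hypothetical, and the 8 bytes per iteration used in the note after the code is an assumption (one 4-byte int read plus one 4-byte int written, each reaching main memory), not a measured figure.

```cpp
#include <cassert>

// Loop iterations per second when `elements` array entries are swept
// `passes` times in `seconds` of wall-clock time.
double iterations_per_second(double elements, double passes, double seconds) {
    assert(seconds > 0.0);
    return elements * passes / seconds;
}

// Rough memory traffic in bytes per second, ASSUMING every iteration moves
// `bytes_per_iteration` bytes to or from main memory.
double bytes_per_second(double iters_per_sec, double bytes_per_iteration) {
    return iters_per_sec * bytes_per_iteration;
}
```

With the numbers above, `iterations_per_second(1e8, 20, 2.5)` gives 8e8, and under the 8-bytes-per-iteration assumption `bytes_per_second(8e8, 8)` gives 6.4e9, i.e. roughly 6.4 GB/s of traffic if nothing were served from cache.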
