The Archive Base

Editorial Team
Asked: May 16, 2026

I want to speed up an array multiplication in C99.

This is the original for loops:

for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
        total[j] += w[j][i] * x[i];
    }
}

My boss asked me to try this, but it did not improve the speed:

for (int i = 0; i < n; i++) {
    float value = x[i];
    for (int j = 0; j < m; j++) {
        total[j] += w[j][i] * value;
    }
}

Do you have other ideas (apart from OpenMP, which I already use) on how I could speed up these for loops?
I am compiling with:

gcc -DMNIST=1 -O3 -fno-strict-aliasing -std=c99 -lm -D_GNU_SOURCE -Wall -pedantic -fopenmp

Thanks!

1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 3:34 am

    One theory is that comparing against zero is cheaper than comparing against m (the j<m test). By starting at j=m and looping while j>0, you could in theory save a few nanoseconds per iteration. In my recent experience, however, this has made no measurable difference, so I don't think it holds for current CPUs.
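    For reference, a count-down version of the inner loop could look like the sketch below. The function name matvec_countdown and the C99 variable-length array parameters are assumptions made for illustration; the question does not show how w, x and total are declared.

```c
/* Hypothetical helper (not from the question): computes
 * total[j] += w[j][i] * x[i] with the inner loop counting down to zero.
 * Assumes w is an m-by-n array of float, as the w[j][i] indexing suggests. */
void matvec_countdown(int m, int n, float w[m][n],
                      const float x[n], float total[m])
{
    for (int i = 0; i < n; i++) {
        float value = x[i];
        /* j runs m-1, m-2, ..., 0; the loop test compares against zero */
        for (int j = m; j-- > 0; ) {
            total[j] += w[j][i] * value;
        }
    }
}
```

    Each total[j] still receives the same additions in the same order, so the result should match the original loops. Whether it is any faster needs to be measured on your machine.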

    Another issue is memory layout: if your inner loop accesses memory that is contiguous rather than spread out, you are more likely to benefit from the lowest-level (fastest) cache in your CPU.

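    In your current example, the inner loop over j reads w[j][i], which jumps a whole row ahead on every iteration. Switching the layout of w from w[j][i] to w[i][j] may therefore help, because consecutive values of j then sit next to each other in memory. Aligning your values on 4- or 8-byte boundaries will help as well (but you will find that this is already the case for your arrays).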
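    A minimal sketch of the layout change, assuming w is an m-by-n array of float. The names transpose, matvec_transposed and wt are hypothetical. The idea is to build the transposed copy once and reuse it for every multiplication.

```c
/* Hypothetical helper: copies w (m rows, n columns) into wt
 * (n rows, m columns) so that wt[i][j] == w[j][i]. */
void transpose(int m, int n, float w[m][n], float wt[n][m])
{
    for (int j = 0; j < m; j++) {
        for (int i = 0; i < n; i++) {
            wt[i][j] = w[j][i];
        }
    }
}

/* Same computation as the original loops, but reading the transposed
 * layout: the inner loop now walks wt[i][0], wt[i][1], ... which are
 * adjacent in memory. */
void matvec_transposed(int m, int n, float wt[n][m],
                       const float x[n], float total[m])
{
    for (int i = 0; i < n; i++) {
        float value = x[i];
        for (int j = 0; j < m; j++) {
            total[j] += wt[i][j] * value;
        }
    }
}
```

    If changing the layout is not an option, swapping the two loops so that i becomes the inner index also makes the accesses to w[j][i] contiguous. Whether either variant is actually faster for your sizes needs to be measured.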

    Another option is loop unrolling, meaning that you process your inner loop in chunks of, say, 4, so the check for whether the loop is finished is done only a quarter as often. The optimum chunk size must be determined empirically, and may also depend on the problem at hand (e.g. if you know the iteration count is a multiple of 5, use 5).
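    As an illustration, a manual unroll by 4 might look like the sketch below. The function name matvec_unrolled4 and the array parameter types are assumptions, and the second inner loop handles the leftover elements when m is not a multiple of 4.

```c
/* Hypothetical helper: same computation as the original loops, with the
 * inner loop manually unrolled in chunks of 4. */
void matvec_unrolled4(int m, int n, float w[m][n],
                      const float x[n], float total[m])
{
    for (int i = 0; i < n; i++) {
        float value = x[i];
        int j = 0;
        /* main part: one loop test per 4 updates */
        for (; j + 3 < m; j += 4) {
            total[j]     += w[j][i]     * value;
            total[j + 1] += w[j + 1][i] * value;
            total[j + 2] += w[j + 2][i] * value;
            total[j + 3] += w[j + 3][i] * value;
        }
        /* remainder: at most 3 elements left over */
        for (; j < m; j++) {
            total[j] += w[j][i] * value;
        }
    }
}
```

    Note that at -O3 GCC may already unroll or vectorize this loop on its own, and GCC also offers the -funroll-loops option, so it is worth timing the plain loop against the manual version before keeping the extra code.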
