The Archive Base


Editorial Team
Asked: June 2, 2026


I wrote this code for matrix multiplication using SIMD, which I was able to compile in Visual Studio, but now I need to compile it on Ubuntu using gcc/g++.

Which commands should I use to compile this? Do I need to make any changes to the code itself?

#include <stdio.h>
#include <stdlib.h>
#include <xmmintrin.h>
#include <iostream>
#include <conio.h>
#include <math.h>
#include <ctime>

using namespace std;

#define MAX_NUM 1000
#define MAX_DIM 252

int main()
{
    int l = MAX_DIM, m = MAX_DIM, n = MAX_DIM;
    __declspec(align(16)) float a[MAX_DIM][MAX_DIM], b[MAX_DIM][MAX_DIM],c[MAX_DIM][MAX_DIM],d[MAX_DIM][MAX_DIM];

    srand((unsigned)time(0));

    for(int i = 0; i < l; ++i)
    {
        for(int j = 0; j < m; ++j)
        {
            a[i][j] = rand()%MAX_NUM;
        }
    }

    for(int i = 0; i < m; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            b[i][j] = rand()%MAX_NUM;
        }
    }

    clock_t Time1 = clock();

    for(int i = 0; i < m; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            d[i][j] = b[j][i];
        }
    }

    for(int i = 0; i < l; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            __m128 *m3 = (__m128*)a[i];
            __m128 *m4 = (__m128*)d[j];
            float* res;
            c[i][j] = 0;
            for(int k = 0; k < m; k += 4)
            {
                __m128 m5 = _mm_mul_ps(*m3,*m4);
                res = (float*)&m5;
                c[i][j] += res[0]+res[1]+res[2]+res[3];
                m3++;
                m4++;
            }
        }
        //cout<<endl;
    }

    clock_t Time2 = clock();
    double TotalTime = ((double)Time2 - (double)Time1)/CLOCKS_PER_SEC;
    cout<<"Time taken by SIMD implementation is "<<TotalTime<<"s\n";

    Time1 = clock();

    for(int i = 0; i < l; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            c[i][j] = 0;
            for(int k = 0; k < m; k += 4)
            {
                c[i][j] += a[i][k] * b[k][j];
                c[i][j] += a[i][k+1] * b[k+1][j];
                c[i][j] += a[i][k+2] * b[k+2][j];
                c[i][j] += a[i][k+3] * b[k+3][j];

            }
        }
    }

    Time2 = clock();
    TotalTime = ((double)Time2 - (double)Time1)/CLOCKS_PER_SEC;
    cout<<"Time taken by normal implementation is "<<TotalTime<<"s\n";

    getch();
    return 0;
}

1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 10:29 pm

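    You need to enable SSE when compiling. On 64-bit x86, GCC enables SSE and SSE2 by default; on 32-bit targets you have to pass an explicit flag, e.g.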

    $ g++ -msse3 -O3 -Wall -lrt foo.cpp -o foo
    

    You will also need to change:

    __declspec(align(16))
    

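    which is specific to Microsoft's compiler, to the GCC equivalent:

    __attribute__ ((aligned(16)))

    The code also includes <conio.h> and calls getch(), neither of which is available on Linux. Remove both, or replace getch() with getchar().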
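If the same source has to build with both Visual Studio and g++, the alignment syntax can be hidden behind a macro. The sketch below is one possible way to do that, not the original poster's code: `ALIGN16`, `Matrix`, `dot_sse`, `matmul_sse` and the small `DIM` are all names and values assumed for illustration. It also differs from the posted loop in one respect: it accumulates in a vector register and does a single horizontal sum at the end, instead of summing the four lanes on every iteration.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_load_ps, _mm_mul_ps, ...
#include <cassert>

// Hypothetical macro: selects the compiler-specific alignment syntax.
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))
#else
#  define ALIGN16 __attribute__((aligned(16)))
#endif

const int DIM = 8;  // assumed size for illustration; must be a multiple of 4

// Wrapping the array in a struct makes every Matrix object 16-byte aligned.
struct Matrix {
    ALIGN16 float v[DIM][DIM];
};

// Dot product of two 16-byte-aligned float rows of length n (n % 4 == 0).
float dot_sse(const float* x, const float* y, int n)
{
    assert(n % 4 == 0);
    __m128 acc = _mm_setzero_ps();
    for (int k = 0; k < n; k += 4) {
        // Multiply four pairs of floats and add them to the running sums.
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_load_ps(x + k), _mm_load_ps(y + k)));
    }
    // Store the four partial sums and add them up once.
    ALIGN16 float lanes[4];
    _mm_store_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

// c = a * b. b is transposed first so that both operands are read row-wise,
// as in the original question.
void matmul_sse(const Matrix& a, const Matrix& b, Matrix& c)
{
    Matrix bt;
    for (int i = 0; i < DIM; ++i)
        for (int j = 0; j < DIM; ++j)
            bt.v[i][j] = b.v[j][i];

    for (int i = 0; i < DIM; ++i)
        for (int j = 0; j < DIM; ++j)
            c.v[i][j] = dot_sse(a.v[i], bt.v[j], DIM);
}
```

To try it, fill two `Matrix` objects, call `matmul_sse(a, b, c)` from a `main` function, and compare `c` against a plain triple loop. Because the additions happen in a different order than in the scalar loop, results for arbitrary floating-point inputs may differ in the last bits.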
    

© 2021 The Archive Base. All Rights Reserved