The Archive Base

Editorial Team
Asked: May 31, 2026 (2026-05-31T16:58:50+00:00)

This is the code for a matrix multiplication program:

 program ex
    implicit none
    real :: a(256,256),b(256,256),c(256,256),t1,t2
    integer i,j,k,sum
    sum=0

    do j = 1,256
      do i = 1,256
        a(i,j) = 1
        b(i,j) = 1
        c(i,j) = 0.0
      enddo
    enddo

    call cpu_time(t1)
    !$acc region do

    do i=1,256
      do j=1,256
        sum=0
        do k=1,256
          sum=sum+a(i,k)*b(k,j)
          c(i,j)=sum
        end do
      end do
    end do
    !$acc end region
    call cpu_time(t2)
    print*,"cpu time=",t2-t1
    print*,c
  end program ex

When I run this with the PGI compiler and the accelerator directives, the execution time is 75 ms. But when I run the same matrix multiplication with a CUDA Fortran implementation, the execution time is only 5 ms. That is a big difference even though I used the accelerator directives, so I doubt that my accelerator directives are working properly.

1 Answer

  1. Editorial Team
    Added an answer on May 31, 2026 at 4:58 pm (2026-05-31T16:58:51+00:00)

    I tried to accelerate your program using OpenHMPP, whose accelerator directives are very similar. Note that I moved one of your lines, the assignment c(i,j)=sum, out of the innermost loop, where it was probably placed by mistake. Note also that I had to tell the compiler about the reduction taking place, and that I renamed the reduction variable, because it shadowed the sum intrinsic function.

    The performance is not good because of the overhead of launching the GPU kernel and of the memory transfers. You need orders of magnitude more work before using the GPU becomes profitable.

    For example, with 2000 x 2000 matrices the CPU execution time was 41 seconds, but the GPU execution time was only 8 seconds.
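
To put rough numbers on "orders of magnitude more work", here is a minimal sketch that compares the two problem sizes mentioned above. It assumes the naive triple loop costs 2*n**3 floating-point operations and that three n x n matrices of 4-byte reals cross the bus (a and b in, c out). The names work_estimate, flops and bytes_moved are made up for this illustration, and the sketch estimates arithmetic only, not actual run time.

```fortran
program work_estimate
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: sizes(2) = [256, 2000]
  integer :: s

  ! Print the estimated work, traffic and their ratio for each size.
  do s = 1, size(sizes)
    print '(a,i5,a,es10.3,a,es10.3,a,f7.1)', "n=", sizes(s), &
          "  flops=", flops(sizes(s)), &
          "  bytes=", bytes_moved(sizes(s)), &
          "  flops/byte=", flops(sizes(s)) / bytes_moved(sizes(s))
  end do
  print '(a,f6.1)', "work ratio 2000 vs 256: ", flops(2000) / flops(256)

contains

  ! One multiply and one add per inner-loop iteration: 2*n**3 operations.
  pure function flops(n) result(w)
    integer, intent(in) :: n
    real(dp) :: w
    w = 2.0_dp * real(n, dp)**3
  end function flops

  ! Assumption: a and b are copied to the GPU and c is copied back,
  ! each holding n*n default reals of 4 bytes.
  pure function bytes_moved(n) result(t)
    integer, intent(in) :: n
    real(dp) :: t
    t = 3.0_dp * 4.0_dp * real(n, dp)**2
  end function bytes_moved

end program work_estimate
```

Under these assumptions the 2000 x 2000 case does roughly 477 times the arithmetic of the 256 x 256 case, while the operations per transferred byte grow only as n/6 (about 43 versus about 333). The fixed kernel-launch cost and the transfers are therefore spread over far more useful work at the larger size.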

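    program ex
        implicit none
        real :: a(256,256),b(256,256),c(256,256),t1,t2
        ! sm replaces "sum", which shadowed the sum intrinsic; it is real so
        ! that non-integer products would not be truncated
        real :: sm
        integer :: i,j,k

        sm=0
        ! initialise the operands to ones and clear the result
        do j = 1,256
           do i = 1,256
              a(i,j) = 1
              b(i,j) = 1
              c(i,j) = 0.0
           enddo
        enddo

        call cpu_time(t1)
        !$hmpp region, target = CUDA
        !$hmppcg gridify, reduce(+:sm)
        do i=1,256
           do j=1,256
              sm=0
              do k=1,256
                 sm=sm+a(i,k)*b(k,j)
              end do
              ! store once per element, after the k loop has finished
              c(i,j)=sm
           end do
        end do
        !$hmpp endregion
        call cpu_time(t2)

        print*,"cpu time=",t2-t1
        print*,sum(c)
    end program ex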
    

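    edit: it would probably be better not to use reduce(+:sm), but just private(sm), because each c(i,j) needs its own accumulator rather than one total across the whole loop nest.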
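
As an illustration of the private-accumulator idea from the edit, here is a minimal sketch written with standard OpenACC directives rather than OpenHMPP or the PGI accelerator directives used in this thread. That is a deliberate substitution, not the method of the question or the answer. The program name matmul_private, the parameter n and the data clauses are assumptions for this example. Without OpenACC enabled, the directives are ordinary comments and the program runs serially.

```fortran
program matmul_private
  implicit none
  integer, parameter :: n = 256
  real :: a(n,n), b(n,n), c(n,n), sm
  integer :: i, j, k

  a = 1.0
  b = 1.0
  c = 0.0

  ! Each (i,j) iteration gets its own private copy of sm, so iterations
  ! are independent and no reduction across the i/j loops is needed.
  ! Assumption: a and b are only read and c is only written on the device.
  !$acc parallel loop collapse(2) private(sm) copyin(a, b) copyout(c)
  do i = 1, n
    do j = 1, n
      sm = 0.0
      ! The dot product for one element is accumulated sequentially.
      !$acc loop seq
      do k = 1, n
        sm = sm + a(i,k) * b(k,j)
      end do
      c(i,j) = sm
    end do
  end do

  ! With all-ones inputs every element of c equals n, so sum(c) is n**3.
  print *, "sum(c) =", sum(c), " expected =", real(n)**3
end program matmul_private
```

Because sm is reset and consumed inside a single (i,j) iteration, declaring it private is enough to make the two outer loops safe to run in parallel. A reduction clause would only be needed if one value had to be combined across those iterations.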
