The Archive Base

Editorial Team
Asked: June 15, 2026

I am very new to CUDA and started reading about parallel programming and CUDA

I am very new to CUDA and started reading about parallel programming and CUDA just a few weeks ago. After installing the CUDA Toolkit, I browsed the SDK samples (which come with the toolkit installation) and wanted to try some of them. I started with matrixMul from the 0_Simple folder. This program runs fine (I am using Visual Studio 2010).
Now I want to change the size of the matrices and try bigger ones (for example 960x960 or 1024x1024). In this case, something crashes: I get a black screen, and then the message "Display driver stopped responding and has recovered".

I am changing these two lines in the code (in the main function):

    dim3 dimsA(8*4*block_size, 8*4*block_size, 1);
    dim3 dimsB(8*4*block_size, 8*4*block_size, 1);

Before, they were:

    dim3 dimsA(5*2*block_size, 5*2*block_size, 1);
    dim3 dimsB(5*2*block_size, 5*2*block_size, 1);

Can someone point out what I am doing wrong? Should I alter something else in this example for it to work properly? Thanks!

Edit: as some of you suggested, I changed the timeout value (0 somehow did not work for me, so I set the timeout to 60). My driver no longer crashes, but I get a huge list of errors like:
… … …

Error! Matrix[409598]=6.40005159, ref=6.39999986 error term is > 1e-5
Error! Matrix[409599]=6.40005159, ref=6.39999986 error term is > 1e-5

Does this have something to do with the memory allocation? Should I make changes there, and what would they be?

1 Answer

Editorial Team
Added an answer on June 15, 2026 at 11:36 pm

Your new problem is just the strict tolerance in the NVIDIA example. Your kernel is running correctly. It’s only complaining that the accumulated error is greater than the limit set for this example, because you’re now doing many more math operations, all of which accumulate error. If you look at the numbers it’s giving you, you’re only off from the reference answer by about 0.00005, which is not unusual after a lot of single-precision floating-point math. You’re getting these errors now and not with the default matrix sizes because the original matrices were smaller and required far fewer operations to multiply. Multiplying N x N matrices takes on the order of N^3 operations, so the operation count grows much faster than the matrix size. Each output element is itself a sum of N products, so the rounding error accumulated in each element grows as N grows.
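To see the effect on the CPU, here is a minimal sketch of the same kind of accumulation. It assumes every product is 1.0 * 0.01, which fits the reference value of about 6.4 in the error log if the inner dimension is 640. The function name dotAccumulationError is made up for this illustration and is not part of the SDK sample.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical helper for illustration only (not from the SDK sample).
// Sums n products a*b in single precision, the way one output element of
// the matrix product is accumulated, and returns the absolute difference
// from the same sum carried out in double precision.
float dotAccumulationError(int n, float a, float b) {
    float sumF = 0.0f;   // single-precision accumulator
    double sumD = 0.0;   // double-precision accumulator used as reference
    for (int k = 0; k < n; ++k) {
        sumF += a * b;
        sumD += static_cast<double>(a) * static_cast<double>(b);
    }
    return static_cast<float>(std::fabs(static_cast<double>(sumF) - sumD));
}

// Prints the accumulated error for a few inner-dimension sizes.
void printAccumulationErrors() {
    const int sizes[] = {320, 640, 1024};
    for (int n : sizes) {
        std::printf("n=%d  abs error=%g\n", n,
                    dotAccumulationError(n, 1.0f, 0.01f));
    }
}
```

The GPU kernel adds its terms in a different order (tile by tile), so the exact numbers will not match what the sample reports, but the drift away from the exact sum is the same kind of effect.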

If you look near the end of the runTest() function, there’s a call to computeGold(), which computes the reference answer on your CPU. There should then be a call to something like shrCompareL2fe that compares the results. The last parameter to this call is a tolerance. If you loosen this tolerance (say, to 1e-3 or 1e-4 instead of 1e-5), you should eliminate these error messages. Note that there may be a couple of these calls. The version of the SDK examples that I have includes an optional CUBLAS implementation, so it also compares that result against the gold. The call right after the print statement that says “Comparing CUDA matrixMul & Host results” is the one you’d want to change.
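As a rough illustration of what loosening the tolerance does, here is a sketch of a per-element check like the one behind the "error term is > 1e-5" messages. The helper countMismatches is hypothetical: it is not the SDK's comparison function, and your version of the sample may compare differently (for example with a relative or L2 error).

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of the SDK: counts the elements of result
// whose absolute difference from reference exceeds tolerance.
std::size_t countMismatches(const float* result, const float* reference,
                            std::size_t len, float tolerance) {
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (std::fabs(result[i] - reference[i]) > tolerance) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

With the values from the error log (6.40005159 against a reference of 6.39999986), the difference is about 5e-5, so a tolerance of 1e-5 flags the element while 1e-4 accepts it.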
