The Archive Base

Editorial Team
Asked: June 10, 2026

I am using a Tesla C1060 with 1.3 compute capability and nvcc compiler driver

I am using a Tesla C1060 (compute capability 1.3) with version 4.0 of the nvcc compiler driver. I am trying to do some computation that is local to each thread block. Each thread block has a shared array that is first initialized to zero. To synchronize concurrent updates (additions) to the shared data by the threads of a block, I use the CUDA atomicAdd primitive.

Once a thread block has its results ready in the shared data array, each entry of that array is merged (using atomicAdd) into the corresponding entry of the global data array.

Below is code very similar to what I am trying to do.

#define DATA_SZ 16
typedef unsigned long long int ULLInt;

__global__ void kernel( ULLInt* data, ULLInt ThreadCount )
{
  ULLInt thid = threadIdx.x + blockIdx.x * blockDim.x;
  __shared__ ULLInt sharedData[DATA_SZ];

  // Initialize the shared data
  if( threadIdx.x == 0 )
  {
    for( int i = 0; i < DATA_SZ; i++ ) { sharedData[i] = 0; }
  }
  __syncthreads();

  //..some code here

  if( thid < ThreadCount )
  {
    //..some code here

    atomicAdd( &sharedData[getIndex(thid)], thid );

    //..some code here        

    for(..a loop...)
    { 
      //..some code here

      if(thid % 2 == 0)
      {           
        // getIndex() returns a value in [0, DATA_SZ )
        atomicAdd( &sharedData[getIndex(thid)], thid * thid );
      }
    }
  }
  __syncthreads();
  
  if( threadIdx.x == 0 )
  {
    // ...
    for( int i = 0; i < DATA_SZ; i++ ) { atomicAdd( &data[i], sharedData[i] ); }
    //...
  }
}

If I compile with -arch=sm_20, I don’t get any errors. However, when I compile the kernel with the -arch=sm_13 option, I get the following errors:

ptxas /tmp/tmpxft_00004dcf_00000000-2_mycode.ptx, line error   : Global state space expected for instruction 'atom'
ptxas /tmp/tmpxft_00004dcf_00000000-2_mycode.ptx, line error   : Global state space expected for instruction 'atom'
ptxas fatal   : Ptx assembly aborted due to errors

If I comment out the following two lines I don’t get any errors with -arch=sm_13:

atomicAdd( &sharedData[getIndex(thid)], thid );
atomicAdd( &sharedData[getIndex(thid)], thid * thid );

Can someone suggest what I might be doing wrong?


1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 11:48 am

    I found the explanation in the CUDA C Programming Guide: "Atomic functions operating on shared memory and atomic functions operating on 64-bit words are only available for devices of compute capability 1.2 and above. Atomic functions operating on 64-bit words in shared memory are only available for devices of compute capability 2.x and higher."

    So on this device I cannot use ULLInt for the shared memory array, and I need to use unsigned int there instead.
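    One way to apply this is sketched below: the per-block accumulators in shared memory become 32-bit unsigned int values, while the global array stays 64-bit, since the guide's wording allows 64-bit atomics on global memory from compute capability 1.2. This is only a minimal sketch under assumptions, not the asker's actual code: getIndex() here is a hypothetical stand-in (thid % DATA_SZ), runDemo() is a hypothetical host helper, and the kernel assumes every block's partial sums fit in 32 bits. It has not been tried on a C1060.

```cuda
#include <cuda_runtime.h>

#define DATA_SZ 16
typedef unsigned long long int ULLInt;

// Hypothetical stand-in for the question's getIndex(); returns a value in [0, DATA_SZ).
__device__ unsigned int getIndex( ULLInt thid )
{
  return (unsigned int)( thid % DATA_SZ );
}

__global__ void kernel( ULLInt* data, ULLInt threadCount )
{
  ULLInt thid = threadIdx.x + (ULLInt)blockIdx.x * blockDim.x;

  // 32-bit counters: atomics on 32-bit words in shared memory need only 1.2+.
  __shared__ unsigned int sharedData[DATA_SZ];

  // Initialize the shared data
  if( threadIdx.x == 0 )
  {
    for( int i = 0; i < DATA_SZ; i++ ) { sharedData[i] = 0; }
  }
  __syncthreads();

  if( thid < threadCount )
  {
    // Assumption: the sum accumulated per block does not overflow 32 bits.
    atomicAdd( &sharedData[getIndex( thid )], (unsigned int)thid );
  }
  __syncthreads();

  if( threadIdx.x == 0 )
  {
    // Widen to 64 bits only for the merge into global memory.
    for( int i = 0; i < DATA_SZ; i++ )
    {
      atomicAdd( &data[i], (ULLInt)sharedData[i] );
    }
  }
}

// Hypothetical host helper: runs the kernel and compares with a CPU reference.
bool runDemo()
{
  const ULLInt threadCount = 1000;
  const int blockSize = 128;
  const int gridSize = (int)( ( threadCount + blockSize - 1 ) / blockSize );

  // CPU reference: each thread id t is added to bucket t % DATA_SZ.
  ULLInt expected[DATA_SZ] = { 0 };
  for( ULLInt t = 0; t < threadCount; t++ ) { expected[t % DATA_SZ] += t; }

  ULLInt* dData = 0;
  if( cudaMalloc( (void**)&dData, sizeof( expected ) ) != cudaSuccess ) { return false; }
  if( cudaMemset( dData, 0, sizeof( expected ) ) != cudaSuccess )
  {
    cudaFree( dData );
    return false;
  }

  kernel<<<gridSize, blockSize>>>( dData, threadCount );

  // cudaMemcpy waits for the kernel and reports any launch or execution error.
  ULLInt result[DATA_SZ];
  cudaError_t err = cudaMemcpy( result, dData, sizeof( result ), cudaMemcpyDeviceToHost );
  cudaFree( dData );
  if( err != cudaSuccess ) { return false; }

  for( int i = 0; i < DATA_SZ; i++ )
  {
    if( result[i] != expected[i] ) { return false; }
  }
  return true;
}
```

    Calling runDemo() from a main() function should return true when a CUDA device is available. If a block's partial sums could exceed 32 bits (for example when adding thid * thid for large thread ids), this layout would silently wrap, so the values would need to be bounded or the accumulation restructured.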
