The Archive Base

Editorial Team
Asked: May 25, 2026

My CUDA kernel doesn't seem to be changing the values of the arrays I pass in

My CUDA kernel doesn't seem to be changing the values of the arrays I pass in. Here is the relevant host code:

dim3 grid(numNets, N); 
dim3 threads(1, 1, 1); 

// allocate the arrays and jagged arrays on the device
alloc_dev_memory( state0,  state1,    d_state0, d_state1, 
                  adjlist, d_adjlist, transfer, d_transfer,
                  indeg,   d_indeg, d_N,       d_K,      d_S,          
                  d_Spow,  d_numNets );

// operate on the device memory
kernel<<< grid, threads >>>( d_state0, d_state1, d_adjlist, d_transfer, d_indeg,
                             d_N,      d_K,      d_S,       d_Spow,     d_numNets );

// copy the new states from the device to the host
cutilSafeCall( cudaMemcpy( state0, d_state0, ens_size*sizeof(int), 
                           cudaMemcpyDeviceToHost ) );

// copy the new states from the array to the ensemble
for(int i=0; i < numNets; ++i)
    nets[i]->set_state( state0 + N*i );

Here is the kernel code that is called:

// this dummy kernel just sets all the values to 0 for checking later.
__global__ void kernel( int *    state0,
                        int *    state1,
                        int **   adjlist,
                        luint ** transfer,
                        int *    indeg,
                        int *    d_N,
                        float *  d_K,
                        int *    d_S,
                        luint *  d_Spow,
                        int *    d_numNets )
{
    int       N = *d_N;
    luint * Spow = d_Spow;
    int tid = blockIdx.x*N + blockIdx.y;

    state0[tid] = 0;
    state1[tid] = 0;

    for(int k=0; k < indeg[tid]; ++k) {
        adjlist[tid][k] = 0;
    }
    for(int k=0; k < Spow[indeg[tid]]; ++k) {
        transfer[tid][k] = 0;
    }
}

After using cudaMemcpy to copy the state0 array back to the host, I loop through state0 and print every value to stdout. The values are identical to the initial ones, even though the kernel is written to set them all to zero.

The expected output is the initial value of state0 (101111101011), followed by the final value of state0 (all zeros).

A sample run of this code outputs:

101111101011
101111101011

Press ENTER to exit...

The second line should be all zeros. Why isn’t this CUDA kernel affecting the state0 array?

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 7:56 pm

    I found that N and numNets held garbage values. As a result, the offset by N was wrong, and the values were being set outside of the array. @pQB, your suggestion was just what I needed.
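
    To illustrate how a wrong N moves writes out of range, here is a minimal host-side C++ sketch. It assumes the kernel's index formula tid = blockIdx.x*N + blockIdx.y and a hypothetical ensemble of 3 networks with 4 nodes each (12 states, chosen to match the 12-digit sample output). The helper names flat_index, in_bounds, and count_in_bounds_writes are illustrative and do not appear in the original code, and the sketch does not try to reproduce the exact failure from the question.

```cpp
// Mirrors the kernel's index formula: tid = blockIdx.x * N + blockIdx.y.
// block_x stands in for blockIdx.x (network index) and block_y for
// blockIdx.y (node index within that network).
long long flat_index(long long block_x, long long block_y, long long n) {
    return block_x * n + block_y;
}

// Hypothetical helper: true when tid addresses an element of an array
// holding ens_size entries.
bool in_bounds(long long tid, long long ens_size) {
    return tid >= 0 && tid < ens_size;
}

// Walks a num_nets x n grid (one block per state, as in the question) and
// counts how many blocks would write inside the array when the kernel
// uses kernel_n as its N. kernel_n == n models the correct case; any
// other value models a garbage N.
int count_in_bounds_writes(int num_nets, int n, long long kernel_n) {
    const long long ens_size = static_cast<long long>(num_nets) * n;
    int count = 0;
    for (int bx = 0; bx < num_nets; ++bx) {
        for (int by = 0; by < n; ++by) {
            if (in_bounds(flat_index(bx, by, kernel_n), ens_size)) {
                ++count;
            }
        }
    }
    return count;
}
```

    With these assumed sizes, count_in_bounds_writes(3, 4, 4) counts all 12 writes as in range, while a garbage value such as 1000000 leaves only the 4 writes from the first block row in range. Every other block would write past the end of the array.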
