I am developing a Windows 64-bit application that will manage concurrent execution of different

Question

Asked: June 7, 2026

I am developing a Windows 64-bit application that will manage concurrent execution of different CUDA-algorithms on several GPUs.

My design requires a way of passing pointers to device memory around C++ code (e.g. storing them as members of my C++ objects).
I know that it is impossible to declare class members with the __device__ qualifier.

However, I couldn’t find a definitive answer on whether assigning a __device__ pointer to a normal C pointer and then using the latter works. In other words: is the following code valid?

__device__ float *ptr;
cudaMalloc(&ptr, size);
float *ptr2 = ptr;
some_kernel<<<1,1>>>(ptr2);

For me it compiled and behaved correctly but I would like to know whether it is guaranteed to be correct.

1 Answer

Editorial Team · Answer 1 · 2026-06-07T17:58:41+00:00

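No, that code isn’t strictly valid. It may appear to work on the host side (more or less by accident), because cudaMalloc only writes the allocation's address into the host-side shadow of ptr; the device-side symbol is never updated. If you tried to dereference ptr directly in device code, you would find it holds an invalid value.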

The correct way to do what your code implies would be like this:

__device__ float *ptr;

__global__ void some_kernel()
{
    float val = ptr[threadIdx.x];
    // ....
}

float *ptr2;
cudaMalloc(&ptr2, size);
cudaMemcpyToSymbol("ptr", &ptr2, sizeof(float *)); // copy the pointer value itself, hence &ptr2

some_kernel<<<1,1>>>();

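For CUDA 4.x or newer, pass the symbol itself rather than its name as a string (the string form was removed in CUDA 5.0), so the cudaMemcpyToSymbol call becomes: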

cudaMemcpyToSymbol(ptr, &ptr2, sizeof(float *));
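To make the symbol-based approach concrete, here is a minimal sketch of a complete round trip. The kernel name scale_kernel and the helper run_symbol_demo are hypothetical names introduced only for illustration, and the sketch assumes a single CUDA-capable device and a toolkit that accepts the symbol (non-string) form of cudaMemcpyToSymbol.

```cuda
#include <cuda_runtime.h>

// Device-side global pointer; its value lives in device memory.
__device__ float *ptr;

// Reads through the device symbol and doubles each element in place.
__global__ void scale_kernel()
{
    ptr[threadIdx.x] *= 2.0f;
}

// Hypothetical helper: returns true if the round trip gave the expected values.
bool run_symbol_demo()
{
    const int n = 4;
    float host[n] = {1.0f, 2.0f, 3.0f, 4.0f};

    // Allocate through an ordinary host pointer variable.
    float *ptr2 = nullptr;
    if (cudaMalloc(&ptr2, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMemcpy(ptr2, host, n * sizeof(float),
                   cudaMemcpyHostToDevice) != cudaSuccess) return false;

    // Copy the pointer value itself (hence &ptr2) into the device symbol.
    if (cudaMemcpyToSymbol(ptr, &ptr2, sizeof(float *)) != cudaSuccess) return false;

    scale_kernel<<<1, n>>>();

    // This blocking copy on the default stream waits for the kernel to finish.
    if (cudaMemcpy(host, ptr2, n * sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess) return false;
    cudaFree(ptr2);

    for (int i = 0; i < n; ++i)
        if (host[i] != 2.0f * (i + 1)) return false;
    return true;
}
```

Calling run_symbol_demo() from main should return true on a machine with a working CUDA device. Note that the source argument is the address of the host pointer variable, because the thing being copied is the pointer value, not the data it points to.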

If the static device symbol ptr is really superfluous, you can just do something like this:

float *ptr2;
cudaMalloc(&ptr2, size);
some_kernel<<<1,1>>>(ptr2);
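Since the question asks about keeping device pointers as members of C++ objects, the same idea can be sketched with a plain float * member that is passed to kernels as an argument. The class DeviceBuffer, the kernel fill_kernel, and the helper run_member_demo are hypothetical names for illustration only. The sketch assumes a single device; in a multi-GPU application the buffer would also need to record which device it was allocated on.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Writes the same value into every element, one thread per element.
__global__ void fill_kernel(float *data, float value)
{
    data[threadIdx.x] = value;
}

// Hypothetical RAII owner of a device allocation. The member is an ordinary
// float*, not a __device__ variable; it simply holds a device address.
class DeviceBuffer {
public:
    explicit DeviceBuffer(std::size_t count) : count_(count)
    {
        if (cudaMalloc(&data_, count * sizeof(float)) != cudaSuccess) {
            data_ = nullptr;
            count_ = 0;
        }
    }
    ~DeviceBuffer() { cudaFree(data_); }  // cudaFree(nullptr) is a no-op

    // Non-copyable so the allocation is never freed twice.
    DeviceBuffer(const DeviceBuffer &) = delete;
    DeviceBuffer &operator=(const DeviceBuffer &) = delete;

    float *data() const { return data_; }
    std::size_t size() const { return count_; }

private:
    float *data_ = nullptr;
    std::size_t count_ = 0;
};

// Hypothetical helper: fills a buffer on the device and checks the result.
bool run_member_demo()
{
    const int n = 8;
    DeviceBuffer buf(n);
    if (buf.data() == nullptr) return false;

    // The device pointer travels to the kernel as a normal argument.
    fill_kernel<<<1, n>>>(buf.data(), 3.0f);

    float host[n] = {};
    if (cudaMemcpy(host, buf.data(), n * sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (host[i] != 3.0f) return false;
    return true;
}
```

The host code never dereferences data_; it only stores the address and hands it to CUDA API calls and kernel launches.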

But I suspect that what you are really looking for is something like the Thrust library's device_ptr class, a thin abstraction that wraps the naked device pointer and makes it clear in code what lives in device memory and what lives in host memory.
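As a rough illustration of the device_ptr approach, the sketch below allocates through Thrust, passes the raw pointer to a kernel, and reads one element back. The kernel add_one and the helper run_thrust_demo are hypothetical names; the Thrust calls used (device_malloc, fill, raw_pointer_cast, device_free) are part of the library that ships with the CUDA toolkit.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/device_malloc.h>
#include <thrust/device_free.h>
#include <thrust/fill.h>

// Adds one to each element, one thread per element.
__global__ void add_one(float *data)
{
    data[threadIdx.x] += 1.0f;
}

// Hypothetical helper: returns true if the element read back equals 2.0f.
bool run_thrust_demo()
{
    const int n = 4;

    // The type itself documents that this address refers to device memory.
    thrust::device_ptr<float> dptr = thrust::device_malloc<float>(n);
    thrust::fill(dptr, dptr + n, 1.0f);

    // Kernels still take a raw pointer, obtained with raw_pointer_cast.
    add_one<<<1, n>>>(thrust::raw_pointer_cast(dptr));
    cudaDeviceSynchronize();

    // Indexing a device_ptr on the host copies the element back.
    float first = dptr[0];

    thrust::device_free(dptr);
    return first == 2.0f;
}
```

A device_ptr can be stored as an ordinary class member, so it fits the design described in the question while keeping host and device addresses distinct at the type level.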
