I am trying to implement a structure that holds arrays of data,
and I want to implement a dynamic array, something like:
struct myStruct {
    float3 *data0, *data1;
};
__global__ void kernel(myStruct input) {
    unsigned int N = 2;
    while(someStatements) {
        data0 = new float3[N];
        // do somethings
        N *= 2;
    }
}
How can I do something like this in a CUDA kernel?
If you are going to run this code on either a compute capability 2.x or 3.x device, with a recent version of CUDA, your kernel code is very nearly correct. The C++ `new` operator is supported in CUDA 4.x and 5.0 on Fermi and Kepler hardware. Note that memory allocated using `new` or `malloc` inside a kernel comes from the runtime heap on the device. It has the lifespan of the context in which it was created, but you currently cannot access it directly from the CUDA host API (so not via `cudaMemcpy` or similar).

I turned your structure and kernel into a simple example which you can try for yourself to see how it works:
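A minimal sketch of such an example follows; it is not necessarily the code the answer originally contained. It assumes a compute capability 2.0 or later device. The kernel names `growKernel`, `gatherKernel`, and `freeKernel`, the loop bound standing in for `someStatements`, and the choice to pass the structure by pointer (so that the pointer stored in `data0` outlives the kernel) are all illustrative assumptions. Because `cudaMemcpy` cannot read runtime heap memory, a second kernel copies the data into an ordinary `cudaMalloc` buffer first.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

struct myStruct {
    float3 *data0, *data1;
};

// One thread repeatedly reallocates data0 on the device runtime heap,
// doubling its length each time. The bound N <= 8 is a stand-in for the
// question's someStatements.
__global__ void growKernel(myStruct *s, unsigned int *count)
{
    unsigned int N = 2;
    while (N <= 8) {
        float3 *fresh = new float3[N];
        if (fresh == NULL) break;            // treat a null result as heap exhaustion
        for (unsigned int i = 0; i < N; ++i)
            fresh[i] = make_float3(1.0f * i, 2.0f * i, 3.0f * i);
        delete[] s->data0;                   // release the previous, smaller array
        s->data0 = fresh;
        *count = N;
        N *= 2;
    }
}

// Copies from runtime heap memory into a buffer the host API can access.
__global__ void gatherKernel(const myStruct *s, float3 *out, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = s->data0[i];
}

// Heap allocations must be released from device code as well.
__global__ void freeKernel(myStruct *s)
{
    delete[] s->data0;
    s->data0 = NULL;
}

int main()
{
    myStruct *d_s = NULL;
    unsigned int *d_count = NULL;
    cudaMalloc((void **)&d_s, sizeof(myStruct));
    cudaMalloc((void **)&d_count, sizeof(unsigned int));
    cudaMemset(d_s, 0, sizeof(myStruct));    // data0 and data1 start as null pointers
    cudaMemset(d_count, 0, sizeof(unsigned int));

    growKernel<<<1, 1>>>(d_s, d_count);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "growKernel failed\n");
        return 1;
    }

    unsigned int n = 0;
    cudaMemcpy(&n, d_count, sizeof(n), cudaMemcpyDeviceToHost);
    if (n == 0) {
        fprintf(stderr, "no device allocation succeeded\n");
        return 1;
    }

    float3 *d_out = NULL;
    cudaMalloc((void **)&d_out, n * sizeof(float3));
    gatherKernel<<<1, n>>>(d_s, d_out, n);

    std::vector<float3> h_out(n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float3), cudaMemcpyDeviceToHost);
    for (unsigned int i = 0; i < n; ++i)
        printf("%u: (%f, %f, %f)\n", i, h_out[i].x, h_out[i].y, h_out[i].z);

    freeKernel<<<1, 1>>>(d_s);
    cudaDeviceSynchronize();
    cudaFree(d_out);
    cudaFree(d_count);
    cudaFree(d_s);
    return 0;
}
```

The sketch uses a single thread for the allocation to keep the ownership of `data0` unambiguous; with many threads, each one would need its own structure or some coordination over who allocates and who frees.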
A few points to note:
- You must compile for the architecture of your GPU (for example, I used `nvcc -arch=sm_30 -Xptxas="-v" -o dynstruct dynstruct.cu` to compile for a GTX 670 on Linux).
- `cudaMemcpy` is not able to copy directly from addresses in runtime heap memory. I was hoping this might be fixed in CUDA 5.0, but the most recent release candidate still has this restriction.