I’m trying to use texture memory to solve an interpolation problem, hopefully in a

Question

Asked: June 16, 2026

I’m trying to use texture memory to solve an interpolation problem, hopefully faster than with global memory. Since this is my first time using texture memory, I have simplified the problem to linear interpolation. I’m aware that there are smarter and faster ways to do linear interpolation than the one shown below.
Here is the file Kernels_Interpolation.cuh. The __device__ function linear_kernel_GPU is omitted for simplicity, but is correct.

texture<cuFloatComplex,1> data_d_texture;

__global__ void linear_interpolation_kernel_function_GPU_texture(cuComplex* result_d, float* x_in_d, float* x_out_d, int M, int N)
{    
   int j = threadIdx.x + blockDim.x * blockIdx.x;

   cuComplex datum;

   if(j<N)
   {
       result_d[j] = make_cuComplex(0.,0.);
       for(int k=0; k<M; k++)
       {
           datum = tex1Dfetch(data_d_texture,k);
           if (fabs(x_out_d[j]-x_in_d[k])<1.) result_d[j] = cuCaddf(result_d[j],cuCmulf(make_cuComplex(linear_kernel_GPU(x_out_d[j]-x_in_d[k]),0.),datum));
       }  
   } 
}

Here is the host function in Kernels_Interpolation.cu:

extern "C" void linear_interpolation_function_GPU_texture(cuComplex* result_d, cuComplex* data_d, float* x_in_d, float* x_out_d, int M, int N){

   cudaBindTexture(NULL, data_d_texture, data_d, M);

   dim3 dimBlock(BLOCK_SIZE,1); dim3 dimGrid(N/BLOCK_SIZE + (N%BLOCK_SIZE == 0 ? 0:1),1);
   linear_interpolation_kernel_function_GPU_texture<<<dimGrid,dimBlock>>>(result_d, x_in_d, x_out_d, M, N);

}

Finally, in the main program, the data_d array is allocated and initialized as follows

cuComplex* data_d;      cudaMalloc((void**)&data_d,sizeof(cuComplex)*M);
cudaMemcpy(data_d,data,sizeof(cuComplex)*M,cudaMemcpyHostToDevice);

The result_d array has length N.

Strangely, the output is computed correctly only for the first 16 locations, even though N>16; the remaining entries are all 0s, e.g.

result.r[0] 0.563585 result.i[0] 0.001251 
result.r[1] 0.481203 result.i[1] 0.584259
result.r[2] 0.746924 result.i[2] 0.820994
result.r[3] 0.510477 result.i[3] 0.708008
result.r[4] 0.362980 result.i[4] 0.091818
result.r[5] 0.443626 result.i[5] 0.984452
result.r[6] 0.378992 result.i[6] 0.011919
result.r[7] 0.607517 result.i[7] 0.599023
result.r[8] 0.353575 result.i[8] 0.448551
result.r[9] 0.798026 result.i[9] 0.780909
result.r[10] 0.728561 result.i[10] 0.876729
result.r[11] 0.143276 result.i[11] 0.538575
result.r[12] 0.216170 result.i[12] 0.861384
result.r[13] 0.994566 result.i[13] 0.993541
result.r[14] 0.295192 result.i[14] 0.270596
result.r[15] 0.092388 result.i[15] 0.377816
result.r[16] 0.000000 result.i[16] 0.000000
result.r[17] 0.000000 result.i[17] 0.000000
result.r[18] 0.000000 result.i[18] 0.000000
result.r[19] 0.000000 result.i[19] 0.000000

The rest of the code is correct: if I replace linear_interpolation_kernel_function_GPU_texture and linear_interpolation_function_GPU_texture with versions that use global memory, everything works.

I have verified that I can correctly read texture memory up to a certain location (which depends on M and N), for example 64, after which it returns 0s.

I have the same problem if I replace the cuComplex texture with a float one (forcing the data to be real).

Any ideas?

1 Answer

Editorial Team · Answer 1 · 2026-06-16T07:47:34+00:00

I can see one logical error in the following line of your program.

cudaBindTexture(NULL, data_d_texture, data_d, M);

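The last argument of cudaBindTexture is the size of the data in bytes, but you are passing the number of elements. As a result, only part of data_d is bound to the texture, which is consistent with the zeros you see beyond a certain index.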

You should try the following:

cudaBindTexture(NULL, data_d_texture, data_d, M * sizeof(cuComplex));
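To check the byte-size point in isolation, here is a minimal sketch that reads a buffer back through a texture and counts mismatches. It is not your code: it assumes the texture object API (cudaCreateTextureObject) instead of the texture reference API used in the question, since the latter is no longer available in recent CUDA toolkits, and it uses float2 in place of cuComplex. The names copy_from_texture and count_texture_mismatches are made up for illustration.

```cuda
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Copy every element of a linear-memory texture into a plain output array.
__global__ void copy_from_texture(cudaTextureObject_t tex, float2* out, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n) out[i] = tex1Dfetch<float2>(tex, i);
}

// Hypothetical helper: binds m float2 elements as a texture, reads them back
// through the texture, and returns how many elements differ from the input.
// Returns -1 if any CUDA call fails (for example, when no GPU is present).
int count_texture_mismatches(int m)
{
    std::vector<float2> host(m), back(m);
    for (int k = 0; k < m; ++k) host[k] = make_float2(k + 1.0f, -(k + 1.0f)); // nonzero data

    const size_t bytes = static_cast<size_t>(m) * sizeof(float2);
    float2 *data_d = nullptr, *out_d = nullptr;
    if (cudaMalloc(&data_d, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&out_d, bytes) != cudaSuccess) { cudaFree(data_d); return -1; }

    int result = -1;
    cudaTextureObject_t tex = 0;
    bool have_tex = false;

    if (cudaMemcpy(data_d, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess) {
        cudaResourceDesc res;
        std::memset(&res, 0, sizeof(res));
        res.resType = cudaResourceTypeLinear;
        res.res.linear.devPtr = data_d;
        res.res.linear.desc = cudaCreateChannelDesc<float2>();
        res.res.linear.sizeInBytes = bytes;   // size in BYTES, not the element count m

        cudaTextureDesc td;
        std::memset(&td, 0, sizeof(td));
        td.readMode = cudaReadModeElementType;

        have_tex = (cudaCreateTextureObject(&tex, &res, &td, nullptr) == cudaSuccess);
    }

    if (have_tex) {
        const int block = 128;
        const int grid = (m + block - 1) / block;
        copy_from_texture<<<grid, block>>>(tex, out_d, m);
        if (cudaGetLastError() == cudaSuccess &&
            cudaMemcpy(back.data(), out_d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess) {
            result = 0;
            for (int k = 0; k < m; ++k)
                if (back[k].x != host[k].x || back[k].y != host[k].y) ++result;
        }
        cudaDestroyTextureObject(tex);
    }

    cudaFree(out_d);
    cudaFree(data_d);
    return result;
}
```

With sizeInBytes given in bytes, count_texture_mismatches(64) should return 0 on a working GPU. If you set that field to m instead, only the first part of the buffer is covered by the texture, and I would expect the mismatch count to become nonzero, which is the same symptom as in your output.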
