I would like to copy float2 values back to the CPU. The results are correct on the GPU side, but somehow they are incorrect on the CPU side. Can someone please help me?
GPU code:
#pragma OPENCL EXTENSION cl_amd_printf : enable
__kernel void matM(__global float* input, int width, int height, __global float2* output) {
    int X = get_global_id(0);
    float2 V;
    V.x = input[X];
    V.y = input[X];
    output[X] = V;
    printf("%f\t %f\n", output[X].x, output[X].y);
}
CPU code:
output = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(cl_float2) * wid * ht, NULL, NULL);
clEnqueueReadBuffer(commands, output, CL_TRUE, 0, sizeof(cl_float2) * wid * ht, results, 0, NULL, NULL);
The printf inside the GPU kernel prints correct results, but the host-side results are incorrect.
Thanks for helping.
The cl_float2 datatype can be used on the host side to access float2 data, but my problem was something else.
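For reference, here is a rough sketch of how the host can read the two components once the buffer has been copied back. It uses a stand-in union named host_float2 (hypothetical, not part of OpenCL) so that it compiles without an OpenCL SDK. With the real headers you would include CL/cl.h and declare results as an array of cl_float2, whose s[0] and s[1] members correspond to .x and .y in the kernel. The helper name split_components is also made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cl_float2 (assumption: used only so this sketch builds
   without <CL/cl.h>). The real cl_float2 also exposes an s[2] member. */
typedef union { float s[2]; } host_float2;

/* Hypothetical helper: splits n packed float2 elements into separate
   x and y arrays after the read-back has completed. */
static void split_components(const host_float2 *results, size_t n,
                             float *xs, float *ys)
{
    assert(results && xs && ys);
    for (size_t i = 0; i < n; ++i) {
        xs[i] = results[i].s[0];  /* V.x in the kernel */
        ys[i] = results[i].s[1];  /* V.y in the kernel */
    }
}
```

The elements are packed as x0, y0, x1, y1, and so on, which is why the buffer size in the host code is sizeof(cl_float2) times the element count.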
There was a mismatch in the global IDs. I had two global IDs, so line 3 of the kernel should have been:
int X = get_global_id(0) + get_global_id(1);
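A note for anyone with a similar problem: summing the two IDs gives a unique index only when one of the two dimensions has size 1 (and no global offset is used). For a general 2D range, the usual row-major form is get_global_id(1) * width + get_global_id(0). The host-side C sketch below simulates both mappings to show the difference. The function names and MAX_ITEMS are made up for illustration, and it assumes width is the size of dimension 0.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 1024  /* arbitrary cap for this sketch */

/* Row-major flattening; in a kernel this corresponds to
   get_global_id(1) * width + get_global_id(0). */
static size_t flat_index(size_t x, size_t y, size_t width)
{
    assert(x < width);  /* x must stay inside its row */
    return y * width + x;
}

/* Plain sum of the two IDs; different (x, y) pairs can collide. */
static size_t summed_index(size_t x, size_t y)
{
    return x + y;
}

/* Counts how many distinct output slots a width x height range
   writes to under the chosen mapping. */
static size_t count_slots_written(size_t width, size_t height, int use_flat)
{
    unsigned char hit[MAX_ITEMS] = {0};
    size_t count = 0;
    assert(width * height <= MAX_ITEMS);
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            size_t i = use_flat ? flat_index(x, y, width)
                                : summed_index(x, y);
            if (!hit[i]) { hit[i] = 1; ++count; }
        }
    }
    return count;
}
```

For a 4 x 3 range, the row-major mapping touches all 12 slots, while the sum touches only 6, so half of the output buffer would never be written.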