Is there a way to do something like this?
int length = 1000;
float *h_input = new float[length * 100];
size_t bytes = length * 100 * sizeof(float);
cl_mem m_input = clCreateBuffer(context, CL_MEM_READ_WRITE, bytes, NULL, &err);
cl_mem m_output = clCreateBuffer(context, CL_MEM_READ_WRITE, bytes, NULL, &err);
clEnqueueReadBuffer (queue, m_input, true, 0, bytes, h_input, 0, NULL, NULL);
for (int i = 0; i < 100; i++)
{
    some_function(length, m_input + i, m_output + i);
}
I have done some naive testing of this and it does not seem to work. This is the error I get.
invalid use of incomplete type 'struct _cl_mem'
Are there any workarounds for this other than passing i as an extra parameter? Introducing the extra parameter would require changing upstream code all the way down to the kernel.
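For context, the error appears to come from cl_mem being a pointer to a struct that the header only declares and never defines, so the compiler cannot compute the element size it needs for pointer arithmetic. Below is a minimal sketch that reproduces the situation without OpenCL; opaque_mem, handle_t and pass_through are hypothetical names used only for illustration.

```cpp
// Stand-in for the way cl.h declares `struct _cl_mem` without defining it.
struct opaque_mem;               // incomplete type: its size is unknown
typedef opaque_mem *handle_t;    // plays the role of cl_mem

// Copying, passing and comparing the handle is fine, because none of
// these operations need the size of the pointed-to type.
handle_t pass_through(handle_t h)
{
    // handle_t next = h + 1;    // would not compile: pointer arithmetic
    //                           // needs sizeof(opaque_mem)
    return h;
}
```

The handle can therefore only be handed to API functions as-is; any offset has to be expressed through the API itself.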
EDIT Added more information for clarity.
The offset for m_input can be worked around by calling clEnqueueReadBuffer with the proper offset (even though that may be costlier than a single call). However, m_output is reused later, so transferring it back to the host is not an option.
EDIT My Google skills have failed me.
But I have found the answer by looking at cl.h: clCreateSubBuffer is the answer. There are no answers yet, so I will accept the first answer with sample code using clCreateSubBuffer().