Code like this:
__constant char a[1] = "x";
...
__local char b[1];
async_work_group_copy(b, a, 1, 0);
throws a compile error:
no instance of overloaded function "async_work_group_copy" matches the argument list
So it seems that this function cannot be used to copy from the __constant address space. Am I right? If so, what’s the preferred way to copy __constant data to __local memory for faster access? Right now I use a simple for loop in which each work-item copies several elements.
async_work_group_copy() is defined to copy between local and global memory only (see here: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/). As far as I know, there is no built-in way to do a bulk copy from constant to local memory. Maybe the reason is that constant memory is cached on all GPUs that I know of, which essentially means that it works at the same speed as local memory.
The vloadn() family of functions can load whole vectors from any address space, including constant, so it may partially cover what you need. However, it is not a bulk copy.
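For reference, the manual loop mentioned in the question could look roughly like the sketch below. It is only an illustration: the names table, TABLE_SIZE, copy_constant_to_local and use_table are made up, and the #ifndef block at the top is an assumed host-side shim that lets the same file build as plain C with a single emulated work-item. An OpenCL compiler defines __OPENCL_VERSION__ and skips it.

```c
/* Host-side shim (assumption, for illustration only): emulates a work-group
   of one work-item so the sketch also builds as plain C. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __constant const
#define __local
#define __global
#define CLK_LOCAL_MEM_FENCE 0
static size_t get_local_id(unsigned dim)   { (void)dim; return 0; }
static size_t get_local_size(unsigned dim) { (void)dim; return 1; }
static void barrier(int flags)             { (void)flags; }
#endif

#define TABLE_SIZE 8  /* hypothetical table length */

/* Hypothetical lookup table living in the __constant address space. */
__constant char table[TABLE_SIZE] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' };

/* Cooperative copy: work-item i copies elements i, i + group size, ... */
void copy_constant_to_local(__local char *dst, __constant char *src, size_t n)
{
    size_t lid = get_local_id(0);
    size_t lsz = get_local_size(0);
    for (size_t i = lid; i < n; i += lsz)
        dst[i] = src[i];
    /* Every work-item must reach the barrier before any of them reads dst. */
    barrier(CLK_LOCAL_MEM_FENCE);
}

__kernel void use_table(__global char *out)
{
    __local char cache[TABLE_SIZE];
    copy_constant_to_local(cache, table, TABLE_SIZE);

    /* From here on, all work-items can read the local copy. */
    size_t lid = get_local_id(0);
    out[lid] = cache[lid % TABLE_SIZE];
}
```

The strided loop keeps neighbouring work-items on neighbouring elements, and the barrier is what makes the copy safe to read afterwards. Whether this is actually faster than reading the __constant data directly depends on the device, so it is worth measuring both.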