As far as I know, CUDA has streams, which make it possible to run memory transfers and kernel execution at the same time. Of course, the transfer and the kernel operate on different data. Can I do the same thing with OpenCL? When processing video, the bottleneck is sometimes the memory transfer.
Yes, you can overlap memory operations and kernel execution in OpenCL. Just set the blocking_read parameter of the clEnqueueReadBuffer function to CL_FALSE. The call then returns immediately, so you need to make sure that the transfer has completed before you operate on the data. Use events for that.