I am trying to run a kernel on the GPU and do additional computation on the host (CPU). I see this effect:
The kernel alone takes around 2000 ms:
clEnqueueNDRangeKernel …
clFinish (or clWaitForEvents, I tried both)
I simulated additional computation on the CPU with sleep(10):
clEnqueueNDRangeKernel …
sleep(10);
clFinish (or clWaitForEvents)
In theory, the kernel should run on the GPU during the sleep and be finished by the time the 10-second sleep ends. But my timing shows that the whole sequence takes 12000 ms instead of 10000 ms.
Does clFinish or clWaitForEvents actually trigger the kernel to start, or am I missing something?
I'm using an AMD Fusion CPU/GPU and Linux.
Thanks a lot.
Try calling clFlush right after clEnqueueNDRangeKernel: http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clFlush.html