Let
import pyopencl as cl
import pyopencl.array as cl_array
import numpy
a = numpy.random.rand(50000).astype(numpy.float32)
mf = cl.mem_flags
What is the difference between
a_gpu = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
and
a_gpu = cl_array.to_device(self.ctx, self.queue, a)
?
And what is the difference between
result = numpy.empty_like(a)
cl.enqueue_copy(self.queue, result, result_gpu)
and
result = result_gpu.get()
?
Buffers are CL's version of `malloc`, while `pyopencl.array.Array` is a workalike of numpy arrays on the compute device.

So for the second version of the first part of your question, you may write `a_gpu + 2` to get a new array that has 2 added to each number in your array, whereas in the case of the `Buffer`, PyOpenCL only sees a bag of bytes and cannot perform any such operation.

The second part of your question is the same in reverse: if you've got a PyOpenCL array, `.get()` copies the data back and converts it into a (host-based) numpy array. Since numpy arrays are one of the more convenient ways to get contiguous memory in Python, the second variant with `enqueue_copy` also ends up in a numpy array. But note that you could have copied this data into an array of any size (as long as it's big enough) and any type: the copy is performed as a bag of bytes, whereas `.get()` makes sure you get the same size and type on the host.

Bonus fact: There is of course a `Buffer` underlying each PyOpenCL array. You can get it from the `.data` attribute.
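
To make the "bag of bytes" point concrete, here is a minimal host-only sketch using plain numpy. It does not touch OpenCL at all: the array `a` merely stands in for device data, and the names `typed`, `raw`, and `recovered` are made up for illustration. The typed copy plays the role of `.get()`, and the byte-wise copy plays the role of `enqueue_copy` into a host array of a different type.

```python
import numpy

# Stand-in for the device data: four float32 values (16 bytes).
a = numpy.array([1.0, 2.0, 3.0, 4.0], dtype=numpy.float32)

# Typed copy, analogous to result_gpu.get(): same shape and dtype.
typed = a.copy()

# Byte-wise copy, analogous to enqueue_copy into a host array of a
# different type: the 16 bytes land in an int32 array unchanged.
raw = numpy.empty(4, dtype=numpy.int32)
raw.view(numpy.uint8)[:] = a.view(numpy.uint8)

print(typed.dtype, typed.shape)
print(raw)  # the floats' bit patterns read as integers, not 1, 2, 3, 4

# Reinterpreting the same bytes as float32 recovers the original values.
recovered = raw.view(numpy.float32)
print(recovered)
```

Nothing in the byte-wise path checks that the destination type matches the source, which is why `enqueue_copy` will happily fill any sufficiently large host array, while `.get()` always hands back an array with the shape and dtype of the PyOpenCL array.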