I need to perform a parallel reduction to find the min or max of an array on a CUDA device. I found a good library for this, called Thrust. It seems that you can only perform a parallel reduction on arrays in host memory. My data is in device memory. Is it possible to perform a reduction on data in device memory?
I can't figure out how to do this. Here is the documentation for Thrust: http://code.google.com/p/thrust/wiki/QuickStartGuide#Reductions. Thanks to all of you.
You can do reductions in Thrust on arrays that are already in device memory. All you need to do is wrap your raw device pointers in thrust::device_ptr wrappers and call one of the reduction algorithms, such as thrust::max_element or thrust::min_element, just as in the wiki you linked to. Note that the return value is also a device_ptr, so you can use it directly in other kernels by converting it back to a raw pointer with thrust::raw_pointer_cast.
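The code that originally accompanied this answer did not survive, so here is a minimal sketch of the approach described above. It assumes an int array allocated with cudaMalloc; the sample data, the variable names (dmem, dptr, dmaxptr, dres) and the subtract_max kernel are illustrative only, and CUDA error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/extrema.h>

// Hypothetical consumer kernel: reads the maximum through a raw device
// pointer and subtracts it from every element of the input.
__global__ void subtract_max(const int* max_ptr, const int* in, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i] - *max_ptr;
    }
}

int main()
{
    const int N = 8;
    const int host_data[N] = {3, 9, -4, 27, 0, 12, 27, 5};

    // Stand-in for data that already lives in device memory.
    int* dmem = nullptr;
    cudaMalloc(&dmem, N * sizeof(int));
    cudaMemcpy(dmem, host_data, N * sizeof(int), cudaMemcpyHostToDevice);

    // Wrap the raw device pointer so Thrust dispatches to the device backend.
    thrust::device_ptr<int> dptr(dmem);

    // Each call returns a device_ptr to the first extreme element found.
    thrust::device_ptr<int> dmaxptr = thrust::max_element(dptr, dptr + N);
    thrust::device_ptr<int> dminptr = thrust::min_element(dptr, dptr + N);

    // Dereferencing a device_ptr copies the single value back to the host.
    int max_value = *dmaxptr;
    int min_value = *dminptr;
    printf("max = %d at index %d, min = %d\n",
           max_value, static_cast<int>(dmaxptr - dptr), min_value);
    // prints "max = 27 at index 3, min = -4"

    // Convert back to a raw pointer to use the result inside another kernel
    // without a round trip through host memory.
    int* dres = thrust::raw_pointer_cast(dmaxptr);

    int* dout = nullptr;
    cudaMalloc(&dout, N * sizeof(int));
    subtract_max<<<1, N>>>(dres, dmem, dout, N);

    int host_out[N];
    cudaMemcpy(host_out, dout, N * sizeof(int), cudaMemcpyDeviceToHost);
    printf("first element minus max = %d\n", host_out[0]);

    cudaFree(dout);
    cudaFree(dmem);
    return 0;
}
```

Build it with nvcc (for example, nvcc example.cu -o example, where the file name is arbitrary). If you only need the value on the host, you can stop after dereferencing the returned device_ptr; the raw_pointer_cast step matters when a later kernel should read the result directly from device memory.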