In CUDA there are __ballot(), __any(), __all(), __popc() and a bunch of lanemask functions for performing warp vote operations across all lanes of a warp (usually 32 lanes). Are there any equivalent functions in OpenCL that perform the same operations within one wavefront? If not, I may need to implement them myself as inline functions for my project.
According to the OpenCL v. 1.1 specification, section 6.11 “Built-in Functions”, I believe that the answer is no.
However, on NVIDIA GPUs you can probably use inline PTX to implement these operations (or at least this blogger was able to use inline PTX).
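If you do end up writing the helpers yourself, here is a rough sketch of what they need to compute. It is plain C rather than a kernel, and it is only a reference model of the semantics: the names (`popc32`, `ballot32`, `any32`, `all32`, `lanemask_lt`) and the `WAVE_SIZE` of 32 are my own assumptions, not anything from the OpenCL spec. In a real kernel the per-lane predicates would have to be exchanged some other way (for example through `__local` memory with a barrier), which is not shown here.

```c
#include <stdint.h>

/* Assumption: 32 lanes per wavefront, as with NVIDIA warps.
   Other hardware may use a different width. */
#define WAVE_SIZE 32u

/* Population count: number of set bits in x (what CUDA's __popc() returns). */
static inline uint32_t popc32(uint32_t x)
{
    uint32_t count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Ballot: bit i of the result is set iff lane i's predicate is non-zero.
   The array stands in for the per-lane values of one wavefront. */
static inline uint32_t ballot32(const int predicate[WAVE_SIZE])
{
    uint32_t mask = 0;
    for (uint32_t lane = 0; lane < WAVE_SIZE; ++lane) {
        if (predicate[lane] != 0)
            mask |= (uint32_t)1 << lane;
    }
    return mask;
}

/* any/all follow directly from the ballot mask. */
static inline int any32(uint32_t ballot) { return ballot != 0; }
static inline int all32(uint32_t ballot) { return ballot == 0xFFFFFFFFu; }

/* Mask of all lanes with an index lower than `lane` (lane must be < 32). */
static inline uint32_t lanemask_lt(uint32_t lane)
{
    return ((uint32_t)1 << lane) - 1u;
}
```

A common use is `popc32(ballot & lanemask_lt(lane))`, which gives each lane the number of lower-numbered lanes whose predicate was true, e.g. for stream compaction. As far as I know, a built-in `popcount` only appears in OpenCL 1.2, so on 1.1 a hand-written version like the one above is needed.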