I have a sorted integer array on the device, e.g.:
[0,0,0,1,1,2,2]
And I want the offsets to each element in another array:
[0,3,5]
(since the first 0 is at position 0, the first 1 at position 3 and so on)
I know how many different elements there will be beforehand. How would you implement this efficiently in CUDA? I’m not asking for code, but for a high-level description of the algorithm you would implement to compute this transformation. I already had a look at the various functions in the thrust namespace, but could not think of any combination of thrust functions to achieve this. Also, does this transformation have a widely accepted name?
You can solve this in Thrust using thrust::unique_by_key_copy with thrust::counting_iterator. The idea is to treat your integer array as the keys argument to unique_by_key_copy and to use a sequence of ascending integers (i.e., counting_iterator) as the values. Because unique_by_key_copy keeps only the first element of each run of equal keys, it compacts the values array into the index of the first occurrence of each unique key. For your example input, the result is [0,3,5].