I have a vector called d_index that is computed in CUDA device memory, and I want to change just one value, like this:
d_index[columnsA-rowsA]=columnsA;
How can I do this without having to copy it to the system memory and then back to the device memory?
You could either launch a kernel on a <<<1,1>>> grid that changes only the desired element, or write that single element from the host with a one-element copy such as cudaMemcpy.
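A minimal sketch of both options follows. It assumes d_index is an array of int and that the index is within bounds. The names setElement, setWithKernel and setWithMemcpy are hypothetical helpers made up for illustration, not part of the CUDA API.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper kernel: a single thread writes one value.
__global__ void setElement(int *arr, int idx, int val)
{
    arr[idx] = val;
}

// Option 1: launch a one-thread kernel that performs the write on the device.
// Assumes d_index points to device memory holding int elements.
cudaError_t setWithKernel(int *d_index, int idx, int val)
{
    setElement<<<1, 1>>>(d_index, idx, val);
    return cudaGetLastError();  // reports launch errors only
}

// Option 2: copy exactly one element from host memory to the device.
// Only sizeof(int) bytes are transferred, not the whole vector.
cudaError_t setWithMemcpy(int *d_index, int idx, int val)
{
    return cudaMemcpy(d_index + idx, &val, sizeof(int),
                      cudaMemcpyHostToDevice);
}
```

For the case in the question, either setWithKernel(d_index, columnsA - rowsA, columnsA) or setWithMemcpy(d_index, columnsA - rowsA, columnsA) would do the update without moving the rest of the vector.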
If you only do this once, I don't think it makes much difference which version you use. If you run this code often, consider folding the array modification into some other kernel you already launch, to avoid the extra invocation overhead.
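Folding the write into an existing kernel might look like the sketch below. The kernel scaleAndPatch and its scaling work are hypothetical stand-ins for whatever kernel you already run; the only relevant part is that exactly one thread performs the extra write.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel that already runs as part of the pipeline:
// it scales `data` in place, and one thread also patches d_index.
__global__ void scaleAndPatch(float *data, int n, float factor,
                              int *d_index, int idx, int val)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;  // the kernel's regular work

    // Exactly one thread in the whole grid performs the single write.
    if (i == 0)
        d_index[idx] = val;
}
```

This is only safe if no other thread in the same launch reads d_index[idx], because there is no ordering between blocks within a kernel. A launch could look like scaleAndPatch<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f, d_index, columnsA - rowsA, columnsA), where d_data is an assumed device array.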