In CUDA, when we talk about parallel threads executing the same code, is there any order to their execution?
For example:
Suppose I have 4 threads for a 1D array of 4 elements. All four threads perform some operation on some index of the array.
Will thread 4 always execute after thread 3, or is there no specific order of execution?
Thank you!
Generally, there is no guaranteed order of thread execution. It is wrong to rely on the order of threads when designing your algorithm.
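To make this concrete for the 4-element case from the question, here is a minimal sketch of a kernel whose result does not depend on thread order, because each thread touches only its own element. The names add_one and run_order_independent_demo and the input values are made up for illustration; they are not from any library.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread reads and writes only data[i] for its own i, so the result is
// the same no matter which thread the hardware happens to run first.
__global__ void add_one(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

// Hypothetical helper: runs the kernel on a 4-element array and reports
// whether every element was incremented exactly once.
bool run_order_independent_demo()
{
    const int n = 4;
    int host[n] = {10, 20, 30, 40};   // illustrative input values
    int *dev = nullptr;

    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) return false;
    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);

    add_one<<<1, n>>>(dev, n);        // one block of 4 threads

    // This blocking copy runs after the kernel on the default stream,
    // so the results are complete when it returns.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    return host[0] == 11 && host[1] == 21 && host[2] == 31 && host[3] == 41;
}

int main()
{
    std::printf("%s\n", run_order_independent_demo() ? "ok" : "failed");
    return 0;
}
```

If one thread really does need a value produced by another thread in the same block, the usual approach is to have the producers write first, put an explicit barrier such as __syncthreads() between the write and the read, and only then let the consumers read, rather than assuming that thread 3 finishes before thread 4.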