I'm wondering where this information comes from. When I access these variables, am I really reading a register, or something else? (My guess is that they are register values.)
I also wonder whether there is any speed benefit to storing them in a register myself:
__global__ void myKernel(int, float, int*) {
    const int reg1 = threadIdx.y;  // reg1 is then read in some 50 different places
    // ...
    // ...
}
Or would it be just as fast to read threadIdx.y some 50 different times?
The built-in variables reside in different locations on different compute capabilities. On more recent devices the information is packed into special-purpose registers. In the SASS assembly (cuobjdump -sass), the S2R instruction moves the value from a special register to a general-purpose register. Assigning the value to an automatic (local) variable does not oblige the compiler to keep it in a register for any period of time, so the explicit copy is unlikely to make a difference by itself. The compiler is likely already making the optimal register assignment for built-ins.
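One way to check this on your own toolchain is to compile both variants and compare the generated SASS. The following is only a minimal sketch: the kernel names, the arithmetic, and the 8x4 block size are made up for illustration, and whether the two kernels end up with identical machine code depends on the compiler version and target architecture.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int BX = 8;        // hypothetical block width
constexpr int BY = 4;        // hypothetical block height
constexpr int N  = BX * BY;  // one output element per thread

// Variant A: copy threadIdx.y into a local once, then reuse the local.
__global__ void cachedKernel(int *out)
{
    const int ty = threadIdx.y;
    out[ty * blockDim.x + threadIdx.x] = ty * 3 + ty * ty + ty;
}

// Variant B: read threadIdx.y directly at every use.
__global__ void directKernel(int *out)
{
    out[threadIdx.y * blockDim.x + threadIdx.x] =
        threadIdx.y * 3 + threadIdx.y * threadIdx.y + threadIdx.y;
}

int main()
{
    int *dA = nullptr;
    int *dB = nullptr;
    if (cudaMalloc(&dA, N * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&dB, N * sizeof(int)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    const dim3 block(BX, BY);
    cachedKernel<<<1, block>>>(dA);
    directKernel<<<1, block>>>(dB);

    // cudaMemcpy on the default stream waits for the kernels to finish.
    int hA[N];
    int hB[N];
    cudaError_t err = cudaMemcpy(hA, dA, sizeof(hA), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(hB, dB, sizeof(hB), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Both variants compute the same values; only the source style differs.
    bool same = true;
    for (int i = 0; i < N; ++i)
        same = same && (hA[i] == hB[i]);
    std::printf("%s\n", same ? "outputs match" : "outputs differ");

    cudaFree(dA);
    cudaFree(dB);
    return same ? 0 : 1;
}
```

Assuming the file is saved as builtin_demo.cu (a hypothetical name), building it with nvcc -o builtin_demo builtin_demo.cu and then running cuobjdump -sass builtin_demo lets you look for the S2R instructions in each kernel and see how many times the special register is actually read.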