I understand that CUDA's memory hierarchy includes shared memory, texture memory, constant memory, registers and, of course, the global memory that we allocate using cudaMalloc().
I've been searching through whatever documentation I can find, but I have yet to come across anything that explicitly explains what global memory is.
I believe that the global memory we allocate is on the GDDR of the graphics card itself and not in the RAM that is shared with the CPU, since one of the documents stated that the pointer cannot be dereferenced on the host side. Am I right?
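Global memory is a virtual address space that can be mapped to device memory (memory on the graphics card) or to page-locked (pinned) host memory. The latter requires compute capability (CC) > 1.0. Memory returned by cudaMalloc() is device memory, which is why the host cannot dereference that pointer.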
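To make the two mappings concrete, here is a minimal sketch that runs the same kernel once on a cudaMalloc() buffer and once on mapped pinned host memory. The kernel name `increment`, the `CHECK` macro, and the buffer size are made up for illustration, and the sketch assumes device 0 reports `canMapHostMemory`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper (not part of CUDA): abort on any runtime API error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// The kernel only sees a global-memory pointer; it does not know
// whether the backing store is device memory or pinned host memory.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 256;

    // Request host-memory mapping before the device context is created.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));

    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.canMapHostMemory) {
        printf("Device 0 cannot map host memory.\n");
        return 1;
    }

    // Case 1: global memory backed by device memory.
    // d_buf must not be dereferenced on the host; copy the data back instead.
    int *d_buf = nullptr;
    int h_copy[n];
    CHECK(cudaMalloc((void **)&d_buf, n * sizeof(int)));
    CHECK(cudaMemset(d_buf, 0, n * sizeof(int)));
    increment<<<1, n>>>(d_buf, n);
    CHECK(cudaMemcpy(h_copy, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost));

    // Case 2: global memory backed by mapped, page-locked host memory.
    // h_pinned is a normal host pointer; d_alias is the device-side view of it.
    int *h_pinned = nullptr;
    int *d_alias = nullptr;
    CHECK(cudaHostAlloc((void **)&h_pinned, n * sizeof(int), cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) h_pinned[i] = 41;
    CHECK(cudaHostGetDevicePointer((void **)&d_alias, h_pinned, 0));
    increment<<<1, n>>>(d_alias, n);
    CHECK(cudaDeviceSynchronize());  // wait before the host reads h_pinned

    printf("device memory: h_copy[0]   = %d\n", h_copy[0]);
    printf("mapped host:   h_pinned[0] = %d\n", h_pinned[0]);

    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_pinned));
    return 0;
}
```

In the second case the kernel's reads and writes travel over the bus to host RAM, so no cudaMemcpy() is needed, but each access is typically much slower than an access to device memory.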
Local, constant, and texture memory are allocated in global memory but accessed through different address spaces and caches.
On CC >= 2.0 the generic address space allows mapping of shared memory into the global address space; however, shared memory always resides in per-SM on-chip memory.
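As a rough illustration of generic addressing, the sketch below passes both a global-memory pointer and a shared-memory pointer to the same device function through a plain `const int *` parameter. The names `sum4` and `demo` are hypothetical, error checking is omitted for brevity, and the sketch assumes a device of CC 2.0 or later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Takes an unqualified pointer. With generic addressing the hardware
// resolves at run time whether p refers to shared or global memory.
__device__ int sum4(const int *p) {
    return p[0] + p[1] + p[2] + p[3];
}

__global__ void demo(const int *global_in, int *out) {
    __shared__ int tile[4];  // lives in per-SM on-chip memory

    // Stage a scaled copy of the input in shared memory.
    if (threadIdx.x < 4) tile[threadIdx.x] = global_in[threadIdx.x] * 10;
    __syncthreads();

    if (threadIdx.x == 0) {
        out[0] = sum4(global_in);  // pointer into global memory
        out[1] = sum4(tile);       // pointer into shared memory
    }
}

int main() {
    const int h_in[4] = {1, 2, 3, 4};
    int h_out[2] = {0, 0};
    int *d_in = nullptr;
    int *d_out = nullptr;

    cudaMalloc((void **)&d_in, sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    demo<<<1, 32>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("sum over global memory: %d\n", h_out[0]);
    printf("sum over shared memory: %d\n", h_out[1]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The shared-memory pointer is only meaningful inside the thread block that owns `tile`; being addressable through the generic space does not move the data off the SM.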