What happens if I set the “count” parameter of cudaMemcpyAsync to zero, i.e. call cudaMemcpyAsync(dst, src, count, cudaMemcpyDeviceToHost, stream) with count equal to 0? In my code the function returns cudaSuccess, but is the copy still executed?
Err, yes, zero bytes are copied from source to destination.
But since copying zero bytes is indistinguishable from not copying at all, whether the copy is "executed" makes no observable difference.
If you’re asking whether any device-to-host communication actually takes place for a zero-sized buffer, that’s an implementation detail that is not specified. I wouldn’t bet money on a transfer happening, though, since that would be needlessly inefficient.
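If you want to check this on your own setup, here is a minimal sketch. The helper name zeroCountCopyIsNoOp, the 16-byte buffer and the 0xAB fill pattern are made up for illustration and are not from the question. It fills a host buffer with a known pattern, issues a zero-byte cudaMemcpyAsync from a zeroed device buffer, synchronizes the stream, and then checks that both calls reported cudaSuccess and that the host buffer is unchanged. It only shows the observable behaviour; it can't tell you whether the driver talks to the device internally.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): returns true if a zero-byte
// cudaMemcpyAsync reports success and leaves the destination buffer untouched.
bool zeroCountCopyIsNoOp() {
    const size_t n = 16;           // arbitrary buffer size for the experiment
    unsigned char host[n];
    unsigned char expected[n];
    std::memset(host, 0xAB, n);    // known pattern in the destination
    std::memset(expected, 0xAB, n);

    unsigned char* dev = nullptr;
    cudaStream_t stream = nullptr;
    if (cudaMalloc(&dev, n) != cudaSuccess) return false;
    if (cudaMemset(dev, 0, n) != cudaSuccess) { cudaFree(dev); return false; }
    if (cudaStreamCreate(&stream) != cudaSuccess) { cudaFree(dev); return false; }

    // The call under test: count == 0.
    cudaError_t copyErr =
        cudaMemcpyAsync(host, dev, 0, cudaMemcpyDeviceToHost, stream);
    // Wait for the stream so any (hypothetical) transfer would have completed.
    cudaError_t syncErr = cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(dev);

    // If any byte had been copied, host would contain zeros instead of 0xAB.
    return copyErr == cudaSuccess && syncErr == cudaSuccess &&
           std::memcmp(host, expected, n) == 0;
}

int main() {
    std::printf("zero-byte copy is a no-op: %s\n",
                zeroCountCopyIsNoOp() ? "yes" : "no");
    return 0;
}
```

Compile it with something like nvcc zero_copy.cu -o zero_copy (the file name is arbitrary). In practice it's simplest to treat a zero count as a harmless no-op rather than guarding every call site with an if (count > 0).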