CUDA implicitly initialises when the first CUDA runtime function is called.
I’m timing my code and repeating the measurement 100 times in a loop (for ([100 times]) { [time CUDA code and log] }). Each iteration’s timing also needs to include the CUDA initialisation time, so I need to uninitialise CUDA after every iteration. How can I do this?
I’ve tried using cudaDeviceReset(), but it does not seem to have uninitialised CUDA.
Many thanks.
cudaDeviceReset is the canonical way to destroy a context in the runtime API (and calling cudaFree(0) is the canonical way to create a context). Those are the only levels of “re-initialization” available to a running process. There are other per-process events which happen when a process loads the CUDA driver and runtime libraries and connects to the kernel driver, but there is no way I am aware of to make those happen programmatically short of forking a new process.
But I really doubt you want, or should need, to account for this sort of setup time when calculating performance metrics anyway.
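For what it's worth, here is a minimal sketch of what that create/destroy cycle might look like inside a 100-iteration timing loop. It assumes a single GPU and host-side timing with std::chrono; the helper names check and time_context_create_ms are made up for illustration and are not part of the CUDA API. Only cudaFree, cudaDeviceReset and cudaGetErrorString are actual runtime calls.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: time context creation on the current device, in ms.
// Assumes no context exists yet (fresh process, or right after cudaDeviceReset).
static double time_context_create_ms() {
    auto t0 = std::chrono::steady_clock::now();
    check(cudaFree(0), "cudaFree(0)");  // forces the runtime to create a context
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    const int iterations = 100;
    for (int i = 0; i < iterations; ++i) {
        double ms = time_context_create_ms();
        std::printf("iteration %d: context creation took %.3f ms\n", i, ms);

        // ... the CUDA code being measured would be timed here ...

        // Destroy the context so the next iteration has to create it again.
        check(cudaDeviceReset(), "cudaDeviceReset");
    }
    return 0;
}
```

The first iteration will probably report a larger number than the rest, because it also includes the one-off per-process costs (loading the driver and runtime libraries) that cudaDeviceReset does not undo. If those costs have to be part of every measurement, the timed code would need to be launched as a separate process for each iteration.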