I have a simple kernel that does not use multiple events. I have written a CPU version of it and want to measure the difference in run time between the two. I don't know whether events are strictly meant for CUDA code, but I guess my example is simple enough, and contains nothing unusual, for it to be OK to time the CPU version with them as well. Opinions?
If you are measuring time on the CPU, nothing is better than high-performance counters. In Java, for example, you can measure time in the nanosecond range (e.g. with System.nanoTime()). Events are generally used for the GPU, because the start and stop events are added to the GPU queue, not the CPU one.
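To make that concrete, here is a minimal sketch of timing a CPU version with a host-side high-resolution clock, assuming the host code is C++. The kernel in the question was not posted, so add_cpu (an element-wise add) is only a hypothetical stand-in for it, and time_ms is an illustrative helper name, not part of any library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical CPU counterpart of a simple element-wise kernel
// (assumption: the real kernel from the question is not shown).
void add_cpu(const std::vector<float>& a, const std::vector<float>& b,
             std::vector<float>& out) {
    assert(a.size() == out.size() && b.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] + b[i];
    }
}

// Illustrative helper: average wall-clock time of one call to fn, in milliseconds.
// steady_clock is monotonic, so the interval is not affected by system clock changes.
template <typename F>
double time_ms(F&& fn, int repeats) {
    using clock = std::chrono::steady_clock;
    fn();  // one untimed warm-up call (caches, page faults)
    const auto start = clock::now();
    for (int r = 0; r < repeats; ++r) {
        fn();
    }
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / repeats;
}
```

You would call it as `time_ms([&] { add_cpu(a, b, out); }, 100)`. The result is in milliseconds, the same unit that cudaEventElapsedTime reports for the GPU side, so the two numbers can be compared directly. Keep in mind that the event pair only covers work placed in the GPU queue between the two records, so decide up front whether host-to-device copies should count on the GPU side.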