I converted a program from IDL to CUDA that performs some calculations on a 256×256×n cube of densities and renders a 2-D image. The program works correctly, but all the pre-processing (reading in the density cube, etc.) is still done in IDL, which passes the data to a C wrapper function (via call_external) that in turn calls CUDA.
I am now trying to optimize the program and would like to use the NVIDIA Visual Profiler to check my memory coalescing. Is there a way to get the Visual Profiler to run when the CUDA part of the program is called?
I currently can’t test anything standalone: there are far too many variables to hard-code into the CUDA function, and it cannot run without the values passed in from IDL to C to CUDA.
I do have it set up so I can run the IDL, have it stop and then manually call the C wrapper function instead of just running the IDL and having it automatically do everything.
Thanks
You can launch the application from the Visual Profiler. It only profiles the CUDA calls anyway, so the rest of the program does not matter. Optionally, you can use the start and stop profiling buttons to control when profiling begins and ends. It’s a simple approach, but it works in most setups.
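If launching the whole IDL session under the profiler turns out to be awkward, one possible fallback is to have the C wrapper save its inputs to a file once, then replay them from a small standalone program that the profiler can launch directly. Below is a minimal sketch in C, assuming the densities are single-precision floats laid out as one contiguous 256×256×n array. The names dump_cube and load_cube and the file layout are hypothetical, made up for illustration, and are not part of IDL or CUDA.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CUBE_DIM 256  /* cube is 256 x 256 x n, as described in the question */

/* Hypothetical helper: write the slice count n and the density cube to a
 * binary file. Assumes float densities. Returns 0 on success, -1 on failure. */
int dump_cube(const char *path, const float *cube, int n)
{
    assert(path != NULL && cube != NULL && n > 0);
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t count = (size_t)CUBE_DIM * CUBE_DIM * (size_t)n;
    int ok = fwrite(&n, sizeof n, 1, f) == 1 &&
             fwrite(cube, sizeof *cube, count, f) == count;
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: read a cube written by dump_cube. On success returns
 * a malloc'd buffer (caller frees) and stores the slice count in *out_n.
 * Returns NULL on any failure. */
float *load_cube(const char *path, int *out_n)
{
    assert(path != NULL && out_n != NULL);
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    int n = 0;
    float *cube = NULL;
    if (fread(&n, sizeof n, 1, f) == 1 && n > 0) {
        size_t count = (size_t)CUBE_DIM * CUBE_DIM * (size_t)n;
        cube = malloc(count * sizeof *cube);
        if (cube && fread(cube, sizeof *cube, count, f) != count) {
            free(cube);  /* short read: treat the file as invalid */
            cube = NULL;
        }
    }
    fclose(f);
    if (cube) *out_n = n;
    return cube;
}
```

The idea would be to call dump_cube from the C wrapper just before it hands the data to CUDA, run the IDL program once, and then build a small main that calls load_cube and passes the result to the same CUDA entry point. Any other parameters IDL passes in would need to be saved and restored in the same way.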