I copied a hello world program in CUDA from this site: http://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialHelloWorld
The code is:
#include "util/cuPrintf.cu"
#include <stdio.h>
__global__ void device_greetings(void)
{
    cuPrintf("Hello, world from the device!\n");
}
int main(void)
{
    // greet from the host
    printf("Hello, world from the host!\n");
    // initialize cuPrintf
    cudaPrintfInit();
    // launch a kernel with a single thread to greet from the device
    device_greetings<<<1,1>>>();
    // display the device's greeting
    cudaPrintfDisplay();
    // clean up after cuPrintf
    cudaPrintfEnd();
    return 0;
}
Then I compiled it using nvcc hello_world.cu -o hello_world. However, I only see the hello from the host and not the one from the device.
I even tried
printf("{CudaPrintfInt => %s}\n",cudaGetErrorString(cudaPrintfInit()));
printf("{cudaPrintfDisplay => %s}\n",cudaGetErrorString(cudaPrintfDisplay(stdout, true)));
and compiled with nvcc -arch=sm_11 hello_world.cu -o hello_world. However, I get:
$ ./hello_world
Hello, world from the host!
{CudaPrintfInt => initialization error}
{cudaPrintfDisplay => __global__ function call is not configured}
$
The graphics card is:
$/sbin/lspci -v | grep VGA
07:01.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) (prog-if 00 [VGA controller])
and the CUDA version is 4:
$ ls /usr/local/cuda/lib/
libcublas.so libcudart.so.4.0.17 libcurand.so.4 libnpp.so
libcublas.so.4 libcufft.so libcurand.so.4.0.17 libnpp.so.4
libcublas.so.4.0.17 libcufft.so.4 libcusparse.so libnpp.so.4.0.17
libcudart.so libcufft.so.4.0.17 libcusparse.so.4
libcudart.so.4 libcurand.so libcusparse.so.4.0.17
“If you are on a CC 2.0 GPU, you don’t need cuPrintf at all — CUDA has printf built-in for CC-2.0 and higher GPUs. So just replace your call to cuPrintf for the actual printf” (source)
Put your code this way, just to check what is causing this problem.
Here it says that this happens because:
“The device function being invoked (usually via cudaLaunch()) was not previously configured via the cudaConfigureCall() function.”
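As a rough sketch of how both suggestions could be combined, the program below uses the built-in device printf mentioned in the quote and checks every runtime call for an error. It first asks the runtime how many CUDA devices it can see, since the lspci output in the question lists only a Matrox controller and an "initialization error" may simply mean that no usable CUDA device was found. This is an illustration under assumptions, not a confirmed fix: it assumes a GPU of compute capability 2.0 or higher and a toolkit that provides cudaDeviceSynchronize() (CUDA 4.0 or later).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void device_greetings(void)
{
    // Built-in device printf; assumes compute capability 2.0 or higher.
    printf("Hello, world from the device!\n");
}

int main(void)
{
    // greet from the host
    printf("Hello, world from the host!\n");

    // Ask the runtime whether any CUDA-capable device is visible.
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (device_count == 0) {
        fprintf(stderr, "No CUDA-capable device found.\n");
        return 1;
    }

    // launch a kernel with a single thread to greet from the device
    device_greetings<<<1, 1>>>();

    // Catch errors raised by the launch itself (e.g. a bad configuration).
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Wait for the kernel to finish; this also flushes the device printf buffer.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    return 0;
}
```

With the toolkit from the question, this would be compiled with an architecture flag that enables device printf, for example nvcc -arch=sm_20 hello_world.cu -o hello_world (adjust the -arch value to the actual GPU). If the program stops at the device count check, the problem lies with the hardware or driver setup rather than with the kernel code.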