I’m using Visual Studio 2010 and a GTX480 with compute capability 2.0.
I have tried setting sm to 2.0, but when I attempt to use printf() in a kernel, I get:
error : calling a host function("printf") from a __device__/__global__ function("test") is not allowed
This is my code:
#include "util\cuPrintf.cu"
#include <cuda.h>
#include <iostream>
#include <stdio.h>
#include <conio.h>
#include <cuda_runtime.h>
__global__ void test (void)
{
printf("Hello, world from the device!\n");
}
void main(void)
{
test<<<1,1>>>();
getch();
}
I found an example in the “CUDA_C_Programming_Guide”, page 106, “B.16.4 Examples”.
It finally works for me 😀 Thank you.
#include "stdio.h"
#include <conio.h>
// printf() is only supported
// for devices of compute capability 2.0 and higher
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#define printf(f, ...) ((void)(f, __VA_ARGS__),0)
#endif
__global__ void helloCUDA(float f)
{
printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}
int main()
{
helloCUDA<<<1, 5>>>(1.2345f);
cudaDeviceSynchronize();
getch();
return 0;
}
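To use printf in kernel code, you have to do three things:
1. Make sure that cstdio or stdio.h is included in the kernel compilation unit. CUDA implements kernel printf by overloading, so you must include that file.
2. Compile for compute capability 2.0 or higher and run on a GPU that supports it (pass something like -arch=sm_20 to nvcc, or the IDE equivalent in Visual Studio or Nsight Eclipse Edition).
3. Make sure the kernel has finished running before the host exits, by synchronizing on the host (with cudaDeviceSynchronize, for example).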
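As a rough sketch of those three points together, something like the following should work. The names helloFromDevice and launchHello are made up for illustration, and the error checks are an addition that the original example does not have. The build line in the comment assumes a Fermi-era toolkit; newer CUDA toolkits no longer accept sm_20, so use the value that matches your GPU.

```cuda
// Build (assumption, adjust -arch to your GPU): nvcc -arch=sm_20 hello.cu -o hello
#include <cstdio>           // point 1: needed for the device-side printf overload
#include <cuda_runtime.h>

// Hypothetical kernel: each thread prints its own index.
__global__ void helloFromDevice(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

// Hypothetical helper: launches the kernel and returns the first error seen.
cudaError_t launchHello(int threads)
{
    helloFromDevice<<<1, threads>>>(1.2345f);

    // Catches launch failures, e.g. a binary built for the wrong architecture.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;

    // point 3: wait for the kernel to finish; device printf output
    // is flushed to the host at synchronization points like this one.
    return cudaDeviceSynchronize();
}

int main()
{
    cudaError_t err = launchHello(5);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the program prints nothing and reports no error, the usual suspect is the missing synchronization; if it reports an error right after the launch, check that the -arch value matches the card.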