It seems that printf doesn’t work inside the Kernel of a cuda code
#include "Common.h"
#include<cuda.h>
#include <stdio.h>
__device__ __global__ void Kernel(float *a_d , float *b_d ,int size)
{
int idx = threadIdx.x ;
int idy = threadIdx.y ;
//Allocating memory in the share memory of the device
__shared__ float temp[16][16];
//Copying the data to the shared memory
temp[idy][idx] = a_d[(idy * (size+1)) + idx] ;
printf("idx=%d, idy=%d, size=%d\n", idx, idy, size);
for(int i =1 ; i<size ;i++) {
if((idy + i) < size) { // NO Thread divergence here
float var1 =(-1)*( temp[i-1][i-1]/temp[i+idy][i-1]);
temp[i+idy][idx] = temp[i-1][idx] +((var1) * (temp[i+idy ][idx]));
}
__syncthreads(); //Synchronizing all threads before Next iterat ion
}
b_d[idy*(size+1) + idx] = temp[idy][idx];
}
when compiling, it says:
error: calling a host function("printf") from a __device__/__global__ function("Kernel") is not allowed
The cuda version is 4
Quoting the CUDA Programming Guide “Formatted output is only supported by devices of compute capability 2.x and higher“. See the programming guide for additional information.
Devices of compute capability < 2.x can use cuPrintf.
If you are on a 2.x and above device and you are trying to use printf make sure you have specified arch=sm_20 (or higher). The default is sm_10 which does not have sufficient features to support printf.
NVIDIA offers three source level debuggers for CUDA. You may find these more useful than printf for inspecting variables.
– Nsight Visual Studio Edition CUDA Debugger
– Nsight Eclipse Edition CUDA Debugger
– cuda-gdb