I am currently trying to create a library with CUDA routines but I am


Asked: June 9, 2026 (2026-06-09T13:57:36+00:00)

I am currently trying to create a library with CUDA routines, but I am running into trouble. I will explain my problem using a fairly minimal example; my actual library will be larger.

I have successfully written test.cu, a source file containing a __global__ CUDA function and a wrapper around it (to allocate and copy memory). I can also successfully compile this file into a shared library using the following commands:

nvcc -c test.cu -o test.o -lpthread -lrt -lcuda -lcudart -Xcompiler -fPIC
gcc -m64 -shared -fPIC -o libtest.so test.o -lpthread -lrt -lcuda -lcudart -L/opt/cuda/lib64

The resulting libtest.so exports all my needed symbols.

I now compile my purely C main.c and link it against my library:

gcc -std=c99 main.c -o main -lpthread -ltest -L.

This step is also successful, but upon executing ./main all CUDA functions that are called return an error:

test.cu:17:cError(): cudaGetDeviceCount: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMalloc: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMemcpy: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMemcpy: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaFree: [38] no CUDA-capable device is detected

(Error messages are created through a debugging function of my own)
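
The cError() helper itself is not shown in the post. As a rough sketch only, a helper producing the same "file:line:cError(): call: [code] message" layout could look like the following. The name format_cuda_error and its signature are hypothetical; the status code and message are passed in as plain arguments so the sketch builds with an ordinary C compiler, whereas real CUDA code would take a cudaError_t and obtain the text from cudaGetErrorString().

```c
#include <stdio.h>

/* Hypothetical stand-in for the asker's cError() helper (not shown in the post).
 * Writes "file:line:cError(): call: [code] message" into buf.
 * In real CUDA code, `code` would be a cudaError_t and `msg` would come from
 * cudaGetErrorString(code); both are plain arguments here so that no CUDA
 * toolkit is needed to build the sketch.
 * Returns the value of snprintf(): the length the full message needs. */
int format_cuda_error(char *buf, size_t len, const char *file, int line,
                      const char *call, int code, const char *msg)
{
    return snprintf(buf, len, "%s:%d:cError(): %s: [%d] %s",
                    file, line, call, code, msg);
}

/* Convenience macro (also hypothetical) that records the call site. */
#define FORMAT_CUDA_ERROR(buf, len, call, code, msg) \
    format_cuda_error((buf), (len), __FILE__, __LINE__, (call), (code), (msg))
```

Calling format_cuda_error(buf, sizeof buf, "test.cu", 17, "cudaMalloc", 38, "no CUDA-capable device is detected") fills buf with a line in the same shape as the output above.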

I encountered the exact same problem earlier, when I was building an executable directly from test.cu, because I had forgotten to link against libpthread (-lpthread). But, as you can see above, this time I have linked everything against libpthread. According to ldd, both libtest.so and main depend on libpthread, as they should.

I am using CUDA 5 (yes, I do realize it is a beta) with gcc 4.6.3 and NVIDIA driver version 302.06.03 on Arch Linux.

Some help in solving this problem would be more than appreciated!

1 Answer

Editorial Team · Answer 1 · 2026-06-09T13:57:37+00:00

Here’s a trivial example…

// File: test.cu
#include <stdio.h>

__global__ void myk(void)
{
    printf("Hello from thread %d block %d\n", threadIdx.x, blockIdx.x);
}

extern "C"
void entry(void)
{
    myk<<<1,1>>>();
    printf("CUDA status: %d\n", cudaDeviceSynchronize());
}

Compile/link with nvcc -m64 -arch=sm_20 -o libtest.so --shared -Xcompiler -fPIC test.cu.

// File: main.c
#include <stdio.h>

void entry(void);

int main(void)
{
    entry();
}

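Compile/link with gcc -std=c99 -o main main.c -L. -ltest. (The library is listed after main.c because some linker configurations, for example those that default to --as-needed, only keep a library if an earlier input already needs its symbols.)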
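
The extern "C" in test.cu matters because nvcc compiles .cu files as C++, and without it entry would be exported under a mangled name that the C program cannot find. One way to keep the two declarations in sync is a small shared header with the usual __cplusplus guard. This is only a sketch; the file name test.h is an assumption and is not part of the example above.

```c
/* File: test.h (hypothetical shared header, not part of the original example) */
#ifndef TEST_H
#define TEST_H

#ifdef __cplusplus
extern "C" {            /* seen by nvcc/C++: give entry() C linkage */
#endif

/* Launches the kernel and prints the CUDA status; defined in test.cu. */
void entry(void);

#ifdef __cplusplus
}
#endif

#endif /* TEST_H */
```

With such a header, main.c would use #include "test.h" instead of declaring entry by hand, and test.cu would include it as well so that the compiler checks the definition against the declaration.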
