I’m trying to use multiple CUDA devices from multiple OpenMP threads. The devices are

Question

Asked: June 14, 2026 (2026-06-14T05:15:23+00:00)

I’m trying to use multiple CUDA devices from multiple OpenMP threads. The devices are initialized (i.e. memory is allocated on them) from the main thread, and I then call cudaSetDevice from different threads to launch kernels on different devices. Threads do not share devices; each thread has exclusive access to its own device.

From what I understand, this should work fine. However, as soon as I launch a kernel on a device from an OpenMP thread that is not the main thread (i.e. omp_get_thread_num() != 0), I get an “invalid device ordinal” error from CUDA:

kernel<<<...>>>(...);
error = cudaDeviceSynchronize(); // returns cudaSuccess
error = cudaGetLastError(); // returns invalid device ordinal error

Am I missing something? Has anyone seen something like this before? I’m using CUDA 5.0.

1 Answer

Editorial Team · Answer 1 · 2026-06-14T05:15:24+00:00

Just to close this issue: the problem was that I was using cudaGetLastError to check for errors after a kernel launch, but I was not checking the return value of one earlier call. As a result, cudaGetLastError, called after the kernel launch, was returning the error code left behind by an earlier call to cudaGetDeviceInfo, and I mistakenly assumed it came from the launch itself. If you see this error, make sure you are checking the error values returned by all previous calls to the CUDA API.
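For anyone who wants a concrete starting point, here is a minimal sketch of checking every runtime call, so that an error shows up at the call that caused it rather than at a later cudaGetLastError. The CUDA_CHECK macro, the fill kernel, and the buffer size are hypothetical names and values made up for illustration; they are not from the code in the question. It assumes one OpenMP thread per device and an nvcc build with OpenMP enabled for the host compiler (for example -Xcompiler -fopenmp with GCC).

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <omp.h>

// Hypothetical helper (not from the original post): report file, line and
// the failing expression as soon as a runtime call returns an error.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,       \
                         __LINE__, #call, cudaGetErrorString(err_));       \
            std::exit(EXIT_FAILURE);                                       \
        }                                                                  \
    } while (0)

// Placeholder kernel: writes each element's index into the buffer.
__global__ void fill(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

int main() {
    const int n = 1 << 20;  // arbitrary size chosen for this sketch
    int device_count = 0;
    CUDA_CHECK(cudaGetDeviceCount(&device_count));

    // Allocate on every device from the main thread, as in the question.
    std::vector<int *> buffers(device_count, nullptr);
    for (int dev = 0; dev < device_count; ++dev) {
        CUDA_CHECK(cudaSetDevice(dev));
        CUDA_CHECK(cudaMalloc(&buffers[dev], n * sizeof(int)));
    }

    // One loop iteration per device; each iteration selects its own device
    // and checks every call it makes.
    #pragma omp parallel for num_threads(device_count)
    for (int dev = 0; dev < device_count; ++dev) {
        CUDA_CHECK(cudaSetDevice(dev));
        fill<<<(n + 255) / 256, 256>>>(buffers[dev], n);
        CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
        CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while running
    }

    // Release the buffers from the main thread.
    for (int dev = 0; dev < device_count; ++dev) {
        CUDA_CHECK(cudaSetDevice(dev));
        CUDA_CHECK(cudaFree(buffers[dev]));
    }
    return 0;
}
```

Because each call is checked where it is made, a failing device query or cudaSetDevice would stop the program at that line instead of surfacing later as a confusing result from cudaGetLastError after an unrelated kernel launch.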
