I have an algorithm which consists of two major tasks. Both tasks are embarrassingly parallel.

Question

Asked: June 10, 2026 (2026-06-10T08:00:42+00:00)

I have an algorithm that consists of two major tasks. Both tasks are embarrassingly parallel, so I can port the algorithm to CUDA in one of the following ways.

    Kernel1<<<Blocks, Threads>>>();   // task 1
    cudaDeviceSynchronize();          // replaces the deprecated cudaThreadSynchronize()
    Kernel2<<<Blocks, Threads>>>();   // task 2

Or I can do the following:

    Kernel<<<Blocks, Threads>>>()
    {
        1. Threads work on task 1.
        2. Synchronize across the device.
        3. Start task 2.
    }

Note that with the first method we have to return to the CPU, while with the second method we have to synchronize across all blocks in CUDA. A paper in IPDPS 10 says that the second method, with proper care, can perform better. But in general, which method should be followed?


1 Answer

Editorial Team · Answer 1 · 2026-06-10T08:00:43+00:00

Outside of cooperative launches (the grid-wide sync in Cooperative Groups, available in CUDA 9 and later on supporting hardware), there is no officially supported method for synchronizing across thread blocks within a single kernel execution in the CUDA programming model. Ad hoc methods of doing so, in my experience, lead to brittle code that can behave incorrectly when circumstances change, such as running on different hardware or changing driver and CUDA release versions.

Just because something is published in an academic publication does not mean it is a safe idea for production code.

I recommend you stick with your method 1, and I ask you this: have you determined that separating your computation into two separate kernels is really causing a performance problem? Is the cost of a second kernel launch definitely the bottleneck?
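To check whether the second launch really is the bottleneck, method 1 can be timed directly. The following is a minimal sketch, not the asker's actual code: `task1`, `task2`, and `run_two_tasks` are hypothetical names, and the two kernels are toy stand-ins where task 2 reads results that other blocks produced in task 1. It relies on the fact that kernels launched into the same stream execute in launch order, so no host-side synchronization is needed between the two launches. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical task 1: square each element in place.
__global__ void task1(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];
}

// Hypothetical task 2: reads an element that a different thread (and possibly
// a different block) wrote in task 1, so all of task 1 must have finished.
__global__ void task2(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[n - 1 - i] + 1.0f;
}

// Runs both tasks as two kernels and returns the GPU time in milliseconds
// spent between the first launch and the end of the second kernel.
// Assumes input is non-empty.
float run_two_tasks(const std::vector<float>& input, std::vector<float>& output) {
    int n = static_cast<int>(input.size());
    size_t bytes = n * sizeof(float);

    float* d_a = nullptr;
    float* d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, input.data(), bytes, cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start);
    task1<<<blocks, threads>>>(d_a, n);
    // No host-side sync here: both kernels are in the default stream,
    // so task2 does not start until task1 has completed.
    task2<<<blocks, threads>>>(d_a, d_b, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait only once, after both kernels

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    output.resize(n);
    cudaMemcpy(output.data(), d_b, bytes, cudaMemcpyDeviceToHost);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_a);
    cudaFree(d_b);
    return ms;
}
```

Calling `run_two_tasks` on your real problem size and comparing the returned time against the time of each kernel on its own gives a rough idea of how much the second launch costs. If the difference is small relative to the kernel run times, a single fused kernel is unlikely to be worth the added risk.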
