In a 2-d or 3-d CUDA block, how are threads grouped into warps?

Question

Asked: June 18, 2026

In a 2-d or 3-d CUDA block, how are threads grouped into warps? My assumption is that they iterate first by x, then y, then z. For example, in threads with <z,y,x>, <0,0,[0-31]> is a warp, and so is <0,1,[0-31]>, etc. Is this correct?

1 Answer

Editorial Team · Answer 1 · 2026-06-18T16:01:10+00:00

Yes, that is correct. When warps (groups of 32 threads that execute together) are formed, threads are grouped first by X, then Y, then Z thread coordinate. This has implications for coalescing: arrange your use of thread coordinates in matrix subscripts so that warp-adjacent threads (typically those adjacent in X) access adjacent elements of the matrix, by using threadIdx.x or a value derived from it in the most rapidly varying matrix dimension. We typically want data[z][y][x], not data[x][y][z].
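To make the ordering concrete, here is a minimal host-side C++ sketch of the index arithmetic, so it can be checked without a GPU. It assumes the usual linearization x + y*Dx + z*Dx*Dy for a block of size (Dx, Dy, Dz), with warps formed from consecutive linear IDs. `Dim3`, `linearThreadId`, `warpId` and `flatIndex` are hypothetical names for illustration; inside a kernel you would use `threadIdx` and `blockDim` instead. Note that the example in the question, where <0,0,[0-31]> and <0,1,[0-31]> are each a warp, holds when the block's X dimension is a multiple of 32. With a narrower block, a warp wraps into the next Y row.

```cpp
#include <cstddef>

// Hypothetical host-side stand-in for CUDA's dim3 (threadIdx / blockDim).
struct Dim3 { unsigned x, y, z; };

constexpr unsigned kWarpSize = 32;  // 32 threads per warp, as stated above

// Linear thread ID within a block: x varies fastest, then y, then z.
constexpr unsigned linearThreadId(Dim3 t, Dim3 block) {
    return t.x + t.y * block.x + t.z * block.x * block.y;
}

// Warps are formed from consecutive linear thread IDs.
constexpr unsigned warpId(Dim3 t, Dim3 block) {
    return linearThreadId(t, block) / kWarpSize;
}

// Flat offset of data[z][y][x] in a row-major array.
// i = {x, y, z} index, extent = {nx, ny, nz} array size.
constexpr std::size_t flatIndex(Dim3 i, Dim3 extent) {
    return (static_cast<std::size_t>(i.z) * extent.y + i.y) * extent.x + i.x;
}

// blockDim = (32, 4, 2): each row of 32 x-values is exactly one warp.
static_assert(warpId({0, 0, 0}, {32, 4, 2}) == 0, "first row starts warp 0");
static_assert(warpId({31, 0, 0}, {32, 4, 2}) == 0, "first row ends in warp 0");
static_assert(warpId({0, 1, 0}, {32, 4, 2}) == 1, "next y row is warp 1");
static_assert(warpId({0, 0, 1}, {32, 4, 2}) == 4, "next z slice starts warp 4");

// blockDim = (16, 16, 1): one warp covers two consecutive y rows.
static_assert(warpId({15, 1, 0}, {16, 16, 1}) == 0, "rows y=0,1 share warp 0");
static_assert(warpId({0, 2, 0}, {16, 16, 1}) == 1, "row y=2 starts warp 1");

// Threads adjacent in x touch adjacent elements of data[z][y][x].
static_assert(flatIndex({5, 2, 1}, {64, 8, 4}) + 1 ==
              flatIndex({6, 2, 1}, {64, 8, 4}), "x is the contiguous axis");
```

The last check is the coalescing point from the answer: with x as the innermost subscript, the 32 threads of a warp read 32 consecutive elements, whereas data[x][y][z] would put a stride of ny*nz elements between neighbouring threads.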
