Say I have a matrix with a dimension of A*B on GPU, where B

Question

Asked: June 15, 2026

Say I have a matrix of dimension A*B on the GPU, where B (the number of columns) is the leading dimension, assuming C-style storage. Is there any method in CUDA (or cuBLAS) to transpose this matrix to Fortran style, where A (the number of rows) becomes the leading dimension?

It would be even better if the matrix could be transposed during the host->device transfer while keeping the original data unchanged.

1 Answer

Editorial Team · Answer 1 · 2026-06-15T18:12:06+00:00

The CUDA SDK includes a matrix transpose sample. Its code shows how to implement one, ranging from a naive implementation to optimized versions.

For example:

Naïve transpose

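// Tile sizes are compile-time constants in the SDK sample; the values below
// are assumptions for illustration. Launch with dim3 block(TILE_DIM, BLOCK_ROWS)
// and dim3 grid(width/TILE_DIM, height/TILE_DIM).
#define TILE_DIM   32
#define BLOCK_ROWS 8

// Assumes width and height are multiples of TILE_DIM (no bounds checks).
// nreps only repeats the copy for benchmarking; pass 1 for a single transpose.
__global__ void transposeNaive(float *odata, float *idata,
                               int width, int height, int nreps)
{
    int xIndex = blockIdx.x * TILE_DIM + threadIdx.x;
    int yIndex = blockIdx.y * TILE_DIM + threadIdx.y;
    int index_in  = xIndex + width * yIndex;   // row-major input element (yIndex, xIndex)
    int index_out = yIndex + height * xIndex;  // same element with height as the leading dimension

    for (int r = 0; r < nreps; r++)
    {
        // Each thread copies TILE_DIM/BLOCK_ROWS elements, one per row step.
        for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        {
            odata[index_out + i] = idata[index_in + i * width];
        }
    }
}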
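The index arithmetic in the kernel can be checked on the host. The following is a minimal CPU sketch, not part of the SDK sample; the name transposeHost is hypothetical. It applies the same index_in/index_out mapping to a row-major matrix with height rows and width columns, which may be handy as a reference when validating a GPU result.

```cpp
#include <cassert>
#include <vector>

// Host-side reference transpose (hypothetical helper, not from the CUDA SDK).
// idata: row-major, `height` rows by `width` columns (leading dimension = width).
// odata: the same values with `height` as the leading dimension, i.e. the
//        column-major (Fortran-style) layout of the original matrix.
void transposeHost(std::vector<float>& odata, const std::vector<float>& idata,
                   int width, int height)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Same mapping as the kernel:
            // index_in = x + width * y, index_out = y + height * x
            odata[y + height * x] = idata[x + width * y];
        }
    }
}
```

For a 2 x 3 input stored as {1, 2, 3, 4, 5, 6}, calling transposeHost(out, in, 3, 2) fills out with {1, 4, 2, 5, 3, 6}, which is the column-by-column order of the same matrix.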

As talonmies pointed out, cuBLAS matrix operations let you specify whether each operand is used as-is or transposed. For example, cublasDgemm() computes C = a * op(A) * op(B) + b * C; if you want A to be treated as transposed (A^T), you indicate this in the parameters ('N' for normal, 'T' for transposed).
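To see why the transpose flag can stand in for a physical transpose, here is a small host-side model of the op() convention. It is a sketch, not cuBLAS itself: refGemm, opAt and Op are hypothetical names, and all matrices are treated as column-major, which is what cuBLAS assumes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model of the op() convention; this is NOT cuBLAS.
enum class Op { N, T };

// Element (i, j) of op(M), where M is stored column-major with leading dimension ld.
inline double opAt(const std::vector<double>& M, int ld, Op op, int i, int j)
{
    return (op == Op::N) ? M[i + ld * j] : M[j + ld * i];
}

// C (m x n) = alpha * op(A) (m x k) * op(B) (k x n) + beta * C
void refGemm(Op opA, Op opB, int m, int n, int k, double alpha,
             const std::vector<double>& A, int lda,
             const std::vector<double>& B, int ldb,
             double beta, std::vector<double>& C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p) {
                acc += opAt(A, lda, opA, i, p) * opAt(B, ldb, opB, p, j);
            }
            C[i + ldc * j] = alpha * acc + beta * C[i + ldc * j];
        }
    }
}
```

A row-major 2 x 3 buffer {1, 2, 3, 4, 5, 6} looks to a column-major routine like a 3 x 2 matrix with leading dimension 3. Passing Op::T recovers the original 2 x 3 matrix without moving any data: refGemm(Op::T, Op::N, 2, 1, 3, 1.0, A, 3, x, 3, 0.0, y, 2) with x = {1, 1, 1} yields y = {6, 15}, the row sums. In the current cuBLAS API the corresponding flags are CUBLAS_OP_N and CUBLAS_OP_T. If a transposed copy in memory is really needed, cuBLAS also offers cublas<t>geam(), which computes C = alpha * op(A) + beta * op(B); consult the cuBLAS documentation for its exact arguments.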
