The Archive Base

Editorial Team
Asked: June 4, 2026

I want to create a 3D float array in CUDA. Here is my code:

#define  SIZE_X 128 //numbers in elements
#define  SIZE_Y 128
#define  SIZE_Z 128
typedef float  VolumeType;
cudaExtent volumeSize = make_cudaExtent(SIZE_X, SIZE_Y, SIZE_Z); //The first argument should be SIZE_X*sizeof(VolumeType)??

float *d_volumeMem;
cutilSafeCall(cudaMalloc((void**)&d_volumeMem, SIZE_X*SIZE_Y*SIZE_Z*sizeof(float)));

// ... assign values to d_volumeMem on the GPU

cudaArray *d_volumeArray = 0;
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<VolumeType>();
cutilSafeCall( cudaMalloc3DArray(&d_volumeArray, &channelDesc, volumeSize) ); 
cudaMemcpy3DParms copyParams = {0};
copyParams.srcPtr = make_cudaPitchedPtr((void*)d_volumeMem, SIZE_X*sizeof(VolumeType), SIZE_X, SIZE_Y);
copyParams.dstArray = d_volumeArray;
copyParams.extent = volumeSize;
copyParams.kind = cudaMemcpyDeviceToDevice;
cutilSafeCall( cudaMemcpy3D(&copyParams) ); 

Actually, my program runs well, but I’m not sure the result is right. Here is my problem: the CUDA library documentation says that the first parameter of make_cudaExtent is “Width in bytes” and that the other two are the height and depth in elements. So I think that in my code above, the fifth line should be

cudaExtent volumeSize = make_cudaExtent(SIZE_X*sizeof(VolumeType), SIZE_Y, SIZE_Z); 

But with this change, cutilSafeCall( cudaMemcpy3D(&copyParams) ); fails with the error “invalid argument”. Why?

Another puzzle is the struct cudaExtent. As the CUDA library documentation states, its component width stands for “Width in elements when referring to array memory, in bytes when referring to linear memory”. So I think that when my code refers to volumeSize.width, it should be a number of elements. However, if I use

 cudaExtent volumeSize = make_cudaExtent(SIZE_X*sizeof(VolumeType), SIZE_Y, SIZE_Z); 

In that case volumeSize.width would be SIZE_X*sizeof(VolumeType) (128*4 = 512), which is a number of bytes instead of a number of elements.

Many CUDA SDK samples use char as the VolumeType, so they simply pass SIZE_X as the first argument to make_cudaExtent. But my type is float. Could anyone tell me the right way to create a cudaExtent if I need it to create a 3D array? Thanks a lot!

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 8:43 am

    Let’s review what the documentation for cudaMemcpy3D says:

    The extent field defines the dimensions of the transferred area in
    elements. If a CUDA array is participating in the copy, the extent is
    defined in terms of that array’s elements. If no CUDA array is
    participating in the copy then the extents are defined in elements of
    unsigned char.

    and similarly the documentation for cudaMalloc3DArray notes:

    All values are specified in elements

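    So the extent you form for both calls needs to have its first dimension in elements (because one of the allocations in the cudaMemcpy3D is an array). That would also explain the “invalid argument” error: an extent of SIZE_X*sizeof(VolumeType) asks for a copy four times wider than the array.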
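The unit rule quoted above can be written down as a tiny host-side helper. This is only a sketch: `extentWidth` is a hypothetical name, not part of the CUDA runtime, and it simply returns the number you would pass as the first argument of make_cudaExtent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a CUDA API): value to pass as the first argument
// of make_cudaExtent for a row of `elems` elements of `elemSize` bytes each.
//  - A CUDA array is involved (cudaMalloc3DArray, or a cudaMemcpy3D to or
//    from an array): the width is counted in elements.
//  - Only linear memory is involved (for example cudaMalloc3D): the width is
//    counted in bytes, i.e. elements of unsigned char.
inline std::size_t extentWidth(std::size_t elems, std::size_t elemSize,
                               bool cudaArrayInvolved)
{
    assert(elemSize > 0);
    return cudaArrayInvolved ? elems : elems * elemSize;
}
```

For the 128-wide float volume in the question this gives 128 for cudaMalloc3DArray and for the copy into the array, and 512 for a cudaMalloc3D allocation. With a one-byte element type such as char both cases give the same number, which is why the SDK samples can pass SIZE_X everywhere.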

    But you potentially have a different problem in your code, because you are allocating the linear source memory d_volumeMem using cudaMalloc. cudaMemcpy3D expects that linear source memory has been allocated with a compatible pitch. Your code is just using a linear allocation of size

    SIZE_X*SIZE_Y*SIZE_Z*sizeof(float)
    

    Now it might be that the dimensions you have chosen produce a compatible pitch for the hardware you are using, but that is not guaranteed. I would recommend using cudaMalloc3D to allocate the linear source memory as well. An expanded demonstration of this built around your little code snippet might look like this:

    #include <cstdio>
    #include <cstdlib>   // malloc, exit
    
    typedef float  VolumeType;
    
    const size_t SIZE_X = 8;
    const size_t SIZE_Y = 8;
    const size_t SIZE_Z = 8;
    const size_t width = sizeof(VolumeType) * SIZE_X;
    
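    // Note: texture references are a legacy API that was removed in CUDA 12,
    // so this listing needs an older toolkit to build as written.
    texture<VolumeType, cudaTextureType3D, cudaReadModeElementType> tex; 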
    
    __global__ void testKernel(VolumeType * output, int dimx, int dimy, int dimz)
    {
        int tidx = threadIdx.x + blockIdx.x * blockDim.x;
        int tidy = threadIdx.y + blockIdx.y * blockDim.y;
        int tidz = threadIdx.z + blockIdx.z * blockDim.z;
    
        float x = float(tidx)+0.5f;
        float y = float(tidy)+0.5f;
        float z = float(tidz)+0.5f;
    
        size_t oidx = tidx + tidy*dimx + tidz*dimx*dimy;
        output[oidx] = tex3D(tex, x, y, z);
    }
    
    inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
    {
       if (code != cudaSuccess) 
       {
          fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
          if (abort) exit(code);
       }
    }
    
    #define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
    
    template<typename T>
    void init(char * devPtr, size_t pitch, int width, int height, int depth)
    {
        size_t slicePitch = pitch * height;
        int v = 0;
        for (int z = 0; z < depth; ++z) {
            char * slice = devPtr + z * slicePitch;
            for (int y = 0; y < height; ++y) {
                T * row = (T *)(slice + y * pitch);
                for (int x = 0; x < width; ++x) {
                    row[x] = T(v++);
                }
            }
        }
    }
    
    int main(void)
    {
        VolumeType *h_volumeMem, *d_output, *h_output;
    
        cudaExtent volumeSizeBytes = make_cudaExtent(width, SIZE_Y, SIZE_Z);
        cudaPitchedPtr d_volumeMem; 
        gpuErrchk(cudaMalloc3D(&d_volumeMem, volumeSizeBytes));
    
        size_t size = d_volumeMem.pitch * SIZE_Y * SIZE_Z;
        h_volumeMem = (VolumeType *)malloc(size);
        init<VolumeType>((char *)h_volumeMem, d_volumeMem.pitch, SIZE_X, SIZE_Y, SIZE_Z);
        gpuErrchk(cudaMemcpy(d_volumeMem.ptr, h_volumeMem, size, cudaMemcpyHostToDevice));
    
        cudaArray * d_volumeArray;
        cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<VolumeType>();
        cudaExtent volumeSize = make_cudaExtent(SIZE_X, SIZE_Y, SIZE_Z);
        gpuErrchk( cudaMalloc3DArray(&d_volumeArray, &channelDesc, volumeSize) ); 
    
        cudaMemcpy3DParms copyParams = {0};
        copyParams.srcPtr = d_volumeMem;
        copyParams.dstArray = d_volumeArray;
        copyParams.extent = volumeSize;
        copyParams.kind = cudaMemcpyDeviceToDevice;
        gpuErrchk( cudaMemcpy3D(&copyParams) ); 
    
        tex.normalized = false;                      
        tex.filterMode = cudaFilterModeLinear;      
        tex.addressMode[0] = cudaAddressModeWrap;   
        tex.addressMode[1] = cudaAddressModeWrap;
        tex.addressMode[2] = cudaAddressModeWrap;
        gpuErrchk(cudaBindTextureToArray(tex, d_volumeArray, channelDesc));
    
        size_t osize = 64 * sizeof(VolumeType);
        gpuErrchk(cudaMalloc((void**)&d_output, osize));
    
        testKernel<<<1,dim3(4,4,4)>>>(d_output,4,4,4);
        gpuErrchk(cudaPeekAtLastError());
    
        h_output = (VolumeType *)malloc(osize);
        gpuErrchk(cudaMemcpy(h_output, d_output, osize, cudaMemcpyDeviceToHost));
    
        for(int i=0; i<64; i++)
            fprintf(stdout, "%d %f\n", i, h_output[i]);
    
        return 0;
    }
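The pitch handling in the listing above comes down to one piece of address arithmetic, sketched here in plain host-side C++ so it can be tried without a GPU. The names `pitchedOffset` and `packToPitched` are hypothetical, and the pitch of 512 bytes in the usage note is an assumed value: the real pitch is whatever cudaMalloc3D returns for your hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Byte offset of element (x, y, z) in a pitched 3D allocation: rows are
// `pitch` bytes apart and slices are `pitch * height` bytes apart.
inline std::size_t pitchedOffset(std::size_t x, std::size_t y, std::size_t z,
                                 std::size_t pitch, std::size_t height,
                                 std::size_t elemSize)
{
    assert(pitch >= elemSize);
    return (z * height + y) * pitch + x * elemSize;
}

// Copy a tightly packed float volume (x fastest, then y, then z) into a
// pitched buffer, one row at a time. Padding bytes are left as zero.
inline std::vector<unsigned char> packToPitched(const std::vector<float>& packed,
                                                std::size_t width, std::size_t height,
                                                std::size_t depth, std::size_t pitch)
{
    assert(pitch >= width * sizeof(float));
    assert(packed.size() == width * height * depth);
    std::vector<unsigned char> pitched(pitch * height * depth, 0);
    for (std::size_t z = 0; z < depth; ++z) {
        for (std::size_t y = 0; y < height; ++y) {
            std::memcpy(pitched.data() + pitchedOffset(0, y, z, pitch, height, sizeof(float)),
                        packed.data() + (z * height + y) * width,
                        width * sizeof(float));
        }
    }
    return pitched;
}
```

With an 8x8x8 float volume and an assumed pitch of 512 bytes, consecutive rows sit 512 bytes apart even though each row holds only 32 bytes of data. A plain cudaMalloc buffer corresponds to the special case where the pitch equals width * sizeof(float), which is the value the question passes to make_cudaPitchedPtr.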
    

    You can confirm for yourself that the output of the texture reads in this program matches the original source memory on the host.
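One way to do that check is to compute the value each output slot should hold. The sketch below assumes the sizes used in the listing (an 8x8x8 volume filled with 0, 1, 2, ... in x-fastest order, read back by a 4x4x4 thread block at texel centres); `expectedOutput` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Expected value of h_output[i] for the demonstration program, assuming an
// 8x8x8 source volume initialised with consecutive integers (x fastest) and
// a 4x4x4 block of threads that each read their own texel.
inline float expectedOutput(std::size_t i)
{
    const std::size_t kernelDim = 4;  // dimx = dimy = dimz passed to the kernel
    const std::size_t volumeDim = 8;  // SIZE_X = SIZE_Y = SIZE_Z
    assert(i < kernelDim * kernelDim * kernelDim);
    const std::size_t x = i % kernelDim;
    const std::size_t y = (i / kernelDim) % kernelDim;
    const std::size_t z = i / (kernelDim * kernelDim);
    return float(x + y * volumeDim + z * volumeDim * volumeDim);
}
```

Comparing each printed h_output[i] against expectedOutput(i) shows whether the copy into the array preserved the layout. A wrong extent or pitch would typically show up as values that drift out of step from the second row onwards.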
