The Archive Base

Editorial Team
Asked: May 19, 2026

I am trying to figure out how to use a CUDA kernel as part

I am trying to figure out how to use a CUDA kernel as part of a library, so that I can add the library to my existing C++ source files and call the CUDA kernel from them.

So how do you go about doing this? I tried to create a wrapper, like so:

.h file:

#ifndef __reductions2d_H_
#define __reductions2d_H_
#include <stdio.h>
#include <cuda.h>
#include <cuda_runtime.h>
extern "C" void getMean_wrapper();
#endif

.cu

__global__ void getMean(float *devDataPtr, size_t pitch, int rows, int cols)
{
   for (int r = 0; r < rows; ++r)
   {
      // pitch is in bytes, so step between rows through a char* cast
      float* row = (float*)((char*)devDataPtr + r * pitch);
      for (int c = 0; c < cols; ++c)
      {
         printf("Row[%i][%i]: %4.3f \n", r, c, row[c]);
      }
   }
}


void getMean_wrapper()
{
   // Host code
   int width = 3, height = 3;
   int N = width*height;
   float* devData; size_t pitch;
   cudaMallocPitch(&devData, &pitch, width * sizeof(float), height);
   int blockSize = 4;
   int nBlocks = N/blockSize + (N%blockSize == 0?0:1);

   // the kernel takes (rows, cols), i.e. (height, width)
   getMean<<<nBlocks, blockSize>>>(devData, pitch, height, width);
   cudaDeviceSynchronize();   // wait for the kernel so its printf output is flushed
   cudaFree(devData);
}

main.cpp

#include "reductions2d.h"

int main(void){

getMean_wrapper();
return 0;

}

However, when I compile this with nvcc *.cpp, it tells me it can't find getMean_wrapper(), and when I try to compile with just g++ -c main.cpp, it tells me it can't find cuda.h and cuda_runtime.h.

Is the best approach to specify the location of the CUDA libraries on my g++ command line, build those objects, build the .cu objects, and then link them? It seems like a hassle to need a three-step process just to add some CUDA functionality.

Thanks

Edit:

It seems that when I compile the files individually and then link with

g++ -o runme *.o -lcuda

I get

$ g++ -o runme *.o -lcuda
reductions2d.o: In function `__sti____cudaRegisterAll_47_tmpxft_00007643_00000000_4_reductions2d_cpp1_ii_4ef611a7()':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x15e): undefined reference to `__cudaRegisterFatBinary'
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x1b9): undefined reference to `__cudaRegisterFunction'
reductions2d.o: In function `__cudaUnregisterBinaryUtil()':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x1d8): undefined reference to `__cudaUnregisterFatBinary'
reductions2d.o: In function `__device_stub__Z7getMeanPfmii(float*, unsigned long, int, int)':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x20d): undefined reference to `cudaSetupArgument'
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x22f): undefined reference to `cudaSetupArgument'
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x251): undefined reference to `cudaSetupArgument'
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x273): undefined reference to `cudaSetupArgument'
reductions2d.o: In function `getMean_wrapper':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text+0x164c): undefined reference to `cudaConfigureCall'
reductions2d.o: In function `cudaError cudaLaunch<char>(char*)':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text._Z10cudaLaunchIcE9cudaErrorPT_[cudaError cudaLaunch<char>(char*)]+0x11): undefined reference to `cudaLaunch'
reductions2d.o: In function `cudaError cudaMallocPitch<float>(float**, unsigned long*, unsigned long, unsigned long)':
tmpxft_00007643_00000000-1_reductions2d.cudafe1.cpp:(.text._Z15cudaMallocPitchIfE9cudaErrorPPT_Pmmm[cudaError cudaMallocPitch<float>(float**, unsigned long*, unsigned long, unsigned long)]+0x29): undefined reference to `cudaMallocPitch'

I read that I need to include the CUDA runtime libraries, so I ran

ldconfig -p | grep cudart and added /usr/local/cuda/lib64 to my LD_LIBRARY_PATH, but it still can't find cudart.

1 Answer

  1. Editorial Team
    Added an answer on May 19, 2026 at 12:25 pm

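    Include the .h in the .cu file as well. nvcc compiles .cu files as C++ and mangles names, so the wrapper definition only gets the unmangled name that main.cpp expects if the extern "C" declaration is visible when the .cu file is compiled.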

    Compile as:

    nvcc -c file.cu        # compile the CUDA kernel
    nvcc file.o main.cpp   # compile main.cpp and link
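The name-mangling point above can be sketched without any CUDA at all. In C++, once an extern "C" declaration has been seen, a later definition without a linkage specifier keeps C linkage, which is why including the header in the .cu file makes the symbols agree. The name wrapper_demo and its int signature are hypothetical, chosen only so the snippet builds with a plain C++ compiler.

```cpp
// Stand-in for the shared header: the declaration main.cpp sees.
// (Hypothetical name and signature, for illustration only.)
extern "C" int wrapper_demo(int rows, int cols);

// Stand-in for the .cu file, which includes the header above.
// No extern "C" is repeated here: the earlier declaration already fixed
// C linkage, so the symbol is emitted unmangled as `wrapper_demo`.
int wrapper_demo(int rows, int cols)
{
    return rows * cols;  // placeholder body: number of elements
}
```

If the definition were compiled without the declaration in scope, it would get C++ linkage and a mangled symbol instead (on GCC-style toolchains, something like _Z12wrapper_demoii), and a caller expecting the C name would fail to link.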
    

    I would change your code as follows:

    .hpp

    #ifndef __reductions2d_H_
    #define __reductions2d_H_
    
    void getMean_wrapper(); // C++ linkage
    
    #endif
    

    .cu:

    #include "...hpp"
    #include <cuda.h>
    #include <cuda_runtime.h>
    
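    __global__ void getMean(float *devDataPtr, size_t pitch, int rows, int cols)
       {
          // ... kernel body as in the question ...
       }

    // getMean_wrapper() follows, defined as in the question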
    

    You can then compile it as:

    nvcc -c file.cu                                        # compile the CUDA kernel
    g++ file.o main.cpp -L/usr/local/cuda/lib64 -lcudart   # no CUDA needed here except the runtime library; adjust -L to your install
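The reason g++ can build main.cpp on its own after this change is that the header no longer pulls in cuda.h or cuda_runtime.h; it exposes only a plain C++ declaration. Below is a minimal host-only sketch of that split. The name getMean_host, its data-carrying signature, and the CPU loop standing in for the kernel launch are all assumptions for illustration; in the real library the definition would live in the .cu file and launch the kernel.

```cpp
#include <cstddef>

// --- What would live in the CUDA-free header (hypothetical signature) ---
// Plain C++ only, so any file that includes it compiles with g++.
float getMean_host(const float* data, std::size_t rows, std::size_t cols);

// --- What would live in the .cu file ---
// ASSUMPTION: a CPU loop stands in for the device allocation and kernel
// launch, so that this sketch builds without the CUDA toolkit.
float getMean_host(const float* data, std::size_t rows, std::size_t cols)
{
    const std::size_t n = rows * cols;
    if (data == nullptr || n == 0) {
        return 0.0f;  // nothing to average
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return static_cast<float>(sum / static_cast<double>(n));
}
```

Callers such as main.cpp depend only on the declaration, so the implementation can change from a CPU loop to a kernel launch without touching or recompiling them with nvcc.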
    