Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8953251
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 15, 20262026-06-15T14:03:34+00:00 2026-06-15T14:03:34+00:00

I am using CUDA 5.0 and a Compute Capability 2.1 card. The question is

  • 0

I am using CUDA 5.0 and a Compute Capability 2.1 card.

The question is quite straightforward: Can a kernel be part of a class?
For example:

class Foo
{
private:
 //...
public:
 __global__ void kernel();
};

__global__ void Foo::kernel()
{
 //implementation here
}

If not then the solution is to make a wrapper function that is member of the class and calls the kernel internally?

And if yes, then will it have access to the private attributes as a normal private function?

(I’m not just trying it and see what happens because my project has several other errors right now and also I think it’s a good reference question. It was difficult for me to find reference for using CUDA with C++. Basic functionality examples can be found but not strategies for structured code.)

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-15T14:03:35+00:00Added an answer on June 15, 2026 at 2:03 pm

    Let me leave cuda dynamic parallelism out of the discussion for the moment (i.e. assume compute capability 3.0 or prior).

    remember __ global__ is used for cuda functions that will (only) be called from the host (but execute on the device). If you instantiate this object on the device, it won’t work. Furthermore, to get device-accessible private data to be available to the member function, the object would have to be instantiated on the device.

    So you could have a kernel invocation (ie. mykernel<<<blocks,threads>>>(...); embedded in a host object member function, but the kernel definition (i.e. the function definition with the __ global__ decorator) would normally precede the object definition in your source code. And as stated already, such a methodology could not be used for an object instantiated on the device. It would also not have access to ordinary private data defined elsewhere in the object. (It may be possible to come up with a scheme for a host-only object that does create device data, using pointers in global memory, that would then be accessible on the device, but such a scheme seems quite convoluted to me at first glance).

    Normally, device-usable member functions would be preceded by the __ device__ decorator. In this case, all the code in the device member function executes from within the thread that called it.

    This question gives an example (in my edited answer) of a C++ object with a member function callable from both the host and the device, with appropriate data copying between host and device objects.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I am using tesla k20 with compute capability 35 on Linux with CUDA 5.With
I am writing a CUDA kernel in which I'm using the string data type
using this http://bl.ocks.org/950642 we can see how to add images to nodes, the question
I'm using the CUDA Occupancy calculator to try to optimize my CUDA kernel. Currently
Using different streams for CUDA kernels makes concurrent kernel execution possible. Therefore n kernels
I am using Tesla C2050, which has a compute capability 2.0 and has 48KB
I am using a Tesla C1060 with 1.3 compute capability and nvcc compiler driver
Im looking at this implementation of DCT using cuda: http://www.cse.nd.edu/courses/cse60881/www/source_code/dct8x8/dct8x8_kernel1.cu The part in question
I am writing a massively parallel GPU application using CUDA. I have been optimizing
I'm using the CUDA .rules file which comes with the CUDA SDK for custom

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.