In CUDA, combining __device__ and __host__ allows a function to be called from both the device and the host.
My question is: is there any example where using both is really preferable to just defining __device__ or __host__?
The canonical example is using C++ classes in CUDA. In the CUDA C++ model, every member function of a class must be defined in both host and device code if that class is to be instantiated in both the device and host memory spaces.
The simplest possible case would be a trivial class:
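The listing that belonged here is not preserved. As a minimal sketch only, assuming a hypothetical class named Foo with an explicit constructor and no execution-space qualifiers, such a class might look like this:

```cpp
// Hypothetical example class; the name Foo and its member are assumptions,
// not the original listing. With no __host__/__device__ qualifiers, an
// explicitly defined constructor is compiled for the host only.
class Foo {
public:
    Foo(int x) : value(x) {}

    int value;
};
```

On the host this behaves like any ordinary C++ class; the trouble only shows up when device code tries to construct it.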
It is not possible to use this class in CUDA device code as written. You must define the constructor for both device and host code, so:
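The corrected listing is also missing, so the following is a sketch under the same assumptions (the class name Foo, the accessor get, and the kernel fill are all hypothetical). It marks every member function with both qualifiers so the class can be constructed on either side. The macro fallback at the top is an extra convenience, not something from the original answer: it lets the same code build with a plain C++ compiler when nvcc is not in use.

```cpp
// When a plain C++ compiler (not nvcc) sees this code, make the CUDA
// qualifiers expand to nothing. This fallback is an assumption added
// for portability.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical class: every member function carries both qualifiers,
// so it is compiled once for the host and once for the device.
class Foo {
public:
    __host__ __device__ Foo(int x) : value(x) {}
    __host__ __device__ int get() const { return value; }

    int value;
};

#ifdef __CUDACC__
// Device-side use (only compiled by nvcc): each thread constructs a Foo
// from its thread index and writes the stored value back out.
__global__ void fill(int* out) {
    Foo f(threadIdx.x);
    out[threadIdx.x] = f.get();
}
#endif
```

The point of the combined qualifiers here is that one definition serves both sides: host code can build a Foo to prepare or check data, and a kernel can build the same type without a duplicated device-only copy of the class.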