Is it possible to access the DX11 backbuffer directly through CUDA? Instead of copying the data back to the CPU and then sending it back to the device to render as a texture, could I just access the buffer directly in my kernel?
This seems to be possible in DirectCompute, so I wonder if I can do the same in CUDA.
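Yes, it is. Look at the Direct3D Interoperability section (under Graphics Interoperability) in the CUDA Programming Guide. In short, you register a Direct3D 11 resource with CUDA using cudaGraphicsD3D11RegisterResource, map it with cudaGraphicsMapResources before launching your kernel, and unmap it with cudaGraphicsUnmapResources before Direct3D uses it again. The data never leaves the GPU.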
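Here is a rough sketch of what that flow can look like, not a definitive implementation. One caveat: the runtime documentation for cudaGraphicsD3D11RegisterResource notes that the primary render target may not be registered, so this sketch assumes you register a separate ID3D11Texture2D (D3D11_USAGE_DEFAULT, assumed here to be DXGI_FORMAT_R8G8B8A8_UNORM) and copy or draw it into the backbuffer on the Direct3D side. The names fillGradient, registerTexture, and drawWithCuda are hypothetical helpers made up for illustration, and the code assumes the D3D11 device was created on a CUDA-capable adapter.

```cuda
#include <d3d11.h>
#include <cuda_runtime.h>
#include <cuda_d3d11_interop.h>

// Hypothetical kernel: writes a simple gradient into an RGBA8 surface.
__global__ void fillGradient(cudaSurfaceObject_t surf, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    uchar4 px = make_uchar4((unsigned char)(255 * x / width),
                            (unsigned char)(255 * y / height), 0, 255);
    // surf2Dwrite expects the x coordinate in bytes, not in pixels.
    surf2Dwrite(px, surf, x * (int)sizeof(uchar4), y);
}

// One-time setup: register the D3D11 texture with CUDA.
// Assumption: 'tex' is an off-screen texture, not the swap chain's buffer.
cudaError_t registerTexture(ID3D11Texture2D* tex, cudaGraphicsResource_t* res)
{
    return cudaGraphicsD3D11RegisterResource(
        res, tex, cudaGraphicsRegisterFlagsSurfaceLoadStore);
}

// Per frame: map, run the kernel on the texture, unmap.
cudaError_t drawWithCuda(cudaGraphicsResource_t res, int width, int height,
                         cudaStream_t stream)
{
    cudaError_t err = cudaGraphicsMapResources(1, &res, stream);
    if (err != cudaSuccess) return err;

    // Array slice 0, mip level 0 of the registered texture.
    cudaArray_t array = nullptr;
    err = cudaGraphicsSubResourceGetMappedArray(&array, res, 0, 0);

    cudaSurfaceObject_t surf = 0;
    if (err == cudaSuccess) {
        cudaResourceDesc desc = {};
        desc.resType = cudaResourceTypeArray;
        desc.res.array.array = array;
        err = cudaCreateSurfaceObject(&surf, &desc);
    }

    if (err == cudaSuccess) {
        dim3 block(16, 16);
        dim3 grid((width + block.x - 1) / block.x,
                  (height + block.y - 1) / block.y);
        fillGradient<<<grid, block, 0, stream>>>(surf, width, height);
        err = cudaGetLastError();

        // Conservative: wait for the kernel before destroying the surface.
        cudaStreamSynchronize(stream);
        cudaDestroySurfaceObject(surf);
    }

    // Always unmap so Direct3D can use the texture again.
    cudaError_t unmapErr = cudaGraphicsUnmapResources(1, &res, stream);
    return (err != cudaSuccess) ? err : unmapErr;
}
```

After drawWithCuda returns, the texture can be moved into the backbuffer entirely on the GPU, for example with ID3D11DeviceContext::CopyResource (if sizes and formats are compatible) or by drawing a full-screen quad that samples it. Call cudaGraphicsUnregisterResource on the resource at shutdown.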