I am using Tesla C2050, which has a compute capability 2.0 and has 48KB

Question


Asked: May 28, 2026


I am using a Tesla C2050, which has compute capability 2.0 and 48 KB of shared memory. But when I try to use this shared memory, the nvcc compiler gives me the following error:

Entry function '_Z4SAT3PhPdii' uses too much shared data (0x8020 bytes + 0x10 bytes system, 0x4000 max)

SAT1 is the naive implementation of a scan algorithm, and because I am operating on image sizes of the order of 4096x2160, I have to use double to calculate the cumulative sum. As far as I can tell, the Tesla C2050 does not support double, but the compiler nevertheless handles it by demoting double to float. For an image width of 4096, the shared memory size comes out greater than 16 KB, but it is well within the 48 KB limit.

Can anybody help me understand what is happening here? I am using CUDA Toolkit 3.0.


1 Answer

Editorial Team · Answer 1 · 2026-05-28T19:14:26+00:00

By default, Fermi cards run in a compatibility mode with 16 KB of shared memory and 48 KB of L1 cache per multiprocessor. The API call cudaThreadSetCacheConfig can be used to switch the GPU to 48 KB of shared memory and 16 KB of L1 cache if you require it. You must also compile the code for compute capability 2.0 to avoid the code generation error you are seeing: the 0x4000 (16 KB) maximum in that message is the limit the compiler applies when it is not targeting compute capability 2.0.

Also, your Tesla C2050 does support double precision. If you are getting compiler warnings about demoting doubles, it means you are not compiling your code for the correct architecture. Add

--arch=sm_20

to your nvcc arguments and the GPU toolchain will compile for your Fermi card, including double precision support and other Fermi-specific hardware features such as the larger shared memory size.
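A minimal sketch of the two steps described in this answer, under some assumptions: the kernel rowPrefixSum, the ROW_WIDTH and kRowBytes constants, and the CUDA_CHECK macro are hypothetical names made up for illustration, not the asker's SAT code. It also uses cudaFuncSetCacheConfig, the per-kernel counterpart of the cudaThreadSetCacheConfig call mentioned above, as a substitution. The kernel statically declares 4096 doubles (32768 bytes) of shared memory, which is over the 16 KB limit but within 48 KB, so it should only build when compiled for compute capability 2.0 or higher.

```cuda
// Build (assumption: a toolkit that still supports sm_20, as CUDA 3.x does):
//   nvcc -arch=sm_20 shared48.cu -o shared48
// On newer toolkits, substitute the architecture of your own GPU.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define ROW_WIDTH 4096  // image width taken from the question

// 4096 doubles * 8 bytes = 32768 bytes: more than 16 KB, less than 48 KB.
static const size_t kRowBytes = ROW_WIDTH * sizeof(double);

// Hypothetical helper: abort with a message if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel (not the asker's SAT1/SAT3): stages one image row in
// shared memory as doubles and computes a deliberately naive prefix sum.
__global__ void rowPrefixSum(const unsigned char *in, double *out, int width)
{
    __shared__ double row[ROW_WIDTH];  // 32 KB of static shared memory

    for (int i = threadIdx.x; i < width; i += blockDim.x)
        row[i] = (double)in[i];
    __syncthreads();

    // Serial scan on one thread: slow, but simple to follow.
    if (threadIdx.x == 0)
        for (int i = 1; i < width; ++i)
            row[i] += row[i - 1];
    __syncthreads();

    for (int i = threadIdx.x; i < width; i += blockDim.x)
        out[i] = row[i];
}

int main()
{
    // State a preference for 48 KB shared memory / 16 KB L1 for this kernel.
    CUDA_CHECK(cudaFuncSetCacheConfig(rowPrefixSum, cudaFuncCachePreferShared));

    unsigned char *hIn = (unsigned char *)malloc(ROW_WIDTH);
    double *hOut = (double *)malloc(kRowBytes);
    for (int i = 0; i < ROW_WIDTH; ++i)
        hIn[i] = 1;  // every pixel is 1, so prefix sum i should equal i + 1

    unsigned char *dIn = NULL;
    double *dOut = NULL;
    CUDA_CHECK(cudaMalloc((void **)&dIn, ROW_WIDTH));
    CUDA_CHECK(cudaMalloc((void **)&dOut, kRowBytes));
    CUDA_CHECK(cudaMemcpy(dIn, hIn, ROW_WIDTH, cudaMemcpyHostToDevice));

    rowPrefixSum<<<1, 256>>>(dIn, dOut, ROW_WIDTH);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(hOut, dOut, kRowBytes, cudaMemcpyDeviceToHost));

    printf("shared bytes per block: %lu\n", (unsigned long)kRowBytes);
    printf("last prefix sum: %.1f\n", hOut[ROW_WIDTH - 1]);

    CUDA_CHECK(cudaFree(dIn));
    CUDA_CHECK(cudaFree(dOut));
    free(hIn);
    free(hOut);
    return 0;
}
```

The cache configuration call only states a preference, so the architecture flag is the part that addresses the compile-time error quoted in the question.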
