I am reading the NVIDIA white paper titled Particle Simulation with CUDA by Simon


Editorial Team

Asked: May 28, 2026


I am reading the NVIDIA white paper titled Particle Simulation with CUDA by Simon Green.

It describes the SDK particles example and the algorithms used.

While discussing the performance of the code, the author says that the global memory arrays holding the positions and velocities of the particles are “bound” to textures.

I am very confused by the concept of texture memory. The NVIDIA CUDA programming guide goes through some really gory and difficult explanations without any examples.

Hence I have 2 questions:

1. Can someone give me, or refer me to, a really simple (“texture memory for dummies”) example of how a texture is used and how it improves performance?
2. The CUDA programming guide 4.0 says on page 40 that “A texture can be any region of linear memory or a CUDA array”. If, as is said, texture memory gives better performance than global memory, why not “bind” the entire global memory to texture memory?


1 Answer

Editorial Team · Answer 1 · 2026-05-28T03:15:00+00:00

The CUDA SDK contains a straightforward example, simpleTexture, which demonstrates performing a trivial 2D coordinate transformation using a texture.

The first thing to keep in mind is that texture memory is global memory. The only difference is that textures are read through a dedicated read-only cache, and that the cache includes hardware filtering, which can perform linear floating-point interpolation as part of the read. The cache differs from a conventional cache in that it is optimised for spatial locality (in the coordinate system of the texture) rather than locality in memory. For some applications this is ideal and gives a performance advantage, both from the cache and from the free FLOPs you get from the filtering hardware. For others it does not, and textures can be slower: a read that misses in the texture cache pays the miss penalty on top of the global memory read, and the interpolation hardware goes unused.
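To make the filtering point concrete, here is a minimal sketch of a read that is interpolated by the texture unit. It is not taken from simpleTexture or the white paper. It assumes the texture object API (`cudaCreateTextureObject`, CUDA 5.0 or later) rather than the older texture reference "bind" calls that the 4.0 guide describes, and the names `sampleRow` and `linearFilterDemo` are made up for illustration. The data sits in a one-row 2D CUDA array, and texel i is centred at x = i + 0.5 in unnormalized coordinates.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Each thread samples the texture at a fractional x coordinate. With linear
// filtering enabled, the texture unit blends the two nearest texels itself.
__global__ void sampleRow(cudaTextureObject_t tex, const float* xs, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = tex2D<float>(tex, xs[i], 0.5f);  // y = 0.5 is the centre of row 0
    }
}

// Hypothetical helper: returns true if the filtered reads match the expected blends.
bool linearFilterDemo()
{
    const int width = 4, n = 3;
    const float texels[width] = {0.0f, 10.0f, 20.0f, 30.0f};
    const float xs[n]         = {0.5f, 1.0f, 2.0f};
    // x = 0.5 hits texel 0 exactly; x = 1.0 and x = 2.0 fall halfway between texels.
    const float expected[n]   = {0.0f, 5.0f, 15.0f};

    // Copy the data into a CUDA array (width 4, height 1).
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    if (cudaMallocArray(&arr, &desc, width, 1) != cudaSuccess) return false;
    cudaMemcpy2DToArray(arr, 0, 0, texels, width * sizeof(float),
                        width * sizeof(float), 1, cudaMemcpyHostToDevice);

    // Describe the resource backing the texture and how it should be sampled.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0]   = cudaAddressModeClamp;
    td.addressMode[1]   = cudaAddressModeClamp;
    td.filterMode       = cudaFilterModeLinear;     // hardware interpolation
    td.readMode         = cudaReadModeElementType;  // return float values as stored
    td.normalizedCoords = 0;                        // coordinates in texel units

    cudaTextureObject_t tex = 0;
    if (cudaCreateTextureObject(&tex, &res, &td, nullptr) != cudaSuccess) {
        cudaFreeArray(arr);
        return false;
    }

    float *dXs = nullptr, *dOut = nullptr;
    cudaMalloc(&dXs, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dXs, xs, n * sizeof(float), cudaMemcpyHostToDevice);

    sampleRow<<<1, 32>>>(tex, dXs, dOut, n);

    float out[n] = {};
    bool ok = cudaMemcpy(out, dOut, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; i < n; ++i) {
        printf("x = %.2f -> %.3f\n", xs[i], out[i]);
        ok = ok && std::fabs(out[i] - expected[i]) < 1e-3f;
    }

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(dXs);
    cudaFree(dOut);
    return ok;
}
```

Calling `linearFilterDemo()` from a `main` function and compiling with `nvcc` should be enough to try it. The kernel never computes a blend; the interpolation happens inside the texture read, which is the "free FLOPs" mentioned above. Note that the filtering hardware uses low-precision fixed-point weights, so it is suited to smooth lookups rather than exact arithmetic.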

So something like a particle simulation can benefit from textures because calculations are generally performed in cells or control volumes where local interactions are considered, and neighbouring particles need to access each other's velocities and accelerations. A spatially local cache works better for this than a simple linear memory cache. For other applications, there is no intrinsic spatial locality in the memory access pattern, and textures provide little or no benefit over conventional cached memory.
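As a rough illustration of reading an ordinary global memory array through the texture path, the sketch below wraps a `cudaMalloc` buffer of `float4` positions in a texture object and fetches neighbouring entries with `tex1Dfetch`. This is not the code from the particles sample: the kernel `averageNeighbours`, the helper `neighbourAverageDemo`, and the "average of the two adjacent entries" computation are invented for the example, and it again assumes the texture object API instead of the texture reference binding used in the white paper's era.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread reads the positions of its two neighbours through the texture
// cache and writes their midpoint. Indices are clamped at the array ends.
__global__ void averageNeighbours(cudaTextureObject_t posTex, float4* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int left  = max(i - 1, 0);
    int right = min(i + 1, n - 1);
    float4 a = tex1Dfetch<float4>(posTex, left);   // integer index, no filtering
    float4 b = tex1Dfetch<float4>(posTex, right);
    out[i] = make_float4(0.5f * (a.x + b.x), 0.5f * (a.y + b.y),
                         0.5f * (a.z + b.z), 1.0f);
}

// Hypothetical helper: returns true if the GPU result matches a host check.
bool neighbourAverageDemo()
{
    const int n = 1024;
    std::vector<float4> pos(n), result(n);
    for (int i = 0; i < n; ++i) {
        pos[i] = make_float4(1.0f * i, 2.0f * i, 3.0f * i, 1.0f);
    }

    float4 *dPos = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dPos, n * sizeof(float4)) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, n * sizeof(float4)) != cudaSuccess) {
        cudaFree(dPos);
        return false;
    }
    cudaMemcpy(dPos, pos.data(), n * sizeof(float4), cudaMemcpyHostToDevice);

    // The texture is only a view onto dPos; no data is copied here.
    cudaResourceDesc res = {};
    res.resType                = cudaResourceTypeLinear;
    res.res.linear.devPtr      = dPos;
    res.res.linear.desc        = cudaCreateChannelDesc<float4>();
    res.res.linear.sizeInBytes = n * sizeof(float4);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t posTex = 0;
    bool ok = cudaCreateTextureObject(&posTex, &res, &td, nullptr) == cudaSuccess;
    if (ok) {
        averageNeighbours<<<(n + 255) / 256, 256>>>(posTex, dOut, n);
        ok = cudaMemcpy(result.data(), dOut, n * sizeof(float4),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i) {
            int left  = (i > 0) ? i - 1 : 0;
            int right = (i < n - 1) ? i + 1 : n - 1;
            ok = std::fabs(result[i].x - 0.5f * (left + right)) < 1e-4f;
        }
        cudaDestroyTextureObject(posTex);
    }

    cudaFree(dPos);
    cudaFree(dOut);
    return ok;
}
```

The sketch only shows the mechanics. Whether the texture path is actually faster than a plain global load depends on the access pattern and the GPU generation, so it is worth timing both versions on your own hardware before committing to one.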
