I’ve been tuning my game’s renderer for my laptop, which has a Radeon HD 3850. This chip has a decent amount of processing power, but rather limited memory bandwidth, so I’ve been trying to move more shader work into fewer passes.
Previously, I was using a simple multipass model:
- Bind and clear FP16 blend buffer (with depth buffer)
  - Depth-only pass
  - For each light, do an additive light pass
- Bind backbuffer, use blend buffer as a texture
  - Tone mapping pass
In an attempt to improve the performance of this method, I wrote a new rendering path that counts the number and type of lights to dynamically build custom GLSL shaders. These shaders accept all light parameters as uniforms and do all lighting in a single pass. I was expecting to run into some kind of limit, so I tested it first with one light. Then three. Then twenty-one, with no errors or artifacts, and with great performance. This leads me to my actual questions:
Is the maximum number of uniforms retrievable?
Is this method viable on older hardware, or are uniforms much more limited?
If I push it too far, at what point will I get an error? Shader compilation? Program linking? Using the program?
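For reference, the kind of shader generation described above could look roughly like the following. This is only a minimal sketch, not the actual renderer code: the function name build_light_shader, the uniform names lightPos and lightColor, and the varyings vPos and vNormal are all made up for illustration, and it handles only simple diffuse point lights rather than several light types.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a GLSL 1.20 fragment shader that sums
 * `num_lights` diffuse point lights in a single pass.
 * Returns the length of the generated source, or -1 if `num_lights`
 * is less than 1 or the source does not fit in `cap` bytes. */
int build_light_shader(char *out, size_t cap, int num_lights)
{
    if (num_lights < 1)
        return -1;

    /* The light count is baked in as a compile-time constant so the
     * uniform arrays are sized exactly for this scene. */
    int n = snprintf(out, cap,
        "#version 120\n"
        "#define NUM_LIGHTS %d\n"
        "uniform vec3 lightPos[NUM_LIGHTS];\n"
        "uniform vec3 lightColor[NUM_LIGHTS];\n"
        "varying vec3 vPos;\n"
        "varying vec3 vNormal;\n"
        "void main() {\n"
        "    vec3 n = normalize(vNormal);\n"
        "    vec3 sum = vec3(0.0);\n"
        "    for (int i = 0; i < NUM_LIGHTS; ++i) {\n"
        "        vec3 l = normalize(lightPos[i] - vPos);\n"
        "        sum += lightColor[i] * max(dot(n, l), 0.0);\n"
        "    }\n"
        "    gl_FragColor = vec4(sum, 1.0);\n"
        "}\n",
        num_lights);

    /* snprintf reports the length it wanted to write; treat truncation
     * as failure rather than handing a cut-off shader to the driver. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The resulting string would then be passed to the usual shader compile and link calls. Each extra light adds two vec3 uniforms here, which is exactly the resource the questions above are about.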
Shader uniforms are typically implemented by the hardware as registers (or sometimes by patching the values directly into the shader microcode, e.g. in NVIDIA fragment shaders). The limit is therefore highly implementation-dependent.
You can retrieve the maximums by querying GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB and GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB with glGetIntegerv, for vertex and fragment shaders respectively.
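Once you have that number, you can make a rough estimate of how many lights a single-pass shader can afford. The sketch below is only an approximation under stated assumptions: the helper max_lights_for_budget is hypothetical, and it assumes every light parameter occupies a full vec4 slot (4 components), which is common but not guaranteed. The max_components argument stands for the value you would get from glGetIntegerv(GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB, &max_components) with a current GL context.

```c
/* Hypothetical helper: estimates how many lights fit in a uniform budget.
 *
 * max_components      - limit queried from the driver, counted in
 *                       individual float components
 * reserved_components - components already used by non-light uniforms
 *                       (matrices, material parameters, ...)
 * vec4s_per_light     - vec4-sized slots one light needs; assumes each
 *                       parameter is padded to a whole vec4
 *
 * Returns 0 when nothing fits or the arguments make no sense. */
int max_lights_for_budget(int max_components,
                          int reserved_components,
                          int vec4s_per_light)
{
    if (vec4s_per_light <= 0)
        return 0;

    int available = max_components - reserved_components;
    if (available <= 0)
        return 0;

    /* 4 components per vec4 slot; integer division rounds down. */
    return available / (4 * vec4s_per_light);
}
```

For example, with a limit of 64 components, 16 of them reserved, and two vec4 slots per light, this gives room for 6 lights. Treat the result as a conservative starting point, since the driver's actual packing may differ.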