The OpenGL specification lies (or is this a bug?)… Referring to the layout for std140, with shared uniform buffers, it states:
“The set of rules shown in Table L-1 are used by the GLSL compiler to
layout members in a std140-qualified uniform block. The offsets of
members in the block are accumulated based on the sizes of the
previous members in the block (those declared before the variable in
question), and the starting offset. The starting offset of the first
member is always zero.
Scalar variable type (bool, int, uint, float) – Size of the scalar in
basic machine types”
(http://www.opengl-redbook.com/appendices/AppL.pdf)
So, armed with this information, I set up a uniform block in my shader that looks something like this:
// Spotlight.
layout (std140) uniform Spotlight
{
    float Light_Intensity;
    vec4 Light_Ambient;
    vec3 Light_Position;
};
… only to discover it doesn’t work with the matching std140 layout I set up on the CPU side. That is, the first 4 bytes are a float (the size of the machine scalar type for GLfloat), the next 16 bytes are a vec4, and the following 12 bytes are a vec3 (with 4 bytes left over on the end to account for the rule that a vec3 is really a vec4).
When I change the CPU side to specify a float as being the same size as a vec4, i.e. 16 bytes, and do my offsets and buffer size making this assumption, the shader works as intended.
So, either the spec is wrong or I’ve misunderstood the meaning of “scalar” in this context, or ATI have a driver bug. Can anyone shed any light on this mystery?
That PDF you linked to is not the OpenGL specification. I don’t know where you got it from, but that is certainly not the full list of rules. Always check your sources; the spec is not as unreadable as many claim it to be.
Yes, the size of a variable of a basic type is the same as the size of the basic machine type (i.e. 4 bytes). But size alone does not determine the position of the variable.
Each type has a base alignment, and no matter where that type is found in a uniform block, its overall byte offset must be a multiple of that alignment. The base alignment of a vec4 is 4 * the alignment of its basic type (i.e. float), so the base alignment of a vec4 is 16.
Because Light_Intensity ends after 4 bytes, the compiler must insert 12 bytes of padding: Light_Ambient cannot sit on a 4-byte boundary. It must be on a 16-byte boundary, so the compiler skips 12 bytes of empty space.
ATI does have a few driver bugs around std140 layout, but this isn’t one of them.
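To make the alignment arithmetic concrete, here is a rough C++ sketch that works out the offsets for the Spotlight block from the question. The helper align_up and all the k-prefixed constants are names made up for this illustration, not anything from OpenGL. Rounding the total block size up to 16 is also an assumption here; the reliable way to get the real size and offsets is to query the driver (GL_UNIFORM_BLOCK_DATA_SIZE, GL_UNIFORM_OFFSET).

```cpp
#include <cstddef>

// Hypothetical helper: round `offset` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// std140 base alignment and size in bytes for the three types used in the block.
constexpr std::size_t kFloatAlign = 4,  kFloatSize = 4;
constexpr std::size_t kVec4Align  = 16, kVec4Size  = 16;
constexpr std::size_t kVec3Align  = 16, kVec3Size  = 12;  // vec3 aligns like a vec4

// Each member starts at the end of the previous one, rounded up to its own alignment.
constexpr std::size_t kIntensityOffset = align_up(0, kFloatAlign);
constexpr std::size_t kAmbientOffset   = align_up(kIntensityOffset + kFloatSize, kVec4Align);
constexpr std::size_t kPositionOffset  = align_up(kAmbientOffset + kVec4Size, kVec3Align);

// Assumption: the buffer is sized to a multiple of 16 bytes.
constexpr std::size_t kBlockSize = align_up(kPositionOffset + kVec3Size, 16);

static_assert(kIntensityOffset == 0,  "Light_Intensity starts the block");
static_assert(kAmbientOffset   == 16, "12 bytes of padding follow the float");
static_assert(kPositionOffset  == 32, "Light_Position follows the vec4 directly");
static_assert(kBlockSize       == 48, "4 trailing bytes pad out the vec3");
```

Note that Light_Ambient lands at 16, not 4, which is why treating the float as if it took 16 bytes happened to work.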
As a general rule, I like to explicitly put padding into my structures, and I avoid vec3 (because it has 16-byte alignment). Doing both generally cuts down on compiler bugs as well as accidental misunderstandings about where things go and how much room they actually take.
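As an illustration of explicit padding, a CPU-side mirror of the Spotlight block might look something like the sketch below. The struct name SpotlightCPU and its member names are invented for this example, and it assumes a 4-byte float. The static_asserts make the compiler check that the C++ layout matches the std140 offsets (0, 16, 32) rather than leaving it to trust.

```cpp
#include <cstddef>

// Hypothetical CPU-side mirror of the std140 Spotlight block, with the
// padding the GLSL compiler inserts written out as explicit members.
struct SpotlightCPU {
    float light_intensity;    // offset 0
    float pad0[3];            // offsets 4..15: padding before the vec4
    float light_ambient[4];   // offset 16
    float light_position[3];  // offset 32
    float pad1;               // offset 44: pads the vec3 out to 16 bytes
};

// Assumes sizeof(float) == 4, as on every common desktop platform.
static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(offsetof(SpotlightCPU, light_intensity) == 0,  "float at 0");
static_assert(offsetof(SpotlightCPU, light_ambient)   == 16, "vec4 at 16");
static_assert(offsetof(SpotlightCPU, light_position)  == 32, "vec3 at 32");
static_assert(sizeof(SpotlightCPU) == 48, "whole block is 48 bytes");
```

A struct like this can then be uploaded in one go with glBufferData or glBufferSubData using sizeof(SpotlightCPU). Declaring Light_Position as a vec4 in the shader would give the same 48-byte layout while sidestepping vec3 entirely.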