I want to use a sub-sampled depth buffer to improve the performance of a program. In my case, it does not matter if artifacts or geometry popping occur.
I have set up my framebuffer like this:
// Color attachment
glBindTexture(GL_TEXTURE_2D, colorAttachment);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 640, 360, 0, GL_RGBA, GL_UNSIGNED_BYTE, nil);
// Depth attachment
glBindRenderbuffer(GL_RENDERBUFFER, depthAttachment);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, 160, 90);
// Framebuffer
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorAttachment, 0);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthAttachment);
However, glCheckFramebufferStatus(GL_FRAMEBUFFER) now returns GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS, which according to the documentation means “Not all attached images have the same width and height”.
The research paper “Full-3D Edge Tracking with a Particle Filter” describes in section 3.5 how the authors used a sub-sampled depth buffer to increase performance in their application:
“Sub-sampled depth buffer: Adjacent pixels along an image edge are so closely correlated that testing each individual edge pixel is redundant. For single-hypothesis trackers, it is common to spread sample points a distance of 10-20 pixels apart along an edge. Sampling only every nth edge pixel also reduces the graphics bandwidth required and so only every 4th pixel is sampled. Instead of explicitly drawing stippled lines, this is here achieved by using a sub-sampled depth buffer (160 x 120) since this further achieves a bandwidth reduction for clearing and populating the depth buffer. However, this also means that hidden line removal can be inaccurate to approximately four pixels. Apart from this, the accuracy of the system is unaffected.”
The only obvious workarounds are:
- Using a fragment shader to look up the previously rendered depth buffer and apply the depth test manually.
- Rendering the depth buffer at the lower resolution, resampling it to the full resolution, and then using it as before.
Neither approach sounds like it would perform well. What is the cleanest way to achieve a sub-sampled depth buffer?
The doc page you referenced covers OpenGL ES 1.0 and 2.0. The OpenGL wiki has more information on the difference between OpenGL 2.0 and 3.0, namely that starting with 3.0 (and ARB_framebuffer_object), framebuffer attachments can be of different sizes. However, if I recall correctly, when attachments of different sizes are used, the area actually rendered to is the intersection of all attached images, so only the region covered by the smallest attachment is drawn. I don’t think this is what you want.
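To make the intersection rule concrete, here is a minimal sketch in plain C (no GL calls). It assumes every attachment rectangle is anchored at (0,0), so the intersection reduces to the smallest width and the smallest height. AttachmentSize and effective_render_area are hypothetical names used only for illustration; they are not part of OpenGL.

```c
#include <assert.h>

/* Hypothetical helper type: size of one framebuffer attachment in pixels. */
typedef struct {
    int width;
    int height;
} AttachmentSize;

/* Assumption: all attachment rectangles start at (0,0), so their
 * intersection is simply the minimum width and the minimum height. */
static AttachmentSize effective_render_area(const AttachmentSize *attachments,
                                            int count) {
    assert(attachments != 0 && count > 0);
    AttachmentSize area = attachments[0];
    for (int i = 1; i < count; ++i) {
        if (attachments[i].width < area.width) {
            area.width = attachments[i].width;
        }
        if (attachments[i].height < area.height) {
            area.height = attachments[i].height;
        }
    }
    return area;
}
```

With the sizes from the question (a 640 x 360 color texture and a 160 x 90 depth renderbuffer), this yields 160 x 90: only the lower-left sixteenth of the color texture would be drawn, rather than the full image with coarser depth testing.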
To reduce the size of your depth texture, I suggest using glBlitFramebuffer to copy your large depth buffer into a smaller one. The operation runs entirely on the GPU, so it is very fast. The smaller texture can then be used as input for further rendering passes in your shaders, which should save bandwidth: instead of reducing multiple depth values on every fragment shader execution, the reduction is done once per texel of the smaller texture. A smaller texture also tends to be faster to sample because it fits in the cache better.
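Keep in mind, however, that averaging depth samples can produce wild inaccuracies because depth values are not distributed linearly between the near and far planes. Note also that glBlitFramebuffer only accepts GL_NEAREST filtering when blitting depth, so the blit picks one sample per texel rather than averaging; actual averaging would have to be done in a shader.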
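The non-linearity can be illustrated with a small sketch in plain C. It assumes a standard perspective projection (as set up by glFrustum) and the default depth range of [0, 1]; the function names and the near/far values used below are illustrative assumptions, not taken from your program.

```c
#include <assert.h>

/* Window-space depth in [0,1] for a point at eye-space distance z,
 * assuming a standard perspective projection and default glDepthRange. */
static double eye_to_window_depth(double z, double near_plane, double far_plane) {
    assert(near_plane > 0.0 && far_plane > near_plane && z > 0.0);
    return far_plane / (far_plane - near_plane) * (1.0 - near_plane / z);
}

/* Inverse mapping: eye-space distance for a window-space depth d in [0,1]. */
static double window_depth_to_eye(double d, double near_plane, double far_plane) {
    assert(near_plane > 0.0 && far_plane > near_plane);
    assert(d >= 0.0 && d <= 1.0);
    return far_plane * near_plane / (far_plane - d * (far_plane - near_plane));
}

/* Averages the stored depth values of two surfaces and returns the
 * eye-space distance that the averaged value actually represents. */
static double eye_distance_of_averaged_depth(double z1, double z2,
                                             double near_plane, double far_plane) {
    double d1 = eye_to_window_depth(z1, near_plane, far_plane);
    double d2 = eye_to_window_depth(z2, near_plane, far_plane);
    return window_depth_to_eye(0.5 * (d1 + d2), near_plane, far_plane);
}
```

For example, with a near plane at 0.1 and a far plane at 100, averaging the stored depths of surfaces at distances 1 and 50 gives a value that corresponds to a distance of roughly 1.96, nowhere near the midpoint of 25.5. If you do need to average, converting to linear eye-space depth first is one way to avoid this.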