I know OpenCL and CUDA, but these are not supported on mobile devices. Most of them do support OpenGL ES, so I want to learn how to use the OpenGL ES shading language (GLSL) for general-purpose computing, the way I would use OpenCL or CUDA.
- How many kinds of buffers can I use, and what are they?
- How do I manipulate these buffers?
As far as I know, I can only create vertex and fragment shaders so far.
- Which buffers can I manipulate when I use a fragment shader?
- Which buffers can I manipulate when I use a vertex shader?
- Is there any synchronization function on the GPU? (I mean synchronization on the GPU itself, like synchronizing the threads within a block in OpenCL or CUDA.)
PS:
I read the paper "Using Mobile GPU for General-Purpose Computing". The experiments were performed on an Nvidia Tegra SoC with the following specifications:
- 1 GHz dual-core ARM Cortex-A9 CPU
- 1 GB of RAM
- an Nvidia ultra-low-power GeForce GPU running at 333 MHz
- 512 MB of Flash memory
They get a 3x speedup on a 128x128 FFT. I think this result is good. Do you think it is worth doing? And the main bottleneck is memory access, right?
Many people have said that general-purpose computing on OpenGL ES is not worth it. So it is not worth expecting mobile devices to support OpenCL either, right? In my opinion, OpenGL ES is the foundation for OpenCL.
Some platforms don't support any floating-point formats. Some platforms (PowerVR, Tegra, Adreno) support half-float (16-bit float) surfaces, which can be used both as a render target and as a texture. Full float support exists on some platforms (Adreno, and I believe the latest PowerVR), but is rather rare.
So it depends a lot on what kind of calculation you’re expecting to do, what kind of precision is acceptable for you, as well as what your target platform is.
Also take into account that current-gen OpenGL ES (2.0) does not require full IEEE 754 floating-point conformance, so results may vary from one device to another.
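To get a feel for what half-float precision means in practice, here is a minimal C sketch that rounds a 32-bit float to the 11 significant bits a 16-bit half float carries. The function name `round_to_half_significand` is made up for illustration. The sketch assumes finite, normal inputs, ignores the half float's narrower exponent range and subnormals, and says nothing about how a particular GPU actually rounds.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a finite, normal float to the 11 significant
 * bits (1 implicit + 10 stored) of a 16-bit half float.
 * Assumption: exponent range, subnormals, Inf and NaN are not handled. */
float round_to_half_significand(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the float as raw bits */

    /* A float stores 23 mantissa bits and a half stores 10, so 13 bits go.
     * Adding 0x0FFF plus the lowest kept bit rounds to nearest, ties to even. */
    bits += 0x0FFFu + ((bits >> 13) & 1u);
    bits &= ~0x1FFFu;

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, `round_to_half_significand(2049.0f)` returns `2048.0f`: above 2048, a half float can no longer represent every integer, which matters if you plan to store indices or counters in a half-float texture.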
In the end, whether it is worth it depends a lot on your batch sizes, though: accessing the results (i.e., reading pixels back from the render target) may be so slow that it negates the performance gain.
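To address your bullet points one by one:
- Kinds of buffers: you can create a texture and attach it to an FBO (framebuffer object). Additionally, you can feed data to the shaders as constants (uniforms) or as per-vertex data streams (attributes), which the vertex shader can forward to the fragment shader as varyings.
- Manipulating them: you can write to a texture using the normal texture-handling functions (for example glTexImage2D or glTexSubImage2D).
- Fragment shader: when an FBO is bound, the fragment shader writes to it. Later, you can access the results by reading from the texture you attached to the FBO.
- Vertex shader: none.
- Synchronization: you can flush the pipeline using glFinish(), which blocks until all previously issued commands have completed. The driver should flush the pipeline implicitly if you try to access the texture data, though. Inside a shader, ES 2.0 has nothing like the block-level thread barrier in CUDA or OpenCL.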
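As a rough sketch of the fragment-shader route described above, the C snippet below holds a pair of GLSL ES 1.00 shaders that add two input textures element by element, plus a helper that maps an array index to the texture coordinate of its texel centre. All names (`kPassThroughVertexShader`, `kAddFragmentShader`, `u_a`, `u_b`, `texel_center`) are invented for illustration. It assumes you draw a full-screen quad into an FBO whose texture has the same size as the inputs; the host-side setup (glFramebufferTexture2D, glDrawArrays, glReadPixels) is not shown.

```c
/* Vertex shader: passes a full-screen quad through and derives a
 * texture coordinate in [0, 1] from clip-space positions in [-1, 1]. */
const char *kPassThroughVertexShader =
    "attribute vec2 a_position;\n"
    "varying vec2 v_texcoord;\n"
    "void main() {\n"
    "    v_texcoord = a_position * 0.5 + 0.5;\n"
    "    gl_Position = vec4(a_position, 0.0, 1.0);\n"
    "}\n";

/* Fragment shader: runs once per output texel, roughly like one CUDA
 * thread per element, and writes the sum to the bound render target. */
const char *kAddFragmentShader =
    "precision mediump float;\n"
    "uniform sampler2D u_a;\n"
    "uniform sampler2D u_b;\n"
    "varying vec2 v_texcoord;\n"
    "void main() {\n"
    "    gl_FragColor = texture2D(u_a, v_texcoord) + texture2D(u_b, v_texcoord);\n"
    "}\n";

/* Texture coordinate of the centre of texel `index` in a row of `size` texels. */
float texel_center(int index, int size)
{
    return ((float)index + 0.5f) / (float)size;
}
```

With a plain RGBA8 render target the sum is clamped to [0, 1] and quantized to 8 bits per channel, so this only stands in for real arithmetic if the target supports a float or half-float format.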