I’m working on an industrial project that processes large images (50 MB per image), and performance is key.
I chose to delegate the image processing to the GPU using JavaCL, and I wrote some tests to check whether the approach is sound. The results are clear!
Over 100 runs of image coloring, the GPU wins:
GPU = 172 ms vs. CPU = 438 ms
For now, it is clear that the GPU is faster than the CPU for this kind of computation, BUT there is a problem: memory. My graphics card has 256 MB of VRAM, yet it cannot allocate an image larger than 8 MB!
So my question is: what is the best way to process images larger than 8 MB?
- Tile the image and process each tile? That will be a performance killer due to the latency between RAM and VRAM.
- Extract raw pixels as float4 vectors and send them to the GPU?
- Change my graphics card?
- Abandon the project?
- Drink more coffee?
Thanks to all in advance 🙂
I am not familiar with the JavaCL bindings – but in OpenCL, there are images and then there are buffers.
Buffers can be allocated much larger (up to the device's CL_DEVICE_MAX_MEM_ALLOC_SIZE), but there are limits on the size of a cl_mem created with clCreateImage2D (CL_DEVICE_IMAGE2D_MAX_WIDTH and CL_DEVICE_IMAGE2D_MAX_HEIGHT). An image has some advantages over a raw buffer, such as hardware-accelerated sampling. If you don’t need sampling, or can implement your own sampling inside the kernel, it might be possible to use a buffer. Otherwise you will have to tile your input image and resolve any filtering artifacts that tile processing introduces.
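If you do end up tiling, one common way to avoid seams is to read each tile with a few extra border pixels (a "halo") and write back only the inner part. Below is a minimal plain-Java sketch of that bookkeeping. The class TilePlanner, its fields, and the tileSize and halo parameters are made up for illustration and are not part of JavaCL or OpenCL. The halo you need depends on your filter radius; a purely per-pixel operation such as coloring should need none. Uploading each tile to the device is left out on purpose, since I don't know the JavaCL calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits an image into tiles with an overlapping border.
final class TilePlanner {

    /** One tile: "core" is the area written back, "read" is the padded area uploaded. */
    static final class Tile {
        final int coreX, coreY, coreW, coreH;
        final int readX, readY, readW, readH;

        Tile(int coreX, int coreY, int coreW, int coreH,
             int readX, int readY, int readW, int readH) {
            this.coreX = coreX; this.coreY = coreY; this.coreW = coreW; this.coreH = coreH;
            this.readX = readX; this.readY = readY; this.readW = readW; this.readH = readH;
        }

        @Override
        public String toString() {
            return "core=(" + coreX + "," + coreY + " " + coreW + "x" + coreH + ")"
                 + " read=(" + readX + "," + readY + " " + readW + "x" + readH + ")";
        }
    }

    /**
     * Plans tiles of at most tileSize x tileSize core pixels.
     * Each read region is the core grown by halo pixels, clamped to the image bounds.
     */
    static List<Tile> plan(int width, int height, int tileSize, int halo) {
        if (width <= 0 || height <= 0 || tileSize <= 0 || halo < 0) {
            throw new IllegalArgumentException("invalid image, tile or halo size");
        }
        List<Tile> tiles = new ArrayList<>();
        for (int y = 0; y < height; y += tileSize) {
            int coreH = Math.min(tileSize, height - y);
            int readY0 = Math.max(0, y - halo);
            int readY1 = Math.min(height, y + coreH + halo);
            for (int x = 0; x < width; x += tileSize) {
                int coreW = Math.min(tileSize, width - x);
                int readX0 = Math.max(0, x - halo);
                int readX1 = Math.min(width, x + coreW + halo);
                tiles.add(new Tile(x, y, coreW, coreH,
                                   readX0, readY0, readX1 - readX0, readY1 - readY0));
            }
        }
        return tiles;
    }

    public static void main(String[] args) {
        // Example sizes only: a 5000x3000 image, 1024-pixel tiles, 2-pixel halo.
        for (Tile t : plan(5000, 3000, 1024, 2)) {
            System.out.println(t);
        }
    }
}
```

For each tile you would upload the read region, run the kernel, and copy only the core region into the result. The overlap costs some extra transfer, so it is worth measuring against the single-buffer approach before settling on either.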
Hope this helps!