I’m interested to know if any common algorithms (sorting, searching, graphs, etc.) have been ported to OpenCL (or any GPU language), and how the performance compares to the same algorithm executed by the CPU. I’m specifically interested in the results (numbers).
Thanks!
There are quite a few samples of this sort of thing on NVIDIA's website. Bear in mind that some tasks, such as sorting, need algorithms designed specifically for parallel execution, and those algorithms may do more total work than a conventional single-threaded algorithm running on one core, so the speedup is not simply proportional to the number of threads.
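To make the point about sorting concrete, here is a rough sketch of bitonic sort, one sorting network that is often used on GPUs. It is plain Python that simulates the passes sequentially, not OpenCL or CUDA, and it is not taken from any vendor sample. The function name `bitonic_sort` and the comparison counter are my own, added only for illustration. Within each pass every compare-exchange is independent, which is what would let a GPU assign one thread per element.

```python
def bitonic_sort(values):
    """Sort a power-of-two-length list with a bitonic network.

    Returns (sorted_list, number_of_compare_exchange_steps).
    Hypothetical helper for illustration; simulated sequentially.
    """
    data = list(values)
    n = len(data)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")

    compare_count = 0
    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between the two elements being compared
        while j >= 1:
            # All iterations of this loop are independent of each other,
            # so on a GPU each i could be handled by its own thread.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    compare_count += 1
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data, compare_count


if __name__ == "__main__":
    result, comparisons = bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5])
    print(result, comparisons)
```

The network always performs (n/2) * log2(n) * (log2(n) + 1) / 2 compare-exchange steps regardless of the input, which grows as n log² n. A sequential comparison sort needs only on the order of n log n comparisons, so the parallel-friendly algorithm does more work in total and only wins when enough of those steps actually run at the same time.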