I have an image represented by a 2D array of chars. I need to perform some operations on this image and store the result in another 2D array. These operations vary from calculating the average value of neighbouring cells to reordering rows.
What optimisations can I apply to get better performance?
Any techniques are welcome (e.g. locality of reference, inline assembly, …).
I use C on a Linux x86_64 machine.
PS: I have raw colour images with each pixel represented by a group of RGB values.
Plan your API as
void process_row(int *out_row, int *in_row, int *row_above, int *row_below);

Some RGB calculations can be done in parallel with normal integer arithmetic.
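Going back to the row-based API for a moment, here is a minimal sketch of how it might be used for a neighbour average. The `width` parameter, the `const` qualifiers, the edge clamping and the `process_image` driver are my own assumptions and are not part of the suggested signature. The point is that the inner loop only touches three contiguous rows, which is friendly to the cache, and that reordering rows comes down to choosing which row pointers are passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Average the four direct neighbours of every sample in one row.
 * `width` is a hypothetical extra parameter; at the left and right
 * edges the sample itself stands in for the missing neighbour. */
void process_row(int *out_row, const int *in_row,
                 const int *row_above, const int *row_below, size_t width)
{
    assert(width > 0);
    for (size_t x = 0; x < width; x++) {
        size_t left  = (x > 0) ? x - 1 : x;
        size_t right = (x + 1 < width) ? x + 1 : x;
        out_row[x] = (row_above[x] + row_below[x] +
                      in_row[left] + in_row[right]) / 4;
    }
}

/* Hypothetical driver: row-major image, one int per sample.
 * The first and last rows reuse themselves as the missing neighbour row. */
void process_image(int *out, const int *in, size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        const int *cur   = in + y * width;
        const int *above = (y > 0) ? cur - width : cur;
        const int *below = (y + 1 < height) ? cur + width : cur;
        process_row(out + y * width, cur, above, below, width);
    }
}
```

Because each call reads only whole rows, a loop like this is also a reasonable candidate for compiler auto-vectorisation, though whether that actually happens depends on the compiler and flags.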
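With each 8-bit channel stored in its own 10-bit field (two zero bits of headroom above it), one can add 4 such packed pixels without overflow, then simply shift the result right by 2 bits
and mask with 0011111111 0011111111 0011111111 b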
(this of course requires pre- and postprocessing the data)
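A minimal sketch of that trick, assuming the three channels are packed as r, g, b from high to low bits in a 32-bit word; the field order and all function names here are hypothetical. Each field can hold 4 * 255 = 1020 < 1024, so the sum of four pixels never carries into the next field.

```c
#include <assert.h>
#include <stdint.h>

/* 0011111111 0011111111 0011111111 b: the low 8 bits of each 10-bit field */
#define FIELD_MASK ((0xFFu << 20) | (0xFFu << 10) | 0xFFu)

/* Preprocessing: spread three 8-bit channels into 10-bit fields. */
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 20) | ((uint32_t)g << 10) | (uint32_t)b;
}

/* Postprocessing: pull the individual channels back out. */
uint8_t unpack_r(uint32_t p) { return (uint8_t)((p >> 20) & 0xFFu); }
uint8_t unpack_g(uint32_t p) { return (uint8_t)((p >> 10) & 0xFFu); }
uint8_t unpack_b(uint32_t p) { return (uint8_t)(p & 0xFFu); }

/* Average four packed pixels, all three channels in one go. */
uint32_t average4(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    /* The two headroom bits of every field must be clear on input. */
    assert(((a | b | c | d) & ~FIELD_MASK) == 0);
    uint32_t sum = a + b + c + d;   /* each field is at most 1020 */
    /* The shift divides every field by 4; the mask drops the two
     * remainder bits that slid into the field below. */
    return (sum >> 2) & FIELD_MASK;
}
```

The averages are truncated, not rounded. Whether this beats plain per-channel code depends on how much the packing and unpacking cost relative to the arithmetic, so it is worth measuring.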
A layout of 00 rr 00 gg 00 bb 00 aa is also a feasible approach.
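Reading that layout as one spare byte above each channel in a 64-bit word (my interpretation, not something stated above), a sketch could look like the following. It assumes the source pixels are 32-bit values of the form 0xRRGGBBAA, and every name is hypothetical. With 16 bits per lane, up to 256 pixels can be summed before a lane overflows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANE_MASK UINT64_C(0x00FF00FF00FF00FF)

/* Spread 0xRRGGBBAA into 0x00RR00GG00BB00AA. */
uint64_t spread_rgba(uint32_t px)
{
    uint64_t v = px;
    v = (v | (v << 16)) & UINT64_C(0x0000FFFF0000FFFF);
    v = (v | (v << 8)) & LANE_MASK;
    return v;
}

/* Inverse of spread_rgba: collapse the four lanes back to 0xRRGGBBAA. */
uint32_t pack_rgba(uint64_t v)
{
    v &= LANE_MASK;
    v = (v | (v >> 8)) & UINT64_C(0x0000FFFF0000FFFF);
    v = (v | (v >> 16)) & UINT64_C(0x00000000FFFFFFFF);
    return (uint32_t)v;
}

/* Average 2^log2_count pixels, all four channels at once. */
uint32_t average_pow2(const uint32_t *px, unsigned log2_count)
{
    assert(log2_count <= 8);   /* 256 * 255 = 65280 still fits in 16 bits */
    size_t count = (size_t)1 << log2_count;
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += spread_rgba(px[i]);
    /* Shift divides every lane; the mask drops bits that crossed lanes. */
    return pack_rgba((sum >> log2_count) & LANE_MASK);
}
```

Compared with the 10-bit fields, this variant keeps an alpha channel and leaves far more headroom, at the price of twice the memory per packed pixel.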