Given the arrays:
int canvas[10][10];
int addon[10][10];
Where all the values range from 0 – 100, what is the fastest way in C++ to add those two arrays so each cell in canvas equals itself plus the corresponding cell value in addon?
I.e., I want to achieve something like:
canvas += addon;
So if canvas[0][0] = 3 and addon[0][0] = 2, then canvas[0][0] = 5.
Speed is essential here, as I am writing a very simple program to brute-force a knapsack-type problem and there will be tens of millions of combinations.
And as a small extra question (thanks if you can help!) what would be the fastest way of checking if any of the values in canvas exceed 100? Loops are slow!
Here is an SSE4 implementation that should perform pretty well on Nehalem (Core i7):
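The sketch below shows one way such a routine could look; treat it as an illustration under stated assumptions rather than a tuned implementation. The function name `add_and_check` is hypothetical, the 10x10 arrays are treated as 100 contiguous ints (25 vectors of four), and cell values are assumed to be non-negative. The SSE4.1 operations are marked with (*) in the comments, and a scalar fallback is included so the file still builds when SSE4.1 is not enabled.

```cpp
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1 intrinsics (pulls in the SSE2 ones as well)
#endif

// Hypothetical helper: performs canvas += addon element-wise and
// returns true if any resulting cell is greater than 100.
bool add_and_check(int canvas[10][10], const int addon[10][10])
{
    int* c = &canvas[0][0];        // view the 10x10 arrays as 100 contiguous ints
    const int* a = &addon[0][0];
#if defined(__SSE4_1__)
    __m128i vmax = _mm_setzero_si128();   // running per-lane maximum of the sums
    for (int i = 0; i < 100; i += 4) {    // 100 ints = 25 vectors of 4
        __m128i vc = _mm_loadu_si128(reinterpret_cast<const __m128i*>(c + i));
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        vc = _mm_add_epi32(vc, va);       // four 32-bit additions at once
        _mm_storeu_si128(reinterpret_cast<__m128i*>(c + i), vc);
        vmax = _mm_max_epi32(vmax, vc);   // (*) SSE4.1
    }
    // Lanes whose maximum exceeds 100 become all ones, the rest all zeros.
    const __m128i over = _mm_cmpgt_epi32(vmax, _mm_set1_epi32(100));
    return !_mm_testz_si128(over, over);  // (*) SSE4.1: true if any bit is set
#else
    // Scalar fallback, used when the compiler is not targeting SSE4.1.
    bool any_over = false;
    for (int i = 0; i < 100; ++i) {
        c[i] += a[i];
        any_over |= (c[i] > 100);
    }
    return any_over;
#endif
}
```

Returning the overflow flag from the same pass avoids a second loop over the data just to test for values above 100.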
Compile with gcc -msse4.1 ... or equivalent for your particular development environment.
For older CPUs without SSE4 (and with much more expensive misaligned loads/stores) you'll need to (a) use a suitable combination of SSE2/SSE3 intrinsics to replace the SSE4 operations (marked with an * above) and ideally (b) make sure your data is 16-byte aligned and use aligned loads/stores (_mm_load_si128/_mm_store_si128) in place of _mm_loadu_si128/_mm_storeu_si128.
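For the older-CPU case, a minimal sketch using only SSE2 intrinsics and aligned loads/stores might look like the following. The name `add_and_check_sse2` and the macro `SKETCH_HAVE_SSE2` are hypothetical, and the code assumes both arrays are declared with 16-byte alignment (for example with `alignas(16)`). In place of the SSE4.1 operations it ORs together the per-vector "greater than 100" masks and tests the result with `_mm_movemask_epi8`.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper: performs canvas += addon element-wise and returns
// true if any resulting cell is greater than 100. On the SSE2 path both
// arrays must be 16-byte aligned.
bool add_and_check_sse2(int canvas[10][10], const int addon[10][10])
{
    int* c = &canvas[0][0];        // view the 10x10 arrays as 100 contiguous ints
    const int* a = &addon[0][0];
#if defined(SKETCH_HAVE_SSE2)
    // Aligned loads/stores fault on misaligned addresses, so check up front.
    assert(reinterpret_cast<std::uintptr_t>(c) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(a) % 16 == 0);

    const __m128i limit = _mm_set1_epi32(100);
    __m128i over = _mm_setzero_si128();   // accumulates "sum > 100" masks
    for (int i = 0; i < 100; i += 4) {    // 100 ints = 25 vectors of 4
        __m128i vc = _mm_load_si128(reinterpret_cast<const __m128i*>(c + i));
        __m128i va = _mm_load_si128(reinterpret_cast<const __m128i*>(a + i));
        vc = _mm_add_epi32(vc, va);
        _mm_store_si128(reinterpret_cast<__m128i*>(c + i), vc);
        over = _mm_or_si128(over, _mm_cmpgt_epi32(vc, limit));
    }
    // Any set byte in the accumulated mask means some cell exceeded 100.
    return _mm_movemask_epi8(over) != 0;
#else
    // Scalar fallback for targets without SSE2.
    bool any_over = false;
    for (int i = 0; i < 100; ++i) {
        c[i] += a[i];
        any_over |= (c[i] > 100);
    }
    return any_over;
#endif
}
```

With declarations such as `alignas(16) int canvas[10][10];` the 400-byte arrays split into exactly 25 aligned vectors, so no scalar tail loop is needed.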