Say I have two fixed-length arrays of unsigned integers.
How do I sum those arrays element-wise (into the first) without looping, or with fewer loop iterations?
uint64_t foo[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
uint64_t bar[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
... // funky code without loop so that
// foo now is {0, 2, 4, 6, 8, 10, 12, 14, 16, 18}
A related question: is it possible to sum multiple uint64_t integers in one operation? (I bet this could be done with SSE.)
The question in general is: what is the fastest way to sum two fixed-length arrays of an integer type (in place, into the first one)?
You can use the SSE2 intrinsic _mm_add_epi64 to add two 64-bit ints per iteration. I wouldn't expect a dramatic improvement over straight scalar code though. If you don't want an explicit loop then you can just unroll this into 5 operations.
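Here is a minimal sketch of what that unrolled version might look like, assuming an x86 target with SSE2 available. The helper names add_u64x2 and add_u64x10 are made up for illustration, and unaligned loads/stores are used because nothing in the question guarantees 16-byte alignment.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_add_epi64, _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical helper: dst[0..1] += src[0..1], two 64-bit lanes in one add. */
static inline void add_u64x2(uint64_t *dst, const uint64_t *src)
{
    __m128i a = _mm_loadu_si128((const __m128i *)dst);
    __m128i b = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, _mm_add_epi64(a, b));
}

/* Hypothetical helper: element-wise dst[i] += src[i] for exactly 10 elements,
   written as 5 vector operations instead of an explicit loop. */
void add_u64x10(uint64_t *dst, const uint64_t *src)
{
    add_u64x2(dst + 0, src + 0);
    add_u64x2(dst + 2, src + 2);
    add_u64x2(dst + 4, src + 4);
    add_u64x2(dst + 6, src + 6);
    add_u64x2(dst + 8, src + 8);
}
```

Calling add_u64x10(foo, bar) on the arrays from the question should leave foo holding the element-wise sums. An optimizing compiler may well generate similar code from a plain scalar loop, so it is worth measuring before committing to intrinsics.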