Is there a fast algorithm, similar to the trick for powers of 2, that can be used with 3, i.e. n % 3?
Perhaps something that uses the fact that if the sum of a number's digits is divisible by three, then the number itself is also divisible by three.
This leads to the next question: what is a fast way to add the digits of a number? E.g. 37 -> 3 + 7 -> 10.
I am looking for something that does not use conditionals, as those tend to inhibit vectorization.
Thanks.
4 % 3 == 1, so (4^k * a + b) % 3 == (a + b) % 3. You can use this fact to evaluate x % 3 for a 32-bit x:

(Untested; you might need a few more reductions.)

Is this faster than your hardware can do x % 3? If it is, it probably isn't by much.
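For illustration, here is a minimal sketch of how the 4^k identity can be applied, assuming x is an unsigned 32-bit value. It is not necessarily the code the answer originally had in mind. The function name `mod3_fold` and the final lookup constant are my own choices. Every shift is by an even number of bits, so each fold splits x as 4^k * a + b and replaces it with a + b, which leaves the remainder mod 3 unchanged. This is the binary analogue of the digit-sum rule from the question, which works because 10 % 3 == 1.

```c
#include <stdint.h>

/* Hypothetical helper: x % 3 for a 32-bit unsigned x, with no division
 * and no branches. Each line folds the high part onto the low part;
 * since 2^16, 2^8, 2^4 and 2^2 are all powers of 4, x % 3 is preserved. */
uint32_t mod3_fold(uint32_t x)
{
    x = (x >> 16) + (x & 0xFFFFu); /* x <= 0x1FFFE */
    x = (x >> 8)  + (x & 0xFFu);   /* x <= 766     */
    x = (x >> 4)  + (x & 0xFu);    /* x <= 62      */
    x = (x >> 2)  + (x & 0x3u);    /* x <= 18      */
    x = (x >> 2)  + (x & 0x3u);    /* x <= 6       */
    x = (x >> 2)  + (x & 0x3u);    /* x <= 3       */

    /* x is now 0, 1, 2 or 3, where 3 stands for remainder 0.
     * 0x24 packs the 2-bit table {0, 1, 2, 0}, indexed by x. */
    return (0x24u >> (x * 2u)) & 3u;
}
```

The bounds in the comments are why six folds are enough here; with fewer folds the value can still exceed 3 before the final lookup. Whether this beats a plain `x % 3` is worth measuring: compilers usually turn a remainder by a constant into a multiply and a shift, which also has no branches.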