Given a long with bytes WXYZ (where each letter is a byte), I would like some fast bit-twiddling code that creates two longs containing the same bytes as the original, but interleaved with zero bytes.
For example, given the long with value ABCDEFGH (each letter being one byte), produce the two longs:
0A0B0C0D
0E0F0G0H
Something equivalent to, but faster than:
long result1 = expand((int) (input >>> 32));
long result2 = expand((int) input);

long expand(int inputInt) {
    // Mask while widening so a negative int is not sign-extended into the high bytes.
    long input = inputInt & 0xFFFFFFFFL;
    return (input & 0x000000FFL)
         | (input & 0x0000FF00L) << 8
         | (input & 0x00FF0000L) << 16
         | (input & 0xFF000000L) << 24;
}
The following is about 25% faster for me (Java 7, benchmarked using Google Caliper); YMMV depending on your compiler, of course:
The idea is to use a bit of extra parallelism vs. the original approach.
The first line is a neat trick that produces garbage in bits 17-32, but you don’t care as you are going to mask it out anyway. 🙂
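For reference, here is a minimal sketch of a shift-and-mask expansion that fits the description above. It is an assumption about the general shape of the technique, not necessarily the exact code that was benchmarked. The class name `ByteSpread` and the `main` demo are made up for illustration.

```java
public class ByteSpread {
    // Spreads the 4 bytes of an int into the even byte positions of a long:
    // 0xAABBCCDD -> 0x00AA00BB00CC00DD.
    public static long expand(int inputInt) {
        // Zero-extend (rather than sign-extend) the int into a long: 0000ABCD.
        long x = inputInt & 0xFFFFFFFFL;
        // OR with a copy shifted left by 16 bits. This leaves garbage in
        // bits 17-32 (A|C and B|D), which the mask removes: 00AB00CD.
        x = (x | (x << 16)) & 0x0000FFFF0000FFFFL;
        // Same trick one level down, shifting by 8 bits: 0A0B0C0D.
        x = (x | (x << 8)) & 0x00FF00FF00FF00FFL;
        return x;
    }

    public static void main(String[] args) {
        long input = 0x1122334455667788L;
        long result1 = expand((int) (input >>> 32)); // high four bytes
        long result2 = expand((int) input);          // low four bytes
        // Prints 0011002200330044 and 0055006600770088.
        System.out.printf("%016x%n%016x%n", result1, result2);
    }
}
```

Each step moves two groups of bytes with a single shift, OR, and mask, instead of masking and shifting every byte separately, which is where the extra parallelism would come from.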