To what extent can a JIT replace platform-independent code with processor-specific machine instructions?
For example, the x86 instruction set includes the BSWAP instruction to reverse a 32-bit integer’s byte order. In Java the Integer.reverseBytes() method is implemented using multiple bitwise masks and shifts, even though in x86 native code it could be implemented in a single instruction using BSWAP. Are JITs (or static compilers for that matter) able to make the change automatically or is it too complex or not worth it due to a poor speed/time tradeoff?
(I know that this is in most cases a micro-optimisation, but I’m interested nonetheless.)
For this case, yes: the HotSpot server compiler can perform this optimization. The reverseBytes() methods are registered as vmIntrinsics in HotSpot. When the JIT compiler compiles a call to one of these methods, it emits a special IR node instead of compiling the whole method body, and that node is translated into ‘bswap’ on x86. See src/share/vm/opto/library_call.cpp and src/cpu/x86/vm/x86_64.ad.
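
To make the comparison concrete, here is a minimal sketch that places a hand-written mask-and-shift byte swap next to Integer.reverseBytes(). The class name ByteSwapSketch and the helper portableReverseBytes are hypothetical names made up for this illustration; the helper is one common way to write the swap and is not claimed to be the JDK's exact source. The loop only checks that both versions agree and gives the JIT a hot call site; it does not by itself prove that ‘bswap’ was emitted.

```java
public class ByteSwapSketch {

    // Hypothetical helper: a portable byte swap built from shifts and masks.
    // This is the kind of code a JIT would have to pattern-match if the
    // method were not registered as an intrinsic.
    static int portableReverseBytes(int i) {
        return (i << 24)                 // lowest byte moves to the top
             | ((i & 0xff00) << 8)       // second-lowest byte moves up one position
             | ((i >>> 8) & 0xff00)      // second-highest byte moves down one position
             | (i >>> 24);               // highest byte moves to the bottom
    }

    public static void main(String[] args) {
        int checksum = 0;
        // A hot loop, so the call site is likely to be JIT-compiled.
        for (int n = 0; n < 1_000_000; n++) {
            int viaLibrary = Integer.reverseBytes(n);   // candidate for the intrinsic
            int viaShifts = portableReverseBytes(n);    // plain Java arithmetic
            if (viaLibrary != viaShifts) {
                throw new AssertionError("mismatch at " + n);
            }
            checksum ^= viaLibrary;
        }
        System.out.println(Integer.toHexString(Integer.reverseBytes(0x11223344)));
        System.out.println("checksum: " + checksum);
    }
}
```

To see what the compiler actually generated, one option is to run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which needs the hsdis disassembler plugin installed, and look for ‘bswap’ in the output for the hot loop. Whether the hand-written version is also reduced to a single instruction depends on the JVM version and compiler, so treat that as something to measure rather than assume.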