Please compare two ways of setting/returning an array:
    public static float[] test_arr_speeds_1( int a ) {
        return new float[]{ a, a + 1, a + 2, a + 3, a + 4, a + 5,
                            a + 6, a + 7, a + 8, a + 9 };
    } // or e.g. field = new float... in method

    public static float[] test_arr_speeds_2( int a ) {
        float[] ret = new float[10];
        ret[0] = a;
        ret[1] = a + 1;
        ret[2] = a + 2;
        ret[3] = a + 3;
        ret[4] = a + 4;
        ret[5] = a + 5;
        ret[6] = a + 6;
        ret[7] = a + 7;
        ret[8] = a + 8;
        ret[9] = a + 9;
        return ret;
    } // or e.g. field[0] = ... in method
The two methods compile to different bytecode, and each can be decompiled back to its original form. After checking the execution times with a profiler (100M iterations, unbiased, different environments), the time of the _1 method is approx. 4/3 the time of _2, even though both create a new array and both set every element to a given value. The times are negligible most of the time, but this still bugs me: why is _1 visibly slower? Can anybody check/confirm/explain it to me in a reasonable, JVM-supported way?
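For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the profiler setup described above: the class name ArrayInitTiming, the helper timeNanos, the sink field, and the warm-up and iteration counts are all assumptions made for illustration. Hand-rolled loops like this are sensitive to JIT warm-up and dead-code elimination, and the indirection through IntFunction adds its own overhead, so treat the numbers as rough; a dedicated harness such as JMH is more trustworthy.

```java
import java.util.function.IntFunction;

public class ArrayInitTiming {

    // Same body as test_arr_speeds_1 from the question: array initializer.
    public static float[] test_arr_speeds_1(int a) {
        return new float[]{ a, a + 1, a + 2, a + 3, a + 4, a + 5,
                            a + 6, a + 7, a + 8, a + 9 };
    }

    // Same body as test_arr_speeds_2 from the question: element-by-element assignment.
    public static float[] test_arr_speeds_2(int a) {
        float[] ret = new float[10];
        ret[0] = a;
        ret[1] = a + 1;
        ret[2] = a + 2;
        ret[3] = a + 3;
        ret[4] = a + 4;
        ret[5] = a + 5;
        ret[6] = a + 6;
        ret[7] = a + 7;
        ret[8] = a + 8;
        ret[9] = a + 9;
        return ret;
    }

    // Assumption: consuming one element per call makes it harder for the JIT
    // to optimize the array creation away completely.
    public static float sink;

    // Hypothetical helper: calls f the given number of times and
    // returns the elapsed wall-clock time in nanoseconds.
    public static long timeNanos(IntFunction<float[]> f, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += f.apply(i)[9];
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int warmup = 1_000_000;      // assumed warm-up count
        int iterations = 10_000_000; // assumed; the question used 100M

        // Warm up both variants so the JIT has a chance to compile them first.
        timeNanos(ArrayInitTiming::test_arr_speeds_1, warmup);
        timeNanos(ArrayInitTiming::test_arr_speeds_2, warmup);

        long t1 = timeNanos(ArrayInitTiming::test_arr_speeds_1, iterations);
        long t2 = timeNanos(ArrayInitTiming::test_arr_speeds_2, iterations);

        System.out.printf("_1: %d ms, _2: %d ms%n", t1 / 1_000_000, t2 / 1_000_000);
    }
}
```

Running it several times, and swapping the order of the two measurements, gives a feel for how much of any observed gap is noise.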
Here is the difference in the bytecode (only for the first two elements). First method:
Second method:
As you can see, the only difference is that the array reference is kept on the operand stack in the first scenario (that's why dup appears so many times: to avoid losing the reference to the array after fastore), while in the second scenario the array reference is kept in a local variable slot (where method arguments and local variables are kept). In that scenario the reference must be reloaded every time (aload_1), because fastore requires the arrayref to be on the operand stack.

We shouldn't make assumptions based on this bytecode. After all, it is translated to CPU instructions by the JIT compiler, and most likely in both cases the array reference is stored in one of the CPU registers. Otherwise the performance difference would be huge.
If you can measure the difference and you are doing such low-level optimizations, pick the version that is faster. But I doubt the difference is "portable": depending on the architecture and the JVM version/implementation, you will observe different timing behaviour. That being said, I would go for the more readable version rather than the one that happens to be faster on your computer.
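To illustrate the readability point, here is a small sketch of a loop-based variant together with a check that it yields the same array as the initializer form. The class ReadableArrayInit and the method names fillWithInitializer and fillWithLoop are hypothetical, and the loop is just one possible "readable" alternative; no claim is made about which variant is faster on a given JVM.

```java
import java.util.Arrays;

public class ReadableArrayInit {

    // Initializer form, same values as test_arr_speeds_1 in the question.
    public static float[] fillWithInitializer(int a) {
        return new float[]{ a, a + 1, a + 2, a + 3, a + 4, a + 5,
                            a + 6, a + 7, a + 8, a + 9 };
    }

    // Hypothetical loop form: element i holds a + i, converted from int to float.
    public static float[] fillWithLoop(int a) {
        float[] ret = new float[10];
        for (int i = 0; i < ret.length; i++) {
            ret[i] = a + i;
        }
        return ret;
    }

    public static void main(String[] args) {
        // Both variants compute the same int sums before converting to float.
        System.out.println(Arrays.equals(fillWithInitializer(5), fillWithLoop(5))); // prints true
    }
}
```

Since the variants are interchangeable in what they return, the choice between them can be made on readability unless a measurement on the target JVM says otherwise.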