Disclaimer: I have looked through this question and this question, but they both got derailed by small details and general optimization-is-unnecessary concerns. I really need all the performance I can get in my current app, which is receiving-processing-spewing MIDI data in realtime. Also, it needs to scale up as well as possible.
I am comparing array performance on a high number of reads for small lists to ArrayList and also to just having the variables in hand. I’m finding that an array beats ArrayList by a factor of 2.5 and even beats just having the object references.
What I would like to know is:
- Is my benchmark okay? I have switched the order of the tests and number of runs with no change. I’ve also used milliseconds instead of nanoseconds to no avail.
- Should I be specifying any Java options to minimize this difference?
- If this difference is real, shouldn’t I prefer Test[] to ArrayList<Test> in this situation and put in the code necessary to convert them? Obviously I’m reading a lot more than writing.
JVM is Java 1.6.0_17 on OSX and it is definitely running in Hotspot mode.
import java.util.ArrayList;

public class ArraysVsLists {

    static int RUNS = 100000;

    public static void main(String[] args) {
        long t1;
        long t2;

        Test test1 = new Test();
        test1.thing = (int)Math.round(100*Math.random());
        Test test2 = new Test();
        test2.thing = (int)Math.round(100*Math.random());

        // Case 1: call through the two references directly
        t1 = System.nanoTime();
        for (int i=0; i<RUNS; i++) {
            test1.changeThing(i);
            test2.changeThing(i);
        }
        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long NO collection");

        // Case 2: iterate over an ArrayList holding the same two objects
        ArrayList<Test> list = new ArrayList<Test>(1);
        list.add(test1);
        list.add(test2);
        // tried this too: helps a tiny tiny bit
        list.trimToSize();
        t1 = System.nanoTime();
        for (int i=0; i<RUNS; i++) {
            for (Test eachTest : list) {
                eachTest.changeThing(i);
            }
        }
        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long collection");

        // Case 3: iterate over a plain array holding the same two objects
        Test[] array = new Test[2];
        list.toArray(array);
        t1 = System.nanoTime();
        for (int i=0; i<RUNS; i++) {
            for (Test test : array) {
                test.changeThing(i);
            }
        }
        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long array ");
    }
}

class Test {
    int thing;
    int thing2;

    public void changeThing(int addThis) {
        thing2 = addThis + thing;
    }
}
Microbenchmarks are very, very hard to get right on a platform like Java. You definitely have to extract the code to be benchmarked into separate methods, run them a few thousand times as warmup and then measure. I’ve done that (code below) and the result is that direct access through references is then three times as fast as through an array, but the collection is still slower by a factor of 2.
These numbers are based on the JVM options -server -XX:+DoEscapeAnalysis. Without -server, using the collection is drastically slower (but strangely, direct and array access are quite a bit faster, indicating that there is something weird going on). -XX:+DoEscapeAnalysis yields another 30% speedup for the collection, but it’s very much questionable whether it will work as well for your actual production code.
Overall my conclusion would be: forget about microbenchmarks, they can too easily be misleading. Measure as close to production code as you can without having to rewrite your entire application.
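The approach described above (each variant in its own method, a warmup phase, then the timed run) could look roughly like the following. This is a minimal sketch and not the code the numbers above were measured with: the class name WarmupBenchmark, the Item class (a stand-in for the question's Test), the warmup count of 5000 rounds and the checksum return values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WarmupBenchmark {

    // Hypothetical stand-in for the question's Test class.
    static class Item {
        int thing;
        int thing2;

        void changeThing(int addThis) {
            thing2 = addThis + thing;
        }
    }

    // Each variant lives in its own method so the JIT can compile it separately.
    // The returned checksum gives the loop an observable result.
    static long direct(Item a, Item b, int runs) {
        long sum = 0;
        for (int i = 0; i < runs; i++) {
            a.changeThing(i);
            b.changeThing(i);
            sum += a.thing2 + b.thing2;
        }
        return sum;
    }

    static long viaList(List<Item> list, int runs) {
        long sum = 0;
        for (int i = 0; i < runs; i++) {
            for (Item each : list) {
                each.changeThing(i);
                sum += each.thing2;
            }
        }
        return sum;
    }

    static long viaArray(Item[] array, int runs) {
        long sum = 0;
        for (int i = 0; i < runs; i++) {
            for (Item each : array) {
                each.changeThing(i);
                sum += each.thing2;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Item a = new Item();
        a.thing = 3;
        Item b = new Item();
        b.thing = 7;
        List<Item> list = new ArrayList<Item>();
        list.add(a);
        list.add(b);
        Item[] array = list.toArray(new Item[0]);

        // Warmup: run every variant many times before timing anything.
        // The round count is an arbitrary assumption, not a tuned value.
        long sink = 0;
        for (int round = 0; round < 5000; round++) {
            sink += direct(a, b, 1000);
            sink += viaList(list, 1000);
            sink += viaArray(array, 1000);
        }

        int runs = 100000;
        long t1 = System.nanoTime();
        sink += direct(a, b, runs);
        long t2 = System.nanoTime();
        System.out.println((t2 - t1) + " ns direct");

        t1 = System.nanoTime();
        sink += viaList(list, runs);
        t2 = System.nanoTime();
        System.out.println((t2 - t1) + " ns ArrayList");

        t1 = System.nanoTime();
        sink += viaArray(array, runs);
        t2 = System.nanoTime();
        System.out.println((t2 - t1) + " ns array");

        // Print the checksum so the computed values are actually used.
        System.out.println("checksum " + sink);
    }
}
```

To try it with the options mentioned above, something like java -server -XX:+DoEscapeAnalysis WarmupBenchmark should work on a JVM that accepts those flags. Even with warmup, timings from a single run like this are only a rough indication.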