I have the following two programs:
long startTime = System.currentTimeMillis();
for (int i = 0; i < N; i++);
long endTime = System.currentTimeMillis();
System.out.println("Elapsed time: " + (endTime - startTime) + " msecs");
and
long startTime = System.currentTimeMillis();
for (long i = 0; i < N; i++);
long endTime = System.currentTimeMillis();
System.out.println("Elapsed time: " + (endTime - startTime) + " msecs");
Note: the only difference is the type of the loop variable (int and long).
When I run them, the first program consistently prints between 0 and 16 msecs, regardless of the value of N. The second takes a lot longer: for N == Integer.MAX_VALUE, it runs in about 1800 msecs on my machine, and the run time appears to be more or less linear in N.
So why is this?
I suppose the JIT-compiler optimizes the int loop to death. And for good reason, because obviously it doesn’t do anything. But why doesn’t it do so for the long loop as well?
A colleague thought we might be measuring the JIT compiler doing its work in the long loop, but since the run time seems to be linear in N, this probably isn’t the case.
I’m using JDK 1.6.0 update 17:
C:\>java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)
I’m on Windows XP Professional x64 Edition, Service Pack 2, with an Intel Core2 Quad CPU at 2.40GHz.
DISCLAIMER
I know that microbenchmarks aren’t useful in production. I also know that System.currentTimeMillis() isn’t as accurate as its name suggests. This is just something I noticed while fooling around, and I was simply curious as to why this happens; nothing more.
It’s an interesting question, but to be honest I’m not convinced that considering Hotspot’s behaviour here will yield useful information. Any answers you do get are not going to be transferable in a general case (because we’re looking at the optimisations that Hotspot performs in one specific situation), so they’ll help you understand why one no-op is faster than another, but they won’t help you write faster “real” programs.
It’s also incredibly easy to write very misleading microbenchmarks around this sort of thing. See this IBM DW article for some of the common pitfalls, how to avoid them, and some general commentary on what you’re doing.
So really this is a “no comment” answer, but I think that’s the only valid response. A trivially empty loop doesn’t need to be fast in real code, so the compiler isn’t tuned to optimise it away in every one of these cases.
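If you do want to keep poking at it, here is a minimal sketch of a slightly more careful harness for the two loops in the question. It assumes that a handful of untimed warm-up calls is enough for the JIT to compile both methods, and it uses System.nanoTime() rather than System.currentTimeMillis(). The class name LoopTiming, the helper methods, and the values of n, warmups and rounds are all made up for illustration; nothing here comes from the original programs apart from the two empty loops.

```java
import java.util.Arrays;

public class LoopTiming {

    // Times one pass of the empty loop with an int counter, in nanoseconds.
    static long timeIntLoop(int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++);
        return System.nanoTime() - start;
    }

    // Same loop, but with a long counter.
    static long timeLongLoop(long n) {
        long start = System.nanoTime();
        for (long i = 0; i < n; i++);
        return System.nanoTime() - start;
    }

    // Middle element of the sorted samples (the upper middle for even lengths).
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        int n = 100000000; // hypothetical iteration count, not the N from the question
        int warmups = 5;   // assumed to be enough to trigger JIT compilation
        int rounds = 11;

        // Untimed warm-up calls, so the timed rounds mostly see compiled code.
        for (int r = 0; r < warmups; r++) {
            timeIntLoop(n);
            timeLongLoop(n);
        }

        long[] intNanos = new long[rounds];
        long[] longNanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            intNanos[r] = timeIntLoop(n);
            longNanos[r] = timeLongLoop(n);
        }

        System.out.println("int loop, median:  " + median(intNanos) + " ns");
        System.out.println("long loop, median: " + median(longNanos) + " ns");
    }
}
```

Putting each loop in its own method and reporting a median over several rounds makes a single slow first run less likely to dominate the result. It still only tells you what one particular JVM does with one particular no-op loop, so treat the numbers as a curiosity rather than guidance.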