I am testing the performance of a data streaming system that supports continuous queries.
This is how it works:
– There is a polling service which sends data to my system.
– As data passes into the system, each query is evaluated over a window of the stream at the current time.
– The window slides as data passes in.
My problem is this: when I add more queries to the system, I expect the throughput to decrease because the system can’t cope with the data rate.
However, I actually observe an increase in throughput.
I can’t understand why this is the case, and I am guessing that it has something to do with the way the JVM allocates CPU, memory, etc.
Can anyone shed any light on my problem?
Most Java Virtual Machines initially interpret the JVM bytecode, which is slower than native machine code execution. As the JVM discovers that you are using a particular section of the code repeatedly, it compiles that section into native machine code (increasing its processing speed). As a result, stress testing code, or even just leaving it running for longer, sometimes speeds up execution instead of slowing it down. The HotSpot JVM (the default one from Sun) is the best-known JVM that performs native compilation to speed up code execution.
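A rough way to see this warm-up effect for yourself is to time the same work in repeated batches and compare the early batches with the later ones. The sketch below is only an illustration: `WarmupDemo`, `sumOfSquares`, and `timeBatch` are made-up names, the workload is arbitrary, and the timings depend entirely on your machine and JVM, so treat any trend you see as a hint rather than a measurement.

```java
public class WarmupDemo {
    // Written to after every batch so the JIT cannot discard the work as dead code.
    static volatile long sink;

    // Hypothetical workload: sums the squares of 0..n-1.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the workload `calls` times and returns the elapsed time in nanoseconds.
    static long timeBatch(int calls, int n) {
        long acc = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            acc += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        // Every batch does identical work; on a JIT-compiling JVM the later
        // batches often (not always) finish faster than the first few.
        for (int batch = 1; batch <= 10; batch++) {
            long nanos = timeBatch(2_000, 1_000);
            System.out.printf("batch %d: %d us%n", batch, nanos / 1_000);
        }
    }
}
```

If your throughput test starts measuring from a cold JVM, the run with more queries may simply be pushing the hot code past the compilation threshold sooner. Discarding the first batches before measuring is a common precaution.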
Also, many Java libraries are very mature compared to some libraries you may have encountered in the past. That means that instead of allocating a thread to process each request, they might be using non-blocking listeners on sockets, thread pools of reassignable worker threads, or any number of other techniques suitable for high-throughput processing. This, coupled with the self-tuning of a JIT (HotSpot-like) JVM, makes benchmarking Java quite a challenge. Generally speaking, things tend to get faster the longer they run, up to a point.
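To illustrate the thread pool point, here is a minimal sketch of a fixed pool of worker threads evaluating several queries over the same window. It is not how your streaming system works internally (I don't know that); `PoolDemo`, `evaluate`, and `runQueries` are hypothetical names, and the "query" is just a sum. With a pool like this, adding queries can raise total throughput as long as some cores were previously idle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    // Hypothetical stand-in for evaluating one query over a window of values.
    static long evaluate(int[] window) {
        long sum = 0;
        for (int v : window) {
            sum += v;
        }
        return sum;
    }

    // Submits `queryCount` evaluations to a shared pool of reusable workers
    // and collects the results in submission order.
    static List<Long> runQueries(int queryCount, int[] window) {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int q = 0; q < queryCount; q++) {
                Callable<Long> task = () -> evaluate(window);
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for queries", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("query evaluation failed", e);
        } finally {
            pool.shutdown(); // workers exit once the queued tasks are done
        }
    }

    public static void main(String[] args) {
        int[] window = {1, 2, 3, 4, 5};
        System.out.println(runQueries(4, window));
    }
}
```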