We have a pool of servers that sits behind a load balancer. The machines in this pool do a garbage collection every 6 seconds on average. Each collection takes almost half a second, and we also see a CPU spike during garbage collection.
The client machines see a spike in the average time to make a connection to the server, almost 10% during a day.
Theory: the CPU is busy doing GC, and that’s why the server cannot allocate a connection faster.
Is it a valid theory?
JVM: IBM
GC algorithm: gencon
Nursery: 5 GB
Heap size: 18 GB
I’d say that with that many allocations all bets are off; it could absolutely get worse over time. Doing a GC every 6 seconds all day long seems problematic.
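To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each collection stops the application for its whole half second (the verbose GC log would confirm or refute that), that connection attempts arrive uniformly at random, and that an attempt landing inside a pause simply waits for the pause to end. The class and method names are made up for illustration.

```java
final class GcPauseEstimate {
    /** Fraction of wall-clock time the application spends paused for GC. */
    static double pauseFraction(double pauseSeconds, double periodSeconds) {
        return pauseSeconds / periodSeconds;
    }

    /**
     * Mean extra wait per connection attempt:
     * P(attempt lands in a pause) * mean remaining pause (half the pause).
     * Assumption: no queueing effects after the pause ends.
     */
    static double meanAddedDelaySeconds(double pauseSeconds, double periodSeconds) {
        return pauseFraction(pauseSeconds, periodSeconds) * (pauseSeconds / 2.0);
    }

    public static void main(String[] args) {
        double pause = 0.5;   // seconds per collection, from the question
        double period = 6.0;  // seconds between collections, from the question
        System.out.printf("time paused: %.1f%%%n",
                100 * pauseFraction(pause, period));
        System.out.printf("mean added delay: %.1f ms%n",
                1000 * meanAddedDelaySeconds(pause, period));
    }
}
```

With the figures from the question this works out to about 8% of the time paused and roughly 20 ms added per connection on average. Whether that explains the spike depends on what your normal connection time is, so compare against that baseline.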
Do you have access to that code? Can it be rewritten to reuse objects and be more intelligent about allocation? I’ve done a few embedded systems, and the trick there is to NEVER call new once the system is up and running (quite doable if you have control over the entire system).
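As a minimal sketch of what "never call new after startup" can look like, here is a fixed-size pool of byte buffers. Everything in it (the BufferPool name, the choice of byte[] as the pooled object, failing fast when the pool is empty) is an assumption for illustration; your hot objects will be something else.

```java
import java.util.ArrayDeque;

final class BufferPool {
    private final ArrayDeque<byte[]> free;
    private final int bufferSize;

    /** Allocates every buffer up front, so steady state needs no 'new'. */
    BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        this.free = new ArrayDeque<>(count);
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);
        }
    }

    /** Hands out a pooled buffer; fails fast instead of allocating a new one. */
    synchronized byte[] acquire() {
        byte[] buffer = free.poll();
        if (buffer == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return buffer;
    }

    /** Returns a buffer to the pool. The caller must not use it afterwards. */
    synchronized void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            throw new IllegalArgumentException("buffer does not belong to this pool");
        }
        free.push(buffer);
    }

    synchronized int available() {
        return free.size();
    }
}
```

A request handler would call acquire() at the start, do its work in the buffer, and call release() in a finally block. The cost is that you now own the lifetime bugs the GC used to handle for you, so it is usually only worth doing for the few types that dominate the allocation rate.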
If you don’t have access to the code, check into some of the GC tuning options available (including the selection of the garbage collector used), both those distributed with the JDK and third-party options. You may be able to improve performance with a few command-line modifications.
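Whatever options you try, measure before and after. On IBM JVMs the collector is selected with -Xgcpolicy (gencon is one of its values) and the nursery is sized with -Xmn, but check the documentation for your JVM version before relying on either. A rough way to compare settings is to read the standard GC MXBeans from inside the process. The sketch below uses only java.lang.management; the GcOverhead class and its method names are hypothetical, and the reported time is the JVM's approximate accumulated collection time, which is not necessarily pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    /** Sum of approximate collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Sum of collection counts over all collectors. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** GC time as a percentage of elapsed wall-clock time. */
    static double overheadPercent(long gcMillis, long wallMillis) {
        return wallMillis <= 0 ? 0.0 : 100.0 * gcMillis / wallMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long countBefore = totalGcCount();
        long wallBefore = System.nanoTime();

        Thread.sleep(60_000); // sample window; in a real server the workload runs meanwhile

        long wallMillis = (System.nanoTime() - wallBefore) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("collections: " + (totalGcCount() - countBefore));
        System.out.println("GC overhead %: " + overheadPercent(gcMillis, wallMillis));
    }
}
```

Run standalone, this only observes its own idle JVM, so the useful version is the same sampling logic on a background thread inside the server, or the same beans read remotely over JMX.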