How well optimized is Java’s parallel collecting GC for multithreaded environments? I’ve written some multithreaded Jython code that spends most of its time calling Java libraries. Depending on which options I run the program with, the library calls either do tons of allocations under the hood or virtually none. When I use the options that require tons of heap allocations, I can’t get the code to scale past 6 cores. When I use the options that don’t require lots of allocations, it scales to at least 20. How likely is it that this is related to a GC bottleneck, given that I’m using the stock Sun VM, the parallel GC and Jython as my glue language?
Edit: Just to clarify, I won’t necessarily think of things that are obvious to Java veterans, because I almost never use Java or other JVM languages. I do most of my programming in D and the reference CPython implementation of Python. I’m using the JVM and Jython for a small one-off project because I need access to a Java library.
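To show the kind of scaling test I have in mind, here is a hedged sketch (plain Java, not my actual Jython code, and the class/method names are made up for illustration): each worker thread does nothing but short-lived allocations, so if wall-clock time stops improving as threads are added while cores sit idle, allocation/GC pressure is the likely culprit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AllocScaling {
    // Allocate `iters` small, short-lived arrays; return a value derived
    // from them so the JIT cannot eliminate the allocations entirely.
    static long work(int iters) {
        long sum = 0;
        for (int i = 0; i < iters; i++) {
            byte[] b = new byte[64];   // small, short-lived allocation
            b[0] = (byte) i;
            sum += b[0];
        }
        return sum;
    }

    // Run the allocation loop on nThreads threads; return elapsed millis.
    static long run(int nThreads, int itersPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        long t0 = System.currentTimeMillis();
        Future<?>[] futures = new Future<?>[nThreads];
        for (int i = 0; i < nThreads; i++)
            futures[i] = pool.submit(() -> work(itersPerThread));
        for (Future<?> f : futures)
            f.get();                   // wait for all workers
        pool.shutdown();
        return System.currentTimeMillis() - t0;
    }

    public static void main(String[] args) throws Exception {
        int iters = 5_000_000;
        // Double the thread count each round; ideally time per round
        // stays flat (perfect scaling) up to the core count.
        for (int t = 1; t <= Runtime.getRuntime().availableProcessors(); t *= 2)
            System.out.println(t + " threads: " + run(t, iters) + " ms");
    }
}
```

If this toy benchmark also stops scaling around the same core count, that would point at the allocator/GC rather than at anything Jython-specific.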
To me, problems with GC and multithreading are very real. I’m not saying the JVM is bad; it’s just that the problem itself is very hard to deal with.
In one of our projects, we had two applications running in a single JVM (an app server). When stressed individually, each performed fine, but when both were stressed together, performance degraded in strange ways. We finally split the apps into two JVMs, and performance went back to normal (slower than when only one app was in use, of course, but reasonable).
Tuning the GC is extremely hard. Things can look fine for 5 minutes, and then a major collection blocks everything, and so on. You need to decide whether you want high throughput or low latency: high throughput is fine for batch processing, while low latency is necessary for interactive applications. Ultimately, the default JVM parameters were the ones that gave us the best results!
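For what it’s worth, a starting point for that kind of investigation on a Sun/HotSpot VM is to turn on GC logging and pick a collector explicitly before touching anything finer-grained (the jar name below is a placeholder; only the flags are real HotSpot options):

```shell
# Log every collection with timestamps so you can see how much wall-clock
# time goes to GC and whether major collections coincide with the stalls.
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -XX:+UseParallelGC \          # throughput-oriented collector
     -jar myapp.jar                # placeholder for your application

# Low-latency alternative of that era: the concurrent mark-sweep collector.
java -XX:+UseConcMarkSweepGC -verbose:gc -jar myapp.jar

# Hints (not guarantees) for the tuning trade-off mentioned above:
#   -XX:MaxGCPauseMillis=<n>   target max pause (latency)
#   -XX:GCTimeRatio=<n>        target app-time : GC-time ratio (throughput)
```

If the GC log shows collections eating a large fraction of wall-clock time as you add threads, that supports the GC-bottleneck hypothesis; if not, look elsewhere.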
That’s not really an answer, more a report from experience, but yes, in my view GC and multithreading can be an issue.