I’m experiencing a very odd problem with a Java application running under Tomcat.
We tried to update production with fresh code produced in a one-week sprint. The application had been running for months without hiccups, but this new code makes our Linux servers start swapping after some time.
The very strange thing is that, according to VisualVM, memory usage never exceeds the maximum heap size and the JVM does not throw an OutOfMemoryError. The machine simply starts swapping, and the JVM keeps running even after that.
So it seems that memory is leaking from somewhere, apparently because of the new code, but oddly not from inside the JVM heap. Any ideas on how to debug that?
Thanks!
Swapping is not a conclusive indicator of a leak. It results from low physical memory. Use vmstat on Linux to watch swap usage. Try a different machine and experiment with configurations: swap size, physical memory size, address space.
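If you would rather log the swap figures from inside the application, here is a minimal sketch that reads them from /proc/meminfo. It assumes a Linux host; the class and method names (SwapCheck, parseMeminfo, swapUsedKb) are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports swap usage from /proc/meminfo (Linux only).
class SwapCheck {

    // Parses lines such as "SwapTotal:  2097148 kB" into a name -> kB map.
    static Map<String, Long> parseMeminfo(List<String> lines) {
        Map<String, Long> values = new HashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "Name: value" line
            }
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            try {
                values.put(line.substring(0, colon).trim(), Long.parseLong(parts[0]));
            } catch (NumberFormatException e) {
                // Skip lines whose value is not a plain number.
            }
        }
        return values;
    }

    // Swap currently in use, in kB; 0 if the fields are missing.
    static long swapUsedKb(Map<String, Long> meminfo) {
        long total = meminfo.getOrDefault("SwapTotal", 0L);
        long free = meminfo.getOrDefault("SwapFree", 0L);
        return total - free;
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo");
        if (!Files.isReadable(meminfo)) {
            System.out.println("/proc/meminfo not available (not Linux?)");
            return;
        }
        Map<String, Long> values = parseMeminfo(Files.readAllLines(meminfo));
        System.out.println("Swap used kB: " + swapUsedKb(values));
    }
}
```

Logging this number periodically next to the heap usage should show whether swap grows while the heap stays flat, which is the pattern described in the question.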
If you are confident that the problem is in your program, try this:
1. Estimate the median and peak memory that your program should use. You must be able to account for all deviations from these metrics. If you cannot, proceed to step 3.
2. Assuming you did step 1 correctly and were able to account for all deviations, you can rule out the leak (sorry about such vague suggestions, but debugging is only as good as the detective). You should now focus on GC tuning. First, enable GC logging. See if your heap is actually full and where the GC is spending most of its time collecting. This may be a good starting point for optimizations. See if adjusting GC options helps. Try experimenting with collection algorithms, max/min heap sizes, generation ratios, etc. Only experiment when you have ruled out a leak (step 1).
3. Assuming you did step 1 correctly and were not able to account for all deviations, you can assume that you have a leak somewhere. Use a memory profiler to see which objects contribute most to the heap size growth. Leave the profiler running for an extended period of time: have your program handle some requests it routinely expects to get, and then leave it relatively isolated after that. If the memory level keeps on growing, you may have a leak somewhere. If not, then it is probably not a memory leak. Can you pinpoint the part of your program that may be creating the leaked objects? If yes, try sending several requests that only target that part of your program. Does it replicate the problem deterministically? If no, repeat step 3. If yes, use divide and conquer and reapply step 3 until you find the classes/methods that are the culprits. It can be a certain combination of multiple portions as well (meaning that individually they may look innocent, but together they may form a brilliant crime syndicate).
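To make "the memory level keeps on growing" a bit more concrete, here is a rough sketch that samples heap and non-heap usage through the standard MemoryMXBean and fits a straight line to the samples. The class name MemoryTrend, the slope helper, the sample count and the interval are all assumptions chosen for illustration; a profiler will tell you far more.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: samples JVM memory usage and reports its trend.
class MemoryTrend {

    // Least-squares slope of equally spaced samples (units per sample).
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0; // a trend needs at least two points
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (samples[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        int count = 10;         // assumed sample count
        long intervalMs = 1000; // assumed sampling interval
        long[] heap = new long[count];
        long[] nonHeap = new long[count];
        for (int i = 0; i < count; i++) {
            heap[i] = memory.getHeapMemoryUsage().getUsed();
            nonHeap[i] = memory.getNonHeapMemoryUsage().getUsed();
            Thread.sleep(intervalMs);
        }
        System.out.println("Heap bytes per sample:     " + slope(heap));
        System.out.println("Non-heap bytes per sample: " + slope(nonHeap));
    }
}
```

Heap usage normally rises and falls with each collection, so only a slope that stays positive over a long, quiet period is suspicious. Memory allocated natively outside these JVM-managed pools will not show up here, so for the situation in the question it is worth comparing these figures with the process's resident size as reported by the operating system.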
Hope this helps. If not, please leave a comment on my post.
All the very best on your exercise!