I have a C++ library that I call from Java via JNI. A bug in the C++ code occasionally crashes the JVM. To recover from such crashes, I have wrapped the Java program in a shell script that relaunches java whenever it exits. Most of the time this works, but once in a while the JVM crashes without exiting: it prints a native stack trace to stderr, a Java debugger can no longer attach to it, and it stops consuming any appreciable amount of CPU time, yet the process stays alive, so it doesn’t get relaunched until I kill it by hand. Why might this happen, and what can I do to prevent it?
I’m running under Linux. After the crash, the JVM doesn’t respond to SIGTERM, only to SIGKILL. When I attach to the JVM process with a native debugger, I see that all of the threads are blocked in __kernel_vsyscall, that is, each one is waiting inside a system call.
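To illustrate the SIGTERM/SIGKILL behaviour described above, here is a minimal sketch of how a supervisor written in Java could escalate from a polite termination request to a forcible kill. It is not what my wrapper script currently does. The class name ForceStop and its stop method are hypothetical, and the sketch assumes OpenJDK on Linux, where Process.destroy() sends SIGTERM and Process.destroyForcibly() sends SIGKILL. Deciding when the child JVM is actually hung (for example via a heartbeat) is left out.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: ask a child process to exit, then force it. */
final class ForceStop {
    private ForceStop() {}

    /**
     * Sends the polite termination request first, waits up to the grace
     * period, then escalates to a forcible kill. Returns true once the
     * process has exited, false if it is still alive after both attempts.
     * Assumption: OpenJDK on Linux, where destroy() maps to SIGTERM and
     * destroyForcibly() maps to SIGKILL.
     */
    static boolean stop(Process child, long grace, TimeUnit unit)
            throws InterruptedException {
        child.destroy();                      // SIGTERM under the assumption above
        if (child.waitFor(grace, unit)) {
            return true;                      // the child exited after the polite request
        }
        child.destroyForcibly();              // SIGKILL, which a process cannot ignore
        return child.waitFor(grace, unit);    // wait for the kernel to reap the child
    }
}
```

A supervisor would call something like ForceStop.stop(child, 5, TimeUnit.SECONDS) once it has decided the child is hung, and then relaunch it.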
FWIW, I eventually traced these deadlocks to a bug in glibc’s malloc. The bug has been known for years, and apparently there are no plans to fix it. 🙁