I have a long-running process where, due to a bug, a trivial/expendable thread is deadlocked with a thread which I would like to continue, so that it can perform some final reporting that would be hard to reproduce in another way.
Of course, fixing the bug for future runs is the proper ultimate resolution. Of course, any such forced interrupt/kill/stop of any thread is inherently unsafe and likely to cause other unpredictable inconsistencies. (I’m familiar with all the standard warnings and the reasons for them.)
But still, since the only alternative is to kill the JVM process and go through a more lengthy procedure which would result in a less-complete final report, messy/deprecated/dangerous/risky/one-time techniques are exactly what I’d like to try.
The JVM is Sun’s 1.6.0_16 64-bit on Ubuntu, and the expendable thread is waiting-to-lock an object monitor.
Can an OS signal directed to an exact thread create an InterruptedException in the expendable thread?
Could attaching with gdb, and directly tampering with JVM data or calling JVM procedures allow a forced-release of the object monitor held by the expendable thread?
Would a Thread.interrupt() from another thread generate an InterruptedException from the waiting-to-lock frame? (With some effort, I can inject an arbitrary beanshell script into the running system.)
Can the deprecated Thread.stop() be sent via JMX or any other remote-injection method?
Any ideas appreciated, the more ‘dangerous’, the better! And, if your suggestion has worked in personal experience in a similar situation, the best!
No.
In theory Yes. In practice, you would need a deep understanding of the internals of the JVM to have any chance of succeeding. So, realistically No.
In theory Yes. In practice the beanshell script would need to find the Thread object for the thread to be interrupted. That may involve traversing the tree of ThreadGroup objects, etc. Another issue is whether the interrupted thread is going to behave properly. For example, a lot of folks write their wait/notify code to catch / ignore InterruptedException and retry. If you’ve done that, the interrupt probably won’t do any good.

If you can call Thread.interrupt() you can use the same approach to call Thread.stop(). Normally, I’d say don’t do it. But in this situation it might be worth a try.

But the real lesson from all of this is that an application that can take days or weeks to produce an answer ought to implement a checkpoint / resume mechanism to deal with this kind of eventuality, and things like power failure, hardware failure, machine reboot, etc.
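
For the part about finding the Thread object, here is a minimal sketch of the ThreadGroup traversal, written in plain Java 6 syntax so that it should be adaptable to a beanshell script. The class name ThreadFinder, the helpers rootGroup and findByName, and the thread name "expendable-thread" are all made up for illustration; substitute whatever name your thread dump shows. The main method also demonstrates one caveat: for a thread blocked trying to enter a synchronized monitor, interrupt() only sets the interrupt flag, and no InterruptedException is thrown until the thread reaches an interruptible call such as wait() or sleep().

```java
final class ThreadFinder {

    /** Walks up from the current thread's group to the root ThreadGroup. */
    static ThreadGroup rootGroup() {
        ThreadGroup group = Thread.currentThread().getThreadGroup();
        while (group.getParent() != null) {
            group = group.getParent();
        }
        return group;
    }

    /** Returns the first live thread with the given name, or null if none is found. */
    static Thread findByName(String name) {
        ThreadGroup root = rootGroup();
        Thread[] buffer = new Thread[Math.max(16, root.activeCount() * 2)];
        int count = root.enumerate(buffer, true); // true = include subgroups
        // enumerate() silently drops threads that do not fit, so grow and retry.
        while (count == buffer.length) {
            buffer = new Thread[buffer.length * 2];
            count = root.enumerate(buffer, true);
        }
        for (int i = 0; i < count; i++) {
            if (name.equals(buffer[i].getName())) {
                return buffer[i];
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        final Object lock = new Object();
        // Hypothetical stand-in for the stuck thread: it blocks on a monitor we hold.
        Thread victim = new Thread(new Runnable() {
            public void run() {
                synchronized (lock) { /* never reached while main holds the lock */ }
            }
        }, "expendable-thread");
        victim.setDaemon(true);

        synchronized (lock) {
            victim.start();
            while (victim.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            Thread found = findByName("expendable-thread");
            found.interrupt();
            Thread.sleep(100);
            // The thread is still waiting for the monitor; only its flag changed.
            System.out.println(found.getState() + " interrupted=" + found.isInterrupted());
            // prints "BLOCKED interrupted=true"
        }
    }
}
```

Once you hold the Thread reference, calling stop() on it is the same kind of one-liner as interrupt(). It is left out of the sketch because recent JDKs (20 and later) throw UnsupportedOperationException from Thread.stop(), whereas on the 1.6 JVM in the question the method is still functional.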