I have a very hard-to-reproduce condition where a Perl process hangs, and I am not sure where it is hung. ps ax | grep <process name> shows SN in the STAT column, which I understand means it is sleeping and running at nice priority.
I looked over the script (there is a ton of code in there) but cannot see any particular sleep that lasts more than a few seconds (this process has been sleeping for more than a day).
I can’t restart and add logging to the Perl scripts because the condition may not reproduce. I can try strace, but I am wondering if there is a better mechanism.
One possible way is to use gdb.
First of all, you need debugging symbols for your perl interpreter. For example, on my Debian system I had to install the perl-debug package for this. After the installation we have /usr/lib/debug/usr/bin/perl, which we will pass to gdb later. Notice that the original, stuck Perl script was started using /usr/bin/perl, not the newly installed debugging version.
For the sake of this example, let’s run this Perl script:
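A minimal stand-in for such a script, assuming all it has to do is print its pid and then block in sleep(), could be created like this. The file contents are hypothetical, arranged so that the first sleep lands on line 9 of test.pl.

```shell
# Write a hypothetical stand-in for test.pl into the current directory.
# The quoted 'EOF' keeps the shell from expanding $$ and $| inside the script.
cat > test.pl <<'EOF'
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;                      # flush output immediately
print "pid: $$\n";           # this pid is what gdb will attach to

print "first sleep\n";
sleep 100000;                # line 9: the process blocks here
print "second sleep\n";
sleep 100000;
EOF

# Start it in a separate terminal with: /usr/bin/perl test.pl
```

The long sleep values are only there to keep the process parked long enough to attach a debugger.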
When we run it, we get an output like:
Now let’s start up gdb. Use the pid printed by test.pl running right now. We get a prompt after some initial info (“Reading symbols from …”):
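As a rough sketch, attaching could look like the following. TARGET_PID is a hypothetical variable name for the pid printed by the script, and the path is the Debian debug binary mentioned above; on other systems it will probably differ.

```shell
# Assumption: the debug build of perl lives at the Debian path from the text.
PERL_DEBUG_BIN=/usr/lib/debug/usr/bin/perl

# Assumption: TARGET_PID is set by the caller to the pid of the stuck process.
TARGET_PID="${TARGET_PID:-}"

if [ -n "$TARGET_PID" ] && command -v gdb >/dev/null 2>&1; then
    # "gdb PROGRAM PID" loads symbols from PROGRAM and attaches to PID.
    gdb "$PERL_DEBUG_BIN" "$TARGET_PID"
else
    echo "set TARGET_PID to the pid of the stuck perl process (gdb must be installed)"
fi
```

Attaching with gdb -p "$TARGET_PID" also works, but passing the debug binary explicitly makes sure the symbols come from the perl-debug package.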
Meanwhile, because gdb is now attached to the perl interpreter, perl gets stopped:
Now, let’s get back to gdb for a backtrace:
As expected, perl happened to be stopped in the middle of a sleep(). But which one?
Now we need to figure out where to look for perl’s internal info on the currently executing (Perl) source file and line. Originally I found some relevant info in the documentation of mod_perl. Look for the curinfo macro in there.
As we can see, we are on line 9 in test.pl, as expected based on the script’s output.
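For reference, the whole lookup can be scripted roughly as follows. This is only a sketch under several assumptions: the field names cop_file and cop_line are what I would expect in a threaded perl of this vintage (a non-threaded build exposes PL_curcop instead of my_perl->Icurcop), the frame number is a guess that has to be adjusted from the bt output so that my_perl is in scope, and perl-where.gdb and TARGET_PID are hypothetical names.

```shell
# Write a hypothetical gdb command file.
cat > perl-where.gdb <<'EOF'
bt
# Assumption: frame 2 is inside a Perl_* function; pick the right one from bt.
frame 2
# Assumption: threaded perl, so the current op is reachable via my_perl.
print my_perl->Icurcop->cop_file
print my_perl->Icurcop->cop_line
EOF

# Assumption: TARGET_PID is set by the caller to the pid of the stuck process.
TARGET_PID="${TARGET_PID:-}"

if [ -n "$TARGET_PID" ] && command -v gdb >/dev/null 2>&1; then
    # -batch runs the commands from the file and then exits.
    gdb -batch -x perl-where.gdb /usr/lib/debug/usr/bin/perl "$TARGET_PID"
fi
```

If the frame command fails, gdb stops processing the file, so run bt interactively first and put the right frame number in.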
The linked documentation mentions a few differences about threaded/non-threaded perl binaries (the example above is for threaded perl, v5.14.2). It also looks a bit outdated, because it talks about my_perl->Tcurcop, while I found what I was looking for under the name of my_perl->Icurcop. At the moment, I’m not familiar enough with the internals of perl to tell why this got renamed.
I hope this helps.