I have a process and want to start it again when it is killed. To achieve this, I start a child "guardian" process that calls prctl(PR_SET_PDEATHSIG, SIGHUP) to be notified when its parent dies, and then starts the parent again.
Here is the code of the guardian (logging omitted):
void restart (int signal) {
    if (getppid() == 1) {
        if (fork() == 0) {
            execl("./process", 0);
        }
        exit(1);
    }
}
int main() {
    prctl(PR_SET_PDEATHSIG, SIGHUP, NULL, NULL, NULL);
    struct sigaction new_action, old_action;
    new_action.sa_handler = restart;
    sigemptyset (&new_action.sa_mask);
    new_action.sa_flags = 0;
    sigaction (SIGHUP, NULL, &old_action);
    if (old_action.sa_handler != SIG_IGN) {
        sigaction (SIGHUP, &new_action, NULL);
    }
    while (getppid() != 1) {
        sleep(86400000);
    }
    return 0;
}
And the parent:
int main() {
    if (fork() == 0) {
        execl("./guardian", 0);
    }
    while (1) {
        cout << "I am process\n";
        sleep(1);
    }
    return 0;
}
The problem is that it works only once. Here is the ps output when the process is started for the first time:
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
1012     13058  0.0  0.3 20244 1932 pts/1 Ss   08:22 0:00 -sh
1012     22084  0.0  0.1 11484 1004 pts/1 S+   11:20 0:00  \_ ./process
1012     22085  0.0  0.1 11484 1000 pts/1 S+   11:20 0:00      \_ [guardian]
1012     12510  0.0  0.3 20784 1712 pts/0 Ss   08:14 0:00 -sh
1012     22088  0.0  0.1 17412 1012 pts/0 R+   11:20 0:00  \_ ps fu
This looks good. Next, I kill the process with kill -9 22084. Here is the ps output again:
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
1012     13058  0.0  0.3 20244 1932 pts/1 Ss+  08:22 0:00 -sh
1012     12510  0.0  0.3 20784 1712 pts/0 Ss   08:14 0:00 -sh
1012     22091  0.0  0.1 17412 1012 pts/0 R+   11:21 0:00  \_ ps fu
1012     22089  0.0  0.1 11484  996 pts/1 S    11:20 0:00 [process]
1012     22090  0.0  0.1 11484  996 pts/1 S    11:20 0:00  \_ [guardian]
When I kill the process again with kill -9 22089, the guardian does not seem to get the SIGHUP callback (I checked the logs, which are omitted here):
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
1012     13058  0.0  0.3 20244 1932 pts/1 Ss+  08:22 0:00 -sh
1012     12510  0.0  0.3 20784 1712 pts/0 Rs   08:14 0:00 -sh
1012     22339  0.0  0.1 17412 1008 pts/0 R+   11:27 0:00  \_ ps fu
1012     22090  0.0  0.1 11484  996 pts/1 S    11:20 0:00 [guardian]
My question is: why does the guardian not get SIGHUP?
I suspect it might have something to do with the background process group: when the process is restarted, it is in a background group (compare S+ and S in the ps STAT column).
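It looks like SIGHUP is blocked while you are in the signal handler handling SIGHUP (the handler is installed with sa_flags = 0, so the signal being handled stays blocked until the handler returns). The signal mask is inherited across fork() and preserved across exec(), hence the restarted process starts with SIGHUP blocked, its guardian inherits that mask, and your second guardian never receives the signal again.
Unblock SIGHUP in the signal handler, after fork() and before exec()-ing the parent.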
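A minimal sketch of that fix, assuming the guardian keeps the same restart() logic as in the question. The helper names unblock_signal() and signal_is_blocked() are made up for illustration, and two details differ from the original on purpose: execl() is given an explicit argv[0] and a NULL terminator, and _exit() replaces exit() because it is async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if signo is currently blocked,
   0 if it is not, -1 on error. Handy for logging/debugging. */
int signal_is_blocked(int signo) {
    sigset_t current;
    if (sigprocmask(SIG_SETMASK, NULL, &current) != 0)
        return -1;
    return sigismember(&current, signo);
}

/* Hypothetical helper: removes signo from the blocked mask.
   Returns 0 on success, -1 on error. */
int unblock_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/* SIGHUP handler for the guardian, same shape as restart() above. */
void restart(int signo) {
    if (getppid() == 1) {
        if (fork() == 0) {
            /* We are still inside the SIGHUP handler, so the child
               inherited a mask with SIGHUP blocked. Clear it before
               exec so the restarted process, and the guardian it
               forks, can receive SIGHUP again. */
            unblock_signal(signo);
            execl("./process", "./process", (char *)NULL);
            _exit(127); /* reached only if execl fails */
        }
        _exit(1);
    }
}
```

An alternative with the same effect would be to have the restarted process (or the guardian) unblock SIGHUP itself at startup, which also covers the case where it is launched from some other context that has the signal blocked.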