Disclaimer
I am well aware that PHP might not have been the best choice in this case for a socket server. Please refrain from suggesting
different languages/platforms – believe me – I’ve heard it from all
directions.
Working in a Unix environment with PHP 5.2.17, my situation is as follows: I have built a socket server in PHP that communicates with Flash clients. My first hurdle was that each incoming connection blocked subsequent connections until it had finished being processed. I solved this with PHP’s pcntl_fork(). I was able to spawn numerous child processes (saving their PIDs in the parent) that took care of broadcasting messages to the other clients, thereby “releasing” the parent process and allowing it to move on to the next connection(s).
My main issue right now is collecting these dead/zombie child processes and getting rid of them. I have read (over and over) the relevant PHP manual pages for pcntl_fork() and realize that the parent process is in charge of cleaning up its children. The parent process receives a signal (SIGCHLD) when a child executes exit(0). I am able to “catch” that signal by using the pcntl_signal() function to set up a signal handler.
My signal_handler looks like this :
declare(ticks = 1);
function sig_handler($signo){
    global $forks; // this is an array that holds all the child PID's
    foreach($forks AS $key=>$childPid){
        echo "has my child {$childPid} gone away?".PHP_EOL;
        if (posix_kill($childPid, 9)){
            echo "Child {$childPid} has tragically died!".PHP_EOL;
            unset($forks[$key]);
        }
    }
}
I am indeed seeing both echoes, including the correct child PID that needs to be removed, but it seems that posix_kill($childPid, 9), which I understand to be synonymous with kill -9 $childPid, is returning TRUE although it is in fact NOT removing the process…
Taken from the man pages of posix_kill :
Returns TRUE on success or FALSE on failure.
I am monitoring the child processes with the ps command. They appear like this on the system :
web5 5296 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5321 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5466 5234 0 14:52 ? 00:00:00 [php] <defunct>
As you can see, all of these are child processes of the parent, which has the PID 5234.
Am I missing something in my understanding? I seem to have managed to get everything to work (and it does) but I am left with countless zombie processes on the system!
My plans for a zombie apocalypse are rock solid –
but what on earth can I do when even sudo kill -9 does not kill the zombie child processes?
Update 10 Days later
I’ve answered this question myself after some additional research. If you are still able to stand my ramblings, proceed at will.
Alright… so here we are, 10 days later and I believe that I have solved this issue. I didn’t want to add onto an already longish post so I’ll include in this answer some of the things that I tried.
Taking @sym’s advice, and reading more into the documentation and the comments on the documentation, the pcntl_waitpid() description states :
So I set up my pcntl_signal() handler like this:
For completion, I’ll include the actual code I’m using for forking a child process:
Yea… That’s a ratio of 1:1 comments to code 😛
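For anyone following along, a handler of this kind can be sketched roughly as below. This is a minimal illustration, not the exact code from the server: the function name reap_children and the $reaped counter are made up for the example, the three children that exit immediately stand in for real worker processes, and it assumes the pcntl extension is available on the CLI. The handler calls pcntl_waitpid() with WNOHANG in a loop, because several children may exit before the handler gets a chance to run.

```php
<?php
declare(ticks = 1); // lets PHP run pending signal handlers between statements

$reaped = 0; // hypothetical counter: how many children have been collected so far

// Hypothetical handler: collect every child that has already exited.
function reap_children($signo)
{
    global $reaped;
    // -1 means "any child"; WNOHANG returns 0 instead of blocking
    // when no exited child is waiting to be collected.
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        $reaped++;
        echo "Reaped child {$pid}" . PHP_EOL;
    }
}

pcntl_signal(SIGCHLD, 'reap_children');

$total = 3; // arbitrary number of children for the demonstration
for ($i = 0; $i < $total; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die('fork failed');
    }
    if ($pid === 0) {
        exit(0); // child: real work would go here before exiting
    }
}

// Stand-in for the server loop: keep going until every child is collected.
while ($reaped < $total) {
    usleep(100000);
}
```

Counting reaped children, rather than removing PIDs from a shared array inside the handler, sidesteps a race where a child exits before the parent has recorded its PID.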
So this was looking great and I saw the echo of :
However, when the socket server loop did its next iteration, the socket_select() function failed, throwing this error :
The server would now hang and not respond to any requests other than manual kill commands from a root terminal.
I’m not going to get into why this was happening or what I did after that to debug it… let’s just say it was a frustrating week…
much coffee, sore eyes and 10 days later…
Drum roll please
TL;DR – The Solution :
Mentioned here in a comment from 2007 in the PHP sockets documentation and in this tutorial on stuporglue (search for "good parenting"), one can simply "ignore" signals coming in from the child processes (SIGCHLD) by passing SIG_IGN to the pcntl_signal() function:
Quoting from that linked blog post :
Believe it or not, I included that pcntl_signal() line, deleted all the other handlers and everything else dealing with the children, and it worked! There were no more <defunct> processes left hanging around!
In my case, I really wasn’t interested in knowing exactly when a child process died, or which one it was. All I cared about was that they didn’t hang around and crash my entire server 😛
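As a rough sketch of that approach (not the server code itself), the snippet below sets SIGCHLD to SIG_IGN before forking and then checks that nothing is left to collect. The three short-lived children, the $children array and the one-second pause are assumptions made purely for the demonstration, and the behaviour shown is what I would expect on Linux with the pcntl extension on the CLI.

```php
<?php
// Tell the kernel the parent does not care about child exit notifications,
// so finished children are cleaned up without the parent calling wait.
pcntl_signal(SIGCHLD, SIG_IGN);

$children = array(); // hypothetical list of PIDs, kept only for the demonstration
for ($i = 0; $i < 3; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die('fork failed');
    }
    if ($pid === 0) {
        exit(0); // child: real work would go here before exiting
    }
    $children[] = $pid;
}

sleep(1); // give the children time to finish

// With SIGCHLD ignored there should be no exited children left to collect,
// so pcntl_waitpid() is expected to report an error (-1) rather than a PID.
$result = pcntl_waitpid(-1, $status, WNOHANG);
echo "pcntl_waitpid() returned {$result}" . PHP_EOL;
```

The trade-off is that the parent can no longer read a child's exit status, which is fine when, as here, all that matters is that the children do not linger. Running ps after the pause should show no <defunct> entries for the PIDs in $children.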