I have a long-running Python program that starts and stops a Postgres server as part of its operation. I stop the server by using subprocess to spawn pg_ctl -m fast. As a fall-back, I check the return code and, if it failed, I then run pg_ctl -m immediate.
The problem is that sometimes both fail. I haven’t been able to reproduce this myself, but it happens with some frequency for users of my program. I log stdout/stderr from the pg_ctl calls, but don’t get any useful info there. As far as I can tell, either the master process or its children have stopped responding to SIGQUIT, and the only way to terminate them is with SIGKILL, which pg_ctl does not use.
I’ve basically exhausted ideas on the Postgres side. I’m using Postgres 8.3, so I’m sure upgrading to a more recent version would resolve this, but unfortunately that is not an option for me. The only solution I can come up with is to kill the children manually. But I don’t know how to distinguish between the children spawned by my pg_ctl start and other postgres processes that might be running on the machine.
Is there a way to identify a process as a child of another process that I spawned? A cross-platform method of doing this from Python would be ideal, but I’m willing to write a C extension if there exist APIs on Windows/Linux/UNIX to do this.
Here is a simple shell script solution:
This prints out the PIDs of the child processes.
I’m not aware of a standard Python module that can get that information.
pgrep gets the information from /proc, so you could reimplement that in Python, but I doubt that it would be worth it.
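For what it's worth, a rough sketch of the /proc approach in Python could look like the following. It is Linux-only, since it assumes /proc/<pid>/stat exists in the usual layout, and the names read_ppid and child_pids are made up for illustration. It finds direct children only, not grandchildren.

```python
import os

def read_ppid(pid):
    """Return the parent PID of `pid` from /proc/<pid>/stat, or None if unreadable."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            data = f.read()
    except (IOError, OSError):
        # The process may have exited, or /proc may not exist on this platform.
        return None
    # Layout is "pid (comm) state ppid ...". The command name may itself
    # contain spaces or parentheses, so split after the LAST ')'.
    fields = data[data.rfind(")") + 2:].split()
    return int(fields[1])

def child_pids(parent_pid):
    """Return a sorted list of PIDs whose parent is `parent_pid` (direct children only)."""
    children = []
    for entry in os.listdir("/proc"):
        # Only the all-digit directory names under /proc are processes.
        if entry.isdigit() and read_ppid(int(entry)) == parent_pid:
            children.append(int(entry))
    return sorted(children)
```

You would call child_pids() with the PID of the postmaster you started, then signal each returned PID with os.kill(). Keep in mind that PIDs can be reused, so there is a small race between listing the children and killing them.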