I have a driver script that manages a job string and runs its jobs in parallel or sequentially, depending on a dependency graph. For example:
Job   Predecessors
A     null
B     A
C     A
D     B
E     D, C
F     E
The driver starts A in the background and waits for it to complete by suspending itself using the bash built-in suspend. On completion, job A sends a SIGCONT to the driver, which then starts B and C in the background and suspends itself again, and so on.
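For comparison, the same dependency-driven scheduling can be sketched with the wait built-in instead of suspend and SIGCONT. This is only a minimal sketch under stated assumptions: bash 4 or later (for associative arrays), the example graph above hard-coded in a hypothetical preds array, and a hypothetical run_job function standing in for the real jobs. The names preds, order, pid, rc, await, and LOG are illustrative and not part of the driver. It walks the jobs in a topological order and waits for each job's predecessors before starting it, so B and C still run in parallel, although a job may start somewhat later than strictly necessary.

```shell
#!/usr/bin/env bash
# Sketch: run a dependency graph with "wait" instead of suspend/SIGCONT.
# Assumes bash 4+; every name below is illustrative.

declare -A preds=( [A]="" [B]="A" [C]="A" [D]="B" [E]="D C" [F]="E" )
order=(A B C D E F)        # any topological order of the graph
declare -A pid rc          # job name -> PID, job name -> exit status
LOG=$(mktemp)

run_job() {                # hypothetical stand-in for a real job
    echo "start $1" >>"$LOG"
    sleep 1
    echo "end $1" >>"$LOG"
}

await() {                  # wait for a job once and cache its exit status
    local job=$1
    if [[ -z ${rc[$job]+set} ]]; then
        wait "${pid[$job]}"
        rc[$job]=$?
    fi
    return "${rc[$job]}"
}

for job in "${order[@]}"; do
    # Unquoted on purpose: split the predecessor list on whitespace.
    for p in ${preds[$job]}; do
        await "$p" || { echo "$p failed; not starting $job" >&2; exit 1; }
    done
    run_job "$job" &
    pid[$job]=$!
done

# Collect whatever is still running before exiting.
for job in "${order[@]}"; do
    await "$job"
done
cat "$LOG"
```

The log records when each job started and ended, which makes it easy to confirm that no job started before its predecessors finished.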
The driver calls set -m, so job control is enabled.
This works fine when the driver is itself started in the background. However, when the driver is invoked in the foreground, only the first call to suspend works as expected. The second call seems to turn into an 'exit', which reports "There are stopped jobs" and does not exit. The third call to suspend also turns into an 'exit' and kills the driver and all its children (as it should, considering this is the second converted call to 'exit').
And this is my question: Is this expected behavior? If so, why and how do I work around it?
Thanks.
Code fragments below:
Driver:
for step in $(hash_keys 'RUNNING_HASH')
do
    proc=$(hash_find 'RUNNING_HASH' $step)
    if [ $proc ]
    then
        # added the grep to ensure the process is found
        ps -p $proc | grep $proc > /dev/null 2>&1
        if [ $? -eq 0 ]
        then
            log_msg_to_stderr $SEV_DEBUG "proc $proc running: suspending execution"
            suspend
            # execution resumes here on receipt of SIGCONT
            log_msg_to_stderr $SEV_DEBUG "signal received: continuing execution"
            break
        fi
    fi
done
Job:
## $$ is the driver's PID
kill -SIGCONT $$
I have to think you are over-complicating things by playing with job control, suspend, and so on. Here is an example program which keeps 5 children running at all times. Once a second it looks to see if anyone went away (much more efficiently than ps|grep, BTW) and starts up a new child if necessary.
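A minimal sketch of that approach, under stated assumptions: the liveness check is assumed to be kill -0 (which sends no signal and only tests whether the PID can be signalled), worker is a hypothetical stand-in for a real child, and the loop is bounded by a total job count so the example terminates, whereas the program described above keeps going indefinitely. The names run_pool, POLL_INTERVAL, STARTED, and PEAK are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: keep at most "max" children running until "total" jobs have started.
# All names here are illustrative.

POLL_INTERVAL=1            # seconds between checks

worker() {
    sleep 1                # stand-in for real work
}

run_pool() {
    local max=$1 total=$2 pid
    local -a pids=() alive=()
    STARTED=0              # number of jobs launched so far
    PEAK=0                 # largest number of children alive at once
    while :; do
        # Drop children that went away. kill -0 sends no signal;
        # it only checks that the PID still exists.
        alive=()
        for pid in "${pids[@]}"; do
            if kill -0 "$pid" 2>/dev/null; then
                alive+=("$pid")
            else
                wait "$pid" 2>/dev/null    # collect the exit status
            fi
        done
        pids=("${alive[@]}")

        # Top the pool back up to "max" while jobs remain.
        while (( ${#pids[@]} < max && STARTED < total )); do
            worker &
            pids+=("$!")
            STARTED=$((STARTED + 1))
        done

        if (( ${#pids[@]} > PEAK )); then PEAK=${#pids[@]}; fi
        if (( ${#pids[@]} == 0 )); then break; fi
        sleep "$POLL_INTERVAL"
    done
}

run_pool 5 8
echo "started $STARTED jobs, at most $PEAK at once"
```

Because the driver only polls and never stops itself, it does not need set -m, suspend, or a SIGCONT from the children.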