I was wondering how, if possible, I can create a simple job management in BASH to process several commands in parallel. That is, I have a big list of commands to run, and I’d like to have two of them running at any given time.
I know quite a bit about bash, so here are the requirements that make it tricky:
- The commands have variable running time so I can’t just spawn 2, wait, and then continue with the next two. As soon as one command is done a next command must be run.
- The controlling process needs to know the exit code of each command so that it can keep a total of how many failed
I’m thinking somehow I can use trap but I don’t see an easy way to get the exit value of a child inside the handler.
So, any ideas on how this can be done?
Well, here is some proof of concept code that should probably work, but it breaks bash: invalid command lines generated, hanging, and sometimes a core dump.
# need monitor mode for trap CHLD to work
set -m
# store the PIDs of the children being watched
declare -a child_pids
function child_done
{
echo "Child $1 result = $2"
}
function check_pid
{
# check if running
kill -s 0 $1
if [ $? == 0 ]; then
child_pids=("${child_pids[@]}" "$1")
else
wait $1
ret=$?
child_done $1 $ret
fi
}
# check by copying pids, clearing list and then checking each, check_pid
# will add back to the list if it is still running
function check_done
{
to_check=("${child_pids[@]}")
child_pids=()
for ((i=0;$i<${#to_check};i++)); do
check_pid ${to_check[$i]}
done
}
function run_command
{
"$@" &
pid=$!
# check this pid now (this will add to the child_pids list if still running)
check_pid $pid
}
# run check on all pids anytime some child exits
trap 'check_done' CHLD
# test
for ((tl=0;tl<10;tl++)); do
run_command bash -c "echo FAIL; sleep 1; exit 1;"
run_command bash -c "echo OKAY;"
done
# wait for all children to be done
wait
Note that this isn’t what I ultimately want, but would be groundwork to getting what I want.
Followup: I’ve implemented a system to do this in Python. So anybody using Python for scripting can have the above functionality. Refer to shelljob
GNU Parallel is awesomesauce:
It will set the exit status to the number of commands that failed. If you have more than 253 commands, check out
--joblog. If you don’t know all the commands up front, check out--bg.