The following script runs subprocesses in parallel in bash. It is slightly changed from "Running a limited number of child processes in parallel in bash?"
#!/bin/bash
set -o monitor # enable job control: background processes run in separate process groups
N=1000
todo_array=($(seq 0 $((N-1))))
max_jobs=5
trap add_next_job CHLD
index=0
function add_next_job {
    if [[ $index -lt ${#todo_array[@]} ]]
    then
        do_job $index &
        index=$(($index+1))
    fi
}
function do_job {
    echo $1 start
    time=$(echo "scale=0;x=$RANDOM % 10;scale=5;x/20+0.05" | bc)
    sleep $time
    echo $time
    echo $1 done
}
while [[ $index -lt $max_jobs ]] && [[ $index -lt ${#todo_array[@]} ]]
do
    add_next_job
done
wait
The job chooses a random number in 0.05:0.05:0.50 and sleeps for that many seconds.
For example, with N=10, a sample output is
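As a quick check on the range of delays: the bc expression reduces $RANDOM modulo 10, so only ten delays are possible, from 0.05 to 0.50 seconds. The sketch below mirrors that arithmetic with bash integers instead of bc. The helper name delay_hundredths is made up for illustration and is not part of the script.

```bash
#!/bin/bash
# delay_hundredths X: delay in hundredths of a second for a raw value X
# (a stand-in for $RANDOM), mirroring (X % 10)/20 + 0.05 from do_job.
# Hypothetical helper, integer arithmetic only.
delay_hundredths() {
    echo $(( ($1 % 10) * 5 + 5 ))
}

# Enumerate every value the job can sleep for.
for x in 0 1 2 3 4 5 6 7 8 9; do
    printf '0.%02d\n' "$(delay_hundredths "$x")"
done
```

The loop prints ten values, 0.05 through 0.50 in steps of 0.05.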
1 start
4 start
3 start
2 start
0 start
.25000
2 done
5 start
.30000
3 done
6 start
.35000
0 done
7 start
.40000
1 done
8 start
.40000
4 done
9 start
.05000
7 done
.20000
5 done
.25000
9 done
.45000
6 done
.50000
8 done
which has 30 lines in total.
But for big N such as 1000, the result can be strange. One run gives 2996 lines of output instead of the expected 3000, with 998 "start" lines, 999 "done" lines, and 999 lines with a float number. 644 and 652 are missing from the "start" lines, and 644 is missing from the "done" lines.
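For anyone who wants to reproduce the counts, here is a rough sketch of how a captured log could be checked. It assumes the script's output was saved to a file; the function name check_log and the file name output.log are placeholders of mine, not something from the script.

```bash
#!/bin/bash
# check_log FILE N: count each kind of line in FILE and report every index
# in 0..N-1 that has no "start" or no "done" line.
# Hypothetical helper; FILE is assumed to hold the script's captured output.
check_log() {
    local log=$1 n=$2 i
    echo "start: $(grep -c ' start$' "$log")"
    echo "done: $(grep -c ' done$' "$log")"
    echo "float: $(grep -c '^[0-9]*\.[0-9][0-9]*$' "$log")"
    for ((i = 0; i < n; i++)); do
        grep -qx "$i start" "$log" || echo "missing start: $i"
        grep -qx "$i done" "$log" || echo "missing done: $i"
    done
}

# Usage (output.log is a placeholder name):
#   check_log output.log 1000
```

A complete run with N=1000 should report 1000 for each of the three counts and no "missing" lines.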
These tests were run on Arch Linux with bash 4.2.10(2). Similar results can be reproduced on Debian stable with bash 4.1.5(1).
EDIT: I tried parallel from moreutils and GNU parallel for this test. The moreutils parallel has the same problem, but GNU parallel works perfectly.
I think this is just due to all of the subprocesses inheriting the same file descriptor and trying to append to it in parallel. Very rarely two of the processes race and both start appending at the same location and one overwrites the other. This is essentially the reverse of what one of the comments suggests.
You could easily check this by redirecting through a pipe, such as with
your_script | tee file
because pipes have rules about atomicity of data delivered by single write() calls that are smaller than a particular size.
There’s another question on SO that’s similar to this (I think it just involved two threads both quickly writing numbers) where this is also explained, but I can’t find it.
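The pipe check can be sketched roughly like this. POSIX guarantees that a write() of at most PIPE_BUF bytes (at least 512) to a pipe is not interleaved with other writers, so short lines from many background jobs should all arrive intact. The function spawn_writers is a made-up stand-in for the real script, and I am assuming each short echo ends up as a single write(); treat this as an illustration of the check, not a diagnosis of the original problem.

```bash
#!/bin/bash
# spawn_writers N: start N background jobs that each echo one short line
# to the inherited stdout, then wait for all of them.
# Hypothetical stand-in for the real script.
spawn_writers() {
    local n=$1 i
    for ((i = 0; i < n; i++)); do
        echo "$i done" &
    done
    wait
}

# Send the combined output through a pipe and count the lines that arrive.
lines=$(spawn_writers 200 | wc -l | tr -d ' ')
echo "lines through pipe: $lines"
```

If the count through the pipe is always complete while the count with a plain file redirect sometimes comes up short, that would point at the shared file descriptor rather than at the job logic.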