I am currently trying to get a script to write the output of other commands it starts correctly into a log file. The script writes its own messages to the log file using echo, and there is a function to which I can pipe the lines from the other program.
The main problem is that the program producing the output is started in the background, so the function that does the read may write to the log file concurrently with the script itself. Could this be a problem? echo only ever writes a single line, so it should not be too hard to ensure atomicity. However, I have searched Google and found no way to make sure it actually is atomic.
Here is the current script:
LOG_FILE=/path/to/logfile
# Append one timestamped line to the log file
write_log() {
    echo "$(date +%Y%m%d%H%M%S);$1" >> "${LOG_FILE}"
}
# Read lines from stdin and log each one
write_output() {
    while IFS= read -r data; do
        write_log "Message from SUB process: [ $data ]"
    done
}
write_log "Script started"
# do some stuff
# Run the program in the background, piping its output into the log
call_complicated_program 2>&1 | write_output &
SUB_PID=$!
# do some more stuff
write_log "Script exiting"
wait "$SUB_PID"
As you can see, the script might write both on its own and on behalf of the redirected output. Could this cause havoc in the file?
echo is just a simple wrapper around write (this is a simplification; see the edit below for the gory details), so to determine whether echo is atomic, it's useful to look up write. The Single UNIX Specification describes a write as atomic when the whole amount written in one operation is not interleaved with data from any other process, and it calls the largest request size guaranteed to be written atomically PIPE_BUF. (Formally, the specification states this guarantee for pipes and FIFOs.)
You can check PIPE_BUF on your system with a simple C program that prints its value. On Mac OS X, such a program gives me 512 (the minimum allowed value for PIPE_BUF). On Linux, I get 4096. If you're just printing a single line of output that is not ridiculously long, it should be atomic, but if your lines are fairly long, make sure you check the value on the system in question.
Edit to add: I decided to check the implementation of echo in Bash, to confirm that it will print atomically. It turns out that echo uses putchar or printf, depending on whether you use the -e option. These are buffered stdio operations, which means that they fill up a buffer and actually write it out only when a newline is reached (in line-buffered mode), the buffer is filled (in block-buffered mode), or you explicitly flush the output with fflush. By default, a stream is line-buffered if it is an interactive terminal and block-buffered if it is any other file. Bash never sets the buffering type, so for your log file it should default to block-buffered mode. At the end of the echo builtin, Bash calls fflush to flush the output stream. Thus, the output will always be flushed at the end of echo, but may be flushed earlier if it doesn't fit into the buffer.
The size of the buffer used may be BUFSIZ, though it may be different; BUFSIZ is the default size if you set the buffer explicitly using setbuf, but there's no portable way to determine the actual size of your buffer. There are also no portable guidelines for what BUFSIZ is, but when I tested it on Mac OS X and Linux, it was twice the size of PIPE_BUF.
What does this all mean? Since the output of echo is all buffered, it won't actually call write until the buffer is filled or fflush is called. At that point, the output should be written, and the atomicity guarantee I mentioned above should apply. If the stdout buffer is larger than PIPE_BUF, then PIPE_BUF is the largest unit that is guaranteed to be written atomically. If PIPE_BUF is larger than the stdout buffer, then the stream writes the buffer out as soon as it fills up, so a line longer than the buffer is split across more than one write.
So, echo is only guaranteed to atomically write sequences shorter than the smaller of PIPE_BUF and the size of the stdout buffer, which is most likely BUFSIZ. On most systems, BUFSIZ is larger than PIPE_BUF.
tl;dr: echo will atomically output lines, as long as those lines are short enough. On modern systems, you're probably safe up to 512 bytes, but it's not possible to determine the limit portably.
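As a rough sketch of how the script could act on this, the limit can also be queried from the shell with getconf, and write_log can flag any line that exceeds it. This is an illustration under stated assumptions, not the C program mentioned above: it assumes `getconf PIPE_BUF /` is available (it falls back to the POSIX minimum of 512 otherwise), it treats PIPE_BUF as the limit for the appended log file even though the specification formally states it for pipes and FIFOs, and `log_line_fits` is a hypothetical helper name introduced here.

```shell
#!/bin/sh
# Log file path taken from the question; override via the environment.
LOG_FILE=${LOG_FILE:-/path/to/logfile}

# Ask the system for PIPE_BUF; fall back to the POSIX minimum (512)
# if getconf is missing or prints something that is not a number.
pipe_buf=$(getconf PIPE_BUF / 2>/dev/null)
case "$pipe_buf" in
    ''|*[!0-9]*) pipe_buf=512 ;;
esac

# Hypothetical helper: succeeds only if the line plus its trailing
# newline fits within pipe_buf bytes (wc -c counts bytes, not characters).
log_line_fits() {
    len=$(( $(printf '%s' "$1" | wc -c) ))
    [ "$((len + 1))" -le "$pipe_buf" ]
}

# Variant of write_log that warns on stderr when a line is longer than
# the assumed atomic limit. The line is still written, but it may then
# interleave with output from the background writer.
write_log() {
    line="$(date +%Y%m%d%H%M%S);$1"
    if ! log_line_fits "$line"; then
        printf 'warning: log line exceeds %s bytes\n' "$pipe_buf" >&2
    fi
    printf '%s\n' "$line" >> "$LOG_FILE"
}
```

Calling `write_log "Script started"` then behaves like the original function for short lines; only the over-length case adds a warning. Whether to warn, truncate, or split such lines is a design choice the sketch leaves open.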