I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:
CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
printf "Checking $CRON_LOG for errors\n"
else
printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
printf "%74s[${txtred}FAIL${txtrst}]\n"
printf "Critical errors detected! Outputting to console...\n"
echo $CRIT_ERRS
fi
So this bit of code works fine, but I’m trying to clean up my scripts now and implement set -e at the top of all of them. When I do it to this script, it exits with error code 1. Note that I have errors from the first statement dumping to /dev/null. This is because some days the file has the word “true” and other days “false” in it. Anyway, I don’t think this is my problem because the script outputs “Checking xxxxx.log for errors.” before exiting when I add set -e to the top.
Note: the $CRON_DATE variable is derived from user input. I can run the exact same statement from the command line “$ ./checkcron.sh 01/06/2010” and it works fine without the set -e statement at the top of the script.
UPDATE: I added “set -x” to my script and narrowed the problem down. The last bit of output is:
Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=
[1]+ Exit 1 ./checkLoad.sh...
So it looks like the problem is occurring on this line:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
Any help is appreciated. 🙂
Thanks,
Ryan
Redirecting error messages to /dev/null does nothing about the exit status returned by the command. The reason your ls command isn’t causing the error is that it’s part of a pipeline, and the exit status of a pipeline is the return value of the last command in it (unless pipefail is enabled).
Given your update, it looks like the command that’s failing is the last grep in the pipeline. grep only returns 0 if it finds a match (with -v, if at least one line is left to print); otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e; things can fail even when you don’t expect them to, because commands like grep return non-zero status even if there hasn’t been an error. It also fails to exit on errors earlier in a pipeline, and so may miss some errors.
The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.
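As a rough illustration of the "|| :" approach, here is a minimal sketch. The function name critical_errors and the sample log contents are made up for the example and are not part of the original script; "|| true" is used here, which behaves the same as "|| :".

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper (not from the original script): print ERROR lines
# from the file in $1, dropping the known-benign "Duplicate tracking code" ones.
# When nothing is left to print, the last grep exits with status 1; the
# trailing "|| true" turns that into success so set -e does not abort.
critical_errors() {
    grep "ERROR" "$1" | grep -v "Duplicate tracking code" || true
}

# Made-up sample log whose only ERROR line is the benign one.
log=$(mktemp)
printf 'INFO start\nERROR Duplicate tracking code 42\n' > "$log"

CRIT_ERRS=$(critical_errors "$log")
if [ -z "$CRIT_ERRS" ]; then
    echo "PASS"
else
    echo "FAIL"
    printf '%s\n' "$CRIT_ERRS"
fi
rm -f "$log"
```

One trade-off to keep in mind: "|| true" also hides a genuine grep failure (status 2, for example an unreadable file), so it is worth keeping the earlier [ -f "$CRON_LOG" ] check in front of it.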