A few days ago, I had a problem with a script that deals with a multitude of files on our EMC mass filers. Here is my working code now:
unset xP_Array
declare -a xP_Array
...
export LOG=$HOME/BIN/somelogfile
export OUT=/path/to/device
...
echo "`date '+%m/%d/%y %T:'` START -- MEM" >> $LOG
echo "`date '+%m/%d/%y %T:'` Go to work directory." >> $LOG
cd ${OUT}
echo "`date '+%m/%d/%y %T:'` Fill the array." >> $LOG
for f in "$OUT"/*XML; do
xP_Array+=( "${f#$OUT/}" )
done
echo "`date '+%m/%d/%y %T:'` Get array length." >> $LOG
Plen=${#xP_Array[@]}
echo "`date '+%m/%d/%y %T:'` MEM: $Plen FILES TO PROCESS." >> $LOG
echo "`date '+%m/%d/%y %T:'` Check if zero files." >> $LOG
date_fmt='%m/%d/%y %T'
if (( Plen == 0 ))
then
printf "%($date_fmt)T: ZERO FILES\n" $(date +%s) >> $LOG
fi
echo "`date '+%m/%d/%y %T:'` Loop." >> $LOG
for i in "${xP_Array[@]}"
do
echo "`date '+%m/%d/%y %T:'` Move file to run directory." >> $LOG
mv "$OUT/$i" RUN/
echo "`date '+%m/%d/%y %T:'` PROCESSING "$i"." >> $LOG
[[[DATABASE LOAD DONE HERE]]]
EXIT=`echo $?`
echo "`date '+%m/%d/%y %T:'` Check DB LOAD return value." >> $LOG
case $EXIT in
0) echo "`date '+%m/%d/%y %T:'` COMPLETE." >> $LOG
mv RUN/"$i" "$ARCH"
;;
*) echo "`date '+%m/%d/%y %T:'` ERROR. "$i" MOVED TO RECON." >> $LOG
mv RUN/"$i" "$RECON"
;;
esac
done
echo "`date '+%m/%d/%y %T:'` END -- MEM" >> $LOG
I wonder if it could work faster. I am already working with my DBA to see if the database inserts can be sped up but I wonder if the loop itself could run faster.
Btw, all the echo statements are redirected to a log file that I email myself when the script completes. Are they slowing the script down?
Can this script be optimized to run faster?
could be replaced with
but I don’t think that’s a big bottleneck in your script.
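As a rough sketch of what a loop-free fill might look like (an assumption on my part, not necessarily the exact replacement meant above), bash can expand the glob straight into the array and then strip the directory prefix from every element in a single expansion. The scratch directory and file names below are hypothetical stand-ins for $OUT and its contents.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory standing in for $OUT.
OUT=$(mktemp -d)
touch "$OUT/a.XML" "$OUT/b.XML" "$OUT/notes.txt"

shopt -s nullglob                       # an unmatched glob expands to nothing
xP_Array=( "$OUT"/*XML )                # fill the array in one expansion, no loop
xP_Array=( "${xP_Array[@]#"$OUT"/}" )   # strip the directory prefix from each element
Plen=${#xP_Array[@]}

echo "$Plen files: ${xP_Array[*]}"
```

With nullglob set, an empty directory yields an empty array, so the zero-files check on Plen still behaves as expected.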
In your other loop, the only real overhead I see is the repeated calls to date and the use of command substitution to assign $? to EXIT, which can be done directly (EXIT=$?). I don’t think there is anything else to optimize there except the actual DB load.
If you were willing to switch from human-readable dates, you could assign the current time (as a UNIX timestamp) to SECONDS, then just reference that variable for the log lines instead of calling date.
With a new-enough bash (4.2 or later, I think), printf can format the UNIX timestamp as a readable time:
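A minimal sketch of that idea, assuming bash 4.2 or later. The log helper and the scratch log file are hypothetical names introduced for illustration; the date format is the one from the question.

```shell
#!/usr/bin/env bash
date_fmt='%m/%d/%y %T'
LOG=$(mktemp)            # hypothetical scratch log standing in for $LOG

SECONDS=$(date +%s)      # one call to date; bash keeps SECONDS ticking afterwards

log() {                  # hypothetical helper, not part of the original script
    # %(...)T is handled by the printf builtin, so no external date process is started
    printf "%($date_fmt)T: %s\n" "$SECONDS" "$1" >> "$LOG"
}

log "START -- MEM"
log "Fill the array."
cat "$LOG"
```

Passing -1 instead of "$SECONDS" asks printf for the current time directly, which avoids even the single call to date.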