First, some background. We have a vendor application that generates logs and configuration files and stores them in a particular set of folders. On its own, it gzips the logs after a predetermined amount of time.
We periodically (at least once a day) rsync these folders to a backup server, using a script on that server. To save space, another script gzips any file that hasn't been modified for 30 days. This causes a problem: eventually the vendor application compresses its own logs on the source server, and the next rsync copies the resulting *.gz files to the backup server. The backup server then holds both the older plaintext file and the newer .gz file, so when our compression script runs, it tries to overwrite the existing .gz file. This effectively creates a race between the two compression steps.
I am working on the following code snippet to fix it. Here is my test script:
#!/bin/bash
#Array of local directories
localDirs=("./testdir/")
#Loop through local directories
for i in "${localDirs[@]}"
do
    #Find non-gz files in current local dir
    for FILE in `ls --hide=*.gz $i`;
    do
        #If the file doesn't have a matching .gz file, compress it
        if [ ! -f ${FILE}.gz ]
        then
            echo "$FILE: Gzip doesn't exist"
            echo compressing $FILE
            #test to make sure that the file is 30 days old, and if it is, gzip
            #find $i$FILE -type f -mtime 30 -exec gzip {} \;
        fi
    done
done
exit
This is not working: it still seems to list every file in the directory, whether or not it has a gzip counterpart. Any other suggestions on the code would be greatly appreciated; I'm still a bit of a Bash novice.
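A likely reason, offered as a guess rather than a confirmed diagnosis: `ls` prints bare file names without the `./testdir/` prefix, so the `-f ${FILE}.gz` test looks in the current working directory instead of in the directory being scanned. The minimal sketch below loops over a glob instead, which keeps the directory prefix on every path. The scratch directory and the file names in it are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical scratch directory, created only for this demonstration.
demo=$(mktemp -d)
touch "$demo/cat" "$demo/testfile" "$demo/testfile.gz"

missing=""
for path in "$demo"/*; do
    # Skip the archives themselves and anything that is not a regular file.
    case $path in *.gz) continue ;; esac
    [ -f "$path" ] || continue
    # $path still carries the directory prefix, so this test looks in the right place.
    if [ ! -e "$path.gz" ]; then
        missing="$missing ${path##*/}"
    fi
done

echo "no .gz counterpart:$missing"
rm -rf "$demo"
```

Here only `cat` is reported, because `testfile` already has `testfile.gz` next to it.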
EDIT:
I have modified the code to this, based on the recommendations (I had no idea backticks were deprecated!):
#!/bin/bash
#Array of local directories
localDirs=("./testdir/")
#Loop through local directories
for i in "${localDirs[@]}"
do
    #Test set FILE equal to non-gz files in current local dir
    for FILE in $(find $i ! -name "*.gz")
    do
        #If the file doesn't have a matching .gz file, compress it
        if [ ! -f ${FILE}.gz ]
        then
            echo "$FILE: Gzip doesn't exist"
            echo compressing $FILE
            #test to make sure that the file is 30 days old, and if it is, gzip
            find $FILE -type f -mtime 30 -exec gzip {} \;
        fi
    done
done
exit
I have created a file called ./testdir/oldfile.txt and another called ./testdir/oldfile.txt.gz. The script still tries to compress ./testdir/oldfile.txt into ./testdir/oldfile.txt.gz. What's strange is that if I comment out the compress statement, the echoes don't list oldfile.txt, since it has a corresponding .gz file. But with the statement in place, gzip still tries to compress it. I'm not sure what's causing this behavior.
Here’s the output (with the compress statement commented out):
[logsync@baschinfs01 ~]$ ls -lah testdir
total 12K
drwxr-x--- 2 logsync logsync 4.0K Dec 7 17:18 .
drwxr-x--- 5 logsync logsync 4.0K Dec 7 17:33 ..
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 cat
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 dog
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 duck
-rw-r----- 1 logsync logsync 0 Nov 7 12:21 oldfile.txt
-rw-r----- 1 logsync logsync 32 Nov 7 12:21 oldfile.txt.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:12 testfile
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile2
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile2.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile3
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile3.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile4.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile5
-rw-r----- 1 logsync logsync 0 Dec 7 16:12 testfile.gz
[logsync@baschinfs01 ~]$ ./test.sh
./testdir/: Gzip doesn't exist
compressing ./testdir/
./testdir/duck: Gzip doesn't exist
compressing ./testdir/duck
./testdir/dog: Gzip doesn't exist
compressing ./testdir/dog
./testdir/testfile5: Gzip doesn't exist
compressing ./testdir/testfile5
./testdir/cat: Gzip doesn't exist
compressing ./testdir/cat
Here’s the output with the compress statement left in:
[logsync@baschinfs01 ~]$ ls -lah testdir
total 12K
drwxr-x--- 2 logsync logsync 4.0K Dec 7 17:18 .
drwxr-x--- 5 logsync logsync 4.0K Dec 7 17:35 ..
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 cat
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 dog
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 duck
-rw-r----- 1 logsync logsync 0 Nov 7 12:21 oldfile.txt
-rw-r----- 1 logsync logsync 32 Nov 7 12:21 oldfile.txt.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:12 testfile
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile2
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile2.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile3
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile3.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile4.gz
-rw-r----- 1 logsync logsync 0 Dec 7 16:13 testfile5
-rw-r----- 1 logsync logsync 0 Dec 7 16:12 testfile.gz
[logsync@baschinfs01 ~]$ ./test.sh
./testdir/: Gzip doesn't exist
compressing ./testdir/
gzip: ./testdir/oldfile.txt.gz already exists; do you wish to overwrite (y or n)? n
not overwritten
gzip: ./testdir/oldfile.txt.gz already has .gz suffix -- unchanged
./testdir/duck: Gzip doesn't exist
compressing ./testdir/duck
./testdir/dog: Gzip doesn't exist
compressing ./testdir/dog
./testdir/testfile5: Gzip doesn't exist
compressing ./testdir/testfile5
./testdir/cat: Gzip doesn't exist
compressing ./testdir/cat
[logsync@baschinfs01 ~]$
As you can see, it's still trying to compress oldfile.txt, even though the echo statements in the same if block never mention that file.
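One thing the output does show is that `./testdir/` itself is the first item the loop handles. `find` returns the starting directory along with its contents unless it is told otherwise, and a `find` that is then started on that directory will descend into it. The sketch below, which uses a made-up scratch directory, compares the result count with and without `-type f`.

```shell
#!/bin/bash
# Hypothetical scratch directory, created only for this demonstration.
demo=$(mktemp -d)
touch "$demo/cat" "$demo/oldfile.txt" "$demo/oldfile.txt.gz"

# Without -type f the starting directory itself is part of the result.
all=$(find "$demo" ! -name '*.gz' | wc -l)
# With -type f only regular files are returned.
files=$(find "$demo" -type f ! -name '*.gz' | wc -l)

echo "all entries: $all, regular files only: $files"
rm -rf "$demo"
```

Adding `-type f` to the outer `find` would keep the directory out of the loop, so nothing is ever run against `./testdir/` as a whole.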
EDIT #2: Finally got it working with some hackery. Here is the final code that is getting spooned into the script (for now, until I can find a better way to do it):
#!/bin/bash
#Compress files whose age is between 30 days (2592000 s) and 31 days (2678400 s)
COMPRESSWINDOWSTART=2592000
COMPRESSWINDOWEND=2678400
DATE=$(date +%s)
#Array of local directories
localDirs=("./testdir/")
#Loop through local directories
for i in "${localDirs[@]}"
do
    echo "Entering $i directory"
    #Set FILE to each non-gz entry in the current local dir
    for FILE in $(find "$i" ! -name "*.gz")
    do
        #If the file doesn't have a matching .gz file, consider it for compression
        if [ ! -e "${FILE}.gz" ]
        then
            echo "$FILE: Gzip doesn't exist"
            echo "compressing $FILE"
            #Work out the file's age in seconds from its mtime
            FILEMTIME=$(stat -c %Y "$FILE")
            FILEAGE=$((DATE - FILEMTIME))
            echo "fileage is $FILEAGE"
            #gzip only if the age falls inside the 30-to-31-day window
            if [ "$FILEAGE" -gt "$COMPRESSWINDOWSTART" ] && [ "$FILEAGE" -lt "$COMPRESSWINDOWEND" ]
            then
                echo "$FILEAGE is greater than $COMPRESSWINDOWSTART and less than $COMPRESSWINDOWEND"
                gzip "$FILE"
            fi
        fi
    done
done
exit
This is tested and working in my test cases. Hopefully it merges smoothly into the main script. Thank you, everyone, for your help!
I have edited in the final code. As mentioned in the comments, relying on find was causing some of the issues, I think. The loop sees ./testdir/ itself as one of the items in the list, and from what it was doing, it looks like the inner find then tried to gzip every matching file in that directory. The new version avoids that by comparing each file's mtime against the current date instead of calling find a second time.
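For comparison, here is a sketch of a possibly simpler approach that lets a single `find` do the filtering and hands the matches to a small inline shell loop. It assumes the intent is "anything older than 30 days", so it uses `-mtime +30` rather than the 30-to-31-day window above. The scratch directory and file names are invented for the demonstration, and this has not been tried against the real log folders.

```shell
#!/bin/bash
# Hypothetical scratch directory with one new file and three old ones.
demo=$(mktemp -d)
touch "$demo/new.log"
touch -t 200001010000 "$demo/old.log" "$demo/synced.log" "$demo/synced.log.gz"

# -type f keeps directories out of the result, ! -name '*.gz' skips archives,
# and -mtime +30 matches files whose age in whole days is greater than 30.
find "$demo" -type f ! -name '*.gz' -mtime +30 -exec sh -c '
    for f in "$@"; do
        # Leave the file alone if an archive with the same name already exists.
        [ -e "$f.gz" ] || gzip -- "$f"
    done
' sh {} +

result=$(ls "$demo")
echo "$result"
rm -rf "$demo"
```

In this run only `old.log` is compressed. `new.log` is too recent, and `synced.log` is skipped because `synced.log.gz` is already there, which is the case that triggered the overwrite prompt.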