The Archive Base

Editorial Team
Asked: May 23, 2026 (2026-05-23T03:30:39+00:00)

My test equipment generates large text files which tend to grow in size over a period of several days as data is added.

But the text files are transferred to a PC for backup purposes daily, where they’re compressed with gzip, even before they’ve finished growing.

This means I frequently have both file.txt and a compressed form file.txt.gz where the uncompressed file may be more up to date than the compressed version.

I decide which to keep with the following bash script gzandrm:

#!/usr/bin/bash

# Given an uncompressed file, look in the same directory for 
# a gzipped version of the file and delete the uncompressed 
# file if zdiff reveals they're identical. Otherwise, the 
# file can be compressed.

# eg:  find . -name '*.txt' -exec gzandrm {} \;

if [[ -e $1 && -e $1.gz ]] 
then

    # simple check: use zdiff and count the characters
    DIFFS=$(zdiff "$1" "$1.gz" | wc -c)

    if [[ $DIFFS -eq 0 ]] 
    then

        # difference is '0', delete the uncompressed file
        echo "'$1' already gzipped, so removed"
        rm "$1"

    else

        # difference is non-zero, check manually
        echo "'$1' and '$1.gz' are different"

    fi

else
    # go ahead and compress the file
    echo "'$1' not yet gzipped, doing it now"
    gzip "$1"
fi

This has worked well, but it would make more sense to compare the modification dates of the files. gzip preserves the modification date when it compresses, so two files with the same date are really the same file, even if one of them is compressed.

How can I modify my script to compare files by date, rather than size?

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 3:30 am

    It’s not entirely clear what the goal is, but it seems to be simple efficiency, so I think you should make two changes: 1) check modification times, as you suggest, and don’t bother comparing content if the uncompressed file is no newer than the compressed file, and 2) use zcmp instead of zdiff.

    Taking #2 first, your script does this:

    DIFFS=$(zdiff "$1" "$1.gz" | wc -c)
    if [[ $DIFFS -eq 0 ]]
    

    which will perform a full diff of potentially large files, count the characters in diff’s output, and examine the count. But all you really want to know is whether the content differs. cmp is better for that, since it will scan byte by byte and stop if it encounters a difference. It doesn’t take the time to format a nice textual comparison (which you will mostly ignore); its exit status tells you the result. zcmp isn’t quite as efficient as raw cmp, since it’ll need to do an uncompress first, but zdiff has the same issue.
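As a rough illustration of relying on the exit status alone, here is a minimal sketch. The helper name same_as_gz and the sample file names are made up for the example, and it assumes mktemp -d is available; it is not part of the original script.

```shell
#!/usr/bin/env bash
# Sketch: decide whether FILE matches FILE.gz from zcmp's exit status only.
# same_as_gz is a hypothetical helper name, not part of the original script.
same_as_gz() {
    # -s is passed through to cmp and suppresses its output;
    # exit status 0 means identical, non-zero means different or an error.
    zcmp -s "$1" "$1.gz"
}

# Demo in a throwaway directory (assumes mktemp -d is available).
demo_dir=$(mktemp -d)
printf 'line 1\nline 2\n' > "$demo_dir/sample.txt"
gzip -c "$demo_dir/sample.txt" > "$demo_dir/sample.txt.gz"

if same_as_gz "$demo_dir/sample.txt"; then
    echo "identical"
fi

# Simulate the text file growing after the backup was compressed.
printf 'line 3\n' >> "$demo_dir/sample.txt"
if ! same_as_gz "$demo_dir/sample.txt"; then
    echo "different"
fi

rm -rf "$demo_dir"
```

No diff text is captured or counted here; the if statement branches directly on the command's result.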

    So you could switch to zcmp (and remove the use of a subshell, eliminate wc, not invoke [[, and avoid putting potentially large textual diff data into a variable) just by changing the above two lines to this:

    if zcmp -s "$1" "$1.gz"    # if $1 and $1.gz are the same
    

    Both files are named explicitly so the comparison does not depend on how zcmp resolves a single argument.

    To go a step further and check modification times first, you can use the -nt (newer than) operator of the test command (also known as square bracket), rewriting the above line as this:

    if [ ! "$1" -nt "$1.gz" ] || zcmp -s "$1" "$1.gz"
    

    which says that if the uncompressed version is no newer than the compressed version OR if they have the same content, then $1 is already gzipped and you can remove it. Note that if the uncompressed file is no newer, zcmp won’t run at all, saving some cycles.

    The rest of your script should work as is.
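Putting the pieces together, the whole script might look roughly like the sketch below. The function name process_file is an assumption introduced here so the logic can be read (and tried) in one place; the messages and overall flow follow the script in the question.

```shell
#!/usr/bin/env bash
# Sketch of gzandrm with the modification-time shortcut applied.
# process_file is a hypothetical name wrapping the body of the script.
process_file() {
    local f=$1
    if [[ -e $f && -e $f.gz ]]; then
        # Skip the content comparison when the text file is no newer
        # than the archive; otherwise let zcmp decide.
        if [ ! "$f" -nt "$f.gz" ] || zcmp -s "$f" "$f.gz"; then
            echo "'$f' already gzipped, so removed"
            rm -- "$f"
        else
            # newer and different content: leave both for a manual check
            echo "'$f' and '$f.gz' are different"
        fi
    else
        # go ahead and compress the file
        echo "'$f' not yet gzipped, doing it now"
        gzip -- "$f"
    fi
}

# Run only when a file name was supplied, e.g. from find ... -exec.
if [ "$#" -gt 0 ]; then
    process_file "$1"
fi
```

It can still be driven the same way as before, for example: find . -name '*.txt' -exec gzandrm {} \;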

    One caveat: modification times are very easy to change. Just moving the compressed file from one machine to another could change its modtime, so you’ll have to consider your own case to know whether the modtime check is a valid optimization or more trouble than it’s worth.
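To see how a copy can disturb the modtime, here is a small sketch comparing a plain cp with cp -p, which preserves timestamps. The helper name same_mtime and the file names are hypothetical, and whether your own transfer tool preserves times is something to check for your setup.

```shell
#!/usr/bin/env bash
# Sketch: report whether two files carry the same modification time,
# using only the -nt / -ot test operators.
# same_mtime is a hypothetical helper, not part of the original script.
same_mtime() {
    [ ! "$1" -nt "$2" ] && [ ! "$1" -ot "$2" ]
}

work=$(mktemp -d)
printf 'data\n' > "$work/log.txt"
touch -t 202001010000 "$work/log.txt"             # give the file an old mtime

cp    "$work/log.txt" "$work/plain_copy.txt"      # copy gets a fresh mtime
cp -p "$work/log.txt" "$work/preserved_copy.txt"  # mtime carried over

if same_mtime "$work/log.txt" "$work/preserved_copy.txt"; then
    echo "cp -p kept the mtime"
fi
if ! same_mtime "$work/log.txt" "$work/plain_copy.txt"; then
    echo "plain cp changed the mtime"
fi

rm -rf "$work"
```

If the daily transfer behaves like the plain cp case, the -nt shortcut compares transfer times rather than the times the data was written, and the zcmp check alone is the safer choice.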


© 2021 The Archive Base. All Rights Reserved