The Archive Base

Editorial Team
Asked: May 28, 2026 (2026-05-28T08:31:35+00:00)

I’ve written a bash script on Cygwin which is rather like rsync, although

I’ve written a bash script on Cygwin which is rather like rsync, although different enough that I believe I can’t actually use rsync for what I need. It iterates over about a thousand pairs of files in corresponding directories, comparing them with cmp.

Unfortunately, this seems to run abysmally slowly, taking about ten (Edit: actually 25!) times as long as it takes to generate one of the sets of files using a Python program.

Am I right in thinking that this is surprisingly slow? Are there any simple alternatives that would go faster?

(To elaborate a bit on my use-case: I am autogenerating a bunch of .c files in a temporary directory, and when I re-generate them, I’d like to copy only the ones that have changed into the actual source directory, leaving the unchanged ones untouched (with their old modification times) so that make will know that it doesn’t need to recompile them. Not all the generated files are .c files, though, so I need to do binary comparisons rather than text comparisons.)

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 8:31 am (2026-05-28T08:31:36+00:00)

    Maybe you should use Python to do some – or even all – of the comparison work too?

    One improvement would be to only bother running cmp if the file sizes are the same; if they’re different, clearly the file has changed. Instead of running cmp, you could think about generating a hash for each file, using MD5 or SHA1 or SHA-256 or whatever takes your fancy (using Python modules or extensions, if that’s the correct term). If you don’t think you’ll be dealing with malicious intent, then MD5 is probably sufficient to identify differences.
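As a rough sketch of the size-then-hash idea in Python: the function names `file_digest` and `has_changed` are made up for illustration, and SHA-256 is only a default here; passing `"md5"` would follow the suggestion above where that algorithm is available.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in binary chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(new_path, old_path):
    """Return True if new_path should replace old_path."""
    if not os.path.exists(old_path):
        return True  # nothing to compare against yet
    if os.path.getsize(new_path) != os.path.getsize(old_path):
        return True  # different sizes, so the contents must differ
    # Same size: only now pay for reading both files.
    return file_digest(new_path) != file_digest(old_path)
```

Files are opened in binary mode, so this works for the non-.c files as well.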

    Even in a shell script, you could run an external hashing command, and give it the names of all the files in one directory, then give it the names of all the files in the other directory. Then you can read the two sets of hash values plus file names and decide which have changed.
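The same "hash each directory once, then compare the two lists" approach could look something like this in Python instead of shell. This is a sketch only: `hash_directory` and `changed_files` are hypothetical names, and each file is read whole, which assumes the generated files are reasonably small.

```python
import hashlib
import os


def hash_directory(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def changed_files(new_digests, old_digests):
    """Return names that are new or whose digest differs from the old set."""
    return sorted(
        name
        for name, digest in new_digests.items()
        if old_digests.get(name) != digest
    )
```

For example, `changed_files(hash_directory("tmp_gen"), hash_directory("src"))` would list the files worth copying, where both directory names are placeholders.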

    Yes, it does sound like it is taking too long. Much of the cost is probably having to launch 1000 copies of cmp, plus the other processing; process creation is relatively expensive under Cygwin. Both the Python and the shell script suggestions above have in common that they avoid running a program 1000 times; they try to minimize the number of programs executed. This reduction in the number of processes executed will give you a pretty big bang for your buck, I expect.
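To illustrate doing all of the comparisons inside a single process, here is a minimal sketch that uses the standard library's `filecmp.cmp` in place of the external cmp. The function name `sync_changed` and its directory arguments are assumptions for illustration, not part of the original script.

```python
import filecmp
import os
import shutil


def sync_changed(generated_dir, source_dir):
    """Copy files from generated_dir into source_dir only when they differ.

    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(generated_dir):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            rel = os.path.relpath(new_path, generated_dir)
            old_path = os.path.join(source_dir, rel)
            # shallow=False compares contents byte by byte when the sizes match,
            # and reports a difference straight away when they do not.
            if os.path.isfile(old_path) and filecmp.cmp(
                new_path, old_path, shallow=False
            ):
                continue  # unchanged: leave the old file and its timestamp alone
            os.makedirs(os.path.dirname(old_path) or ".", exist_ok=True)
            shutil.copyfile(new_path, old_path)  # rewritten file gets a fresh mtime
            copied.append(rel)
    return sorted(copied)
```

Because unchanged files are never written, their modification times stay as they were, so make should not see them as out of date.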


    If you can keep the hashes from ‘the current set of files’ around and simply generate new hashes for the new set of files, and then compare them, you will do well. Clearly, if the file containing the ‘old hashes’ (current set of files) is missing, you’ll have to regenerate it from the existing files. This is slightly fleshing out information in the comments.
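One way to keep the old hashes around is a small JSON file next to the source directory. The sketch below assumes that layout; the names `load_or_build_manifest` and `save_manifest`, and the JSON format itself, are illustrative choices rather than anything from the original script.

```python
import hashlib
import json
import os


def hash_directory(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digests[os.path.relpath(full, root)] = hashlib.sha256(
                    handle.read()
                ).hexdigest()
    return digests


def load_or_build_manifest(manifest_path, current_dir):
    """Load cached digests, or rebuild them from current_dir if the cache is missing."""
    try:
        with open(manifest_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return hash_directory(current_dir)


def save_manifest(manifest_path, digests):
    """Write the digests out so the next run can skip hashing the old files."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(digests, handle, indent=2, sort_keys=True)
```

A run would then load the old manifest, hash only the newly generated files, copy those whose digest differs, and save the new digests as the manifest for next time. Note that the cache goes stale if someone edits the source directory by hand.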

    One other possibility: can you track changes in the data that you use to generate these files and use that to tell you which files will have changed? At the least, that could limit the set of files that may have changed and therefore need to be compared, since your comments indicate that most files are the same each time.

© 2021 The Archive Base. All Rights Reserved