I have a small Java application/service that watches a root folder and its subfolders using Java 7’s new java.nio.file.WatchService. When an event occurs (a file is created, modified, deleted, etc.), I fire off an rsync execution to copy files from server A to server B (and vice versa). The command uses the --delete option to ensure that files deleted from A are also removed from B. However, in order to use this feature, you have to enable -r (recurse into subdirectories). Normally this wouldn’t be a big deal, but the root folder holds 5GB of data (19000 files, 1500 folders). Rsync is great at what it does, but a full run still takes several minutes.
The problem is that if files change on both servers at roughly the same time, a new file created on server A could get deleted by the process syncing B -> A, since --delete only compares source to destination and removes anything the destination has that the source does not.
Since I’m already recursively watching every directory with the Java application, I don’t actually need rsync to recurse; syncing only the directory where the event occurred would be enough. My first thought was to limit the depth of recursion with rsync, but I don’t think that is a feature of rsync. I also considered using --exclude but am not sure what the pattern might look like.
Anyone have any ideas?
For reference, here is a sample of the generated rsync command:
rsync -r --no-group --no-owner --no-perms --update --checksum --verbose --progress --stats --delete --ignore-errors "/media/server1files/" "/server2::server2"
To exclude subfolders even when using -r, the appropriate pattern to use is --exclude "/**/*".
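As a rough illustration of how the watcher could apply that pattern, here is a minimal sketch that builds the rsync argument list for a single changed directory. The class name, the method name, and the example paths are hypothetical and not taken from the original application; only flags from the command above are used, plus the exclude pattern. The pattern is anchored at the transfer root and matches anything at least two levels below it, so subdirectories of the changed directory are still seen but their contents are skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds an rsync command that syncs only the
// immediate contents of one directory, even though -r is enabled.
class RsyncCommandSketch {

    public static List<String> buildCommand(String sourceDir, String destination) {
        List<String> cmd = new ArrayList<>();
        cmd.add("rsync");
        cmd.add("-r");            // still needed so that --delete is accepted
        cmd.add("--update");
        cmd.add("--checksum");
        cmd.add("--delete");
        // Anchored pattern: excludes everything two or more levels below the
        // transfer root, i.e. the contents of any subdirectory.
        cmd.add("--exclude=/**/*");
        // A trailing slash makes rsync copy the directory's contents rather
        // than the directory itself.
        cmd.add(sourceDir.endsWith("/") ? sourceDir : sourceDir + "/");
        cmd.add(destination);
        return cmd;
    }

    public static void main(String[] args) {
        // Example paths are assumptions for illustration only.
        List<String> cmd = buildCommand("/media/server1files/reports",
                                        "server2::server2/reports/");
        System.out.println(String.join(" ", cmd));
        // To actually run it, something like:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list to ProcessBuilder avoids shell quoting, so the pattern does not need the surrounding quotes shown in the command-line form. Note that rsync does not delete excluded files on the destination unless --delete-excluded is also given, so files inside subdirectories are left alone by each per-directory run.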