Basically I want to move files to another server on creation, preserving the directory structure. I have a solution but it lacks elegance. Also I feel like I’m missing the obvious answer, so thanks in advance for your help and I totally understand if this bores you.
The situation
I have a server with limited disk space (let’s call it ‘Tiny’) and a storage server. Tiny creates files every once in a while. I want to store them automatically on the storage server and remove the originals when it’s safe. I have to retain the directory structure of Tiny. I don’t know in advance what the directory structure looks like. That is, all files are created under /some/dir/, but subdirectories of it are created on the fly. They should be stored in /other/fold/ on the storage server, preserving the substructure under /some/dir. E.g.:
/some/dir/bla/foo/bar/snap_001a on Tiny becomes /other/fold/bla/foo/bar/snap_001a on the storage server. The files are all called snap_xxxx, where xxxx is a four-character alphanumeric string.
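To make the mapping concrete, here is a minimal sketch in bash. The function name map_path is purely illustrative (it is not part of any tool), and the two roots are the example paths from above.

```shell
#!/bin/bash
# Hypothetical helper: translate a path under /some/dir on Tiny into the
# corresponding path under /other/fold on the storage server.
map_path() {
    local src_root=/some/dir dest_root=/other/fold
    local path=$1
    # Strip the source root from the front, then prepend the destination root.
    printf '%s\n' "$dest_root${path#"$src_root"}"
}

map_path /some/dir/bla/foo/bar/snap_001a
# prints /other/fold/bla/foo/bar/snap_001a
```

Only the leading /some/dir is replaced, so whatever subdirectories appear on the fly carry over unchanged.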
My old solution
My idea was to loop over the files and scp them. Once scp has finished and returned without error, the file on Tiny is removed with rm.
#!/bin/bash
# This is invoked by a cron job every once in a while.
files=$(find /some/dir/ -name 'snap_*') # Quote the pattern so the shell does not expand it
IFS='
'
for current in $files; do
    name=$(basename "$current") # Get the base name (i.e. strip the directory)
    dir=$(dirname "$current") # Get the directory name of the current file on Tiny
    dir=${dir/\/some\/dir/\/other\/fold} # Replace the directory root on Tiny with the root on the storage server
    ssh -i keyfile myuser@storage.server.net \
        mkdir -p "$dir" # Create the directory on the storage server and all parents if needed
    scp -i keyfile "$current" "myuser@storage.server.net:$dir/$name" \
        && rm "$current" # Remove the file on success
done
This, however, strikes me as unnecessarily complicated and maybe error-prone. I thought of rsync, but when copying single files I could not find an option to create a directory and its parents if they don’t exist. Does anyone have a better idea than mine?
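For the single-file case, one possibility is rsync's --relative option: a /./ marker in the source path tells rsync where the part to be recreated on the destination begins, and the missing parent directories are created there. The sketch below assumes an rsync recent enough to know --remove-source-files; the function name push_one and its argument order are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not an rsync feature by this name.
# Usage: push_one ROOT RELATIVE_PATH DEST
push_one() {
    local root=$1 rel=$2 dest=$3
    # Everything after "/./" is recreated under DEST, parents included.
    # --remove-source-files deletes the source file once it has been transferred.
    rsync -a --relative --remove-source-files "$root/./$rel" "$dest"
}

# Example call (placeholder host and key file, left commented out):
# push_one /some/dir bla/foo/bar/snap_001a myuser@storage.server.net:/other/fold/ 
```

For a remote destination with a dedicated key, the -e 'ssh -i keyfile' option would have to be added to the rsync call.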
What I ended up using after this thread
rsync -av --remove-sent-files --prune-empty-dirs \
-e 'ssh -i /full/path/to/keyfile' \
--include="*/" --include="snap_*" --exclude="*" \
/some/dir/ myuser@storage.server.net:/other/fold/
More recent versions than the one I was using take --remove-source-files instead of --remove-sent-files. The newer name is more telling, in that it is clearer which files are deleted. Also, --dry-run is a good option for testing your parameters BEFORE actually running rsync.
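As a sketch of that workflow with the newer option name: the wrapper below takes the source, the destination and any extra rsync options, so the same filter rules can be previewed with --dry-run first and then run for real. The name sync_snaps is invented for this example, and the host and key file in the commented calls are placeholders.

```shell
#!/bin/bash
# Hypothetical wrapper around the rsync call above.
# Usage: sync_snaps SRC DEST [extra rsync options...]
sync_snaps() {
    local src=$1 dest=$2
    shift 2
    # Descend into every directory, transfer only snap_* files, skip the rest;
    # directories that would end up empty are not created on the destination.
    rsync -av --remove-source-files --prune-empty-dirs \
        --include='*/' --include='snap_*' --exclude='*' \
        "$@" "$src" "$dest"
}

# 1. Preview only: nothing is copied and nothing is deleted.
# sync_snaps /some/dir/ myuser@storage.server.net:/other/fold/ --dry-run -e 'ssh -i /full/path/to/keyfile'
# 2. The real run, once the preview lists the expected files.
# sync_snaps /some/dir/ myuser@storage.server.net:/other/fold/ -e 'ssh -i /full/path/to/keyfile'
```

Note that only the transferred files are removed from the source; the directories on Tiny stay in place.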
Thanks to Alex Howansky for the solution and to Douglas Leeder for caring!
See the --include option.
Just specify it on the command line.
See the --remove-source-files option.