I have some program that generates a lot of data, to be specific encrypting tarballs. I want to upload result on a remote ftp server.
Files are quite big (about 60GB), so I don’t want to waste hdd space for tmp dir and time.
Is it possible? I checked ncftput util, but there is not option to read from a standard input.
I guess you could do that with any upload program using named pipe, but I foresee problems if some part of the upload goes wrong and you have to restart your upload: the data is gone and you cannot start back your upload, even if you only lost 1 byte. This also applied to a read from stdin strategy.
My strategy would be the following:
mkfifo.ddcould be used for that.touch next_file; for f in ordered_list_of_files; do cat $f >> next_file; rm $f; doneor some variant should do it.You can of course prepare the next file while you upload the previous file to use concurrency at its maximum. The bottleneck will be either your encryption algorithm (CPU), you network bandwidth, or your disk bandwidth.
This method will waste you 2 GB of disk space on the client side (or less or more depending the size of the files), and 1 GB of disk space on the server side. But you can be sure that you will not have to do it again if your upload hang near the end.
If you want to be double sure about the result of the transfer, you could compute hash of you files while writing them to disk on the client side, and only delete the client file once you have verify the hash on the server side. The hash can be computed on the client side at the same time you are writing the file to disk using
dd ... | tee local_file | sha1sum. On the server side, you would have to compute the hash before doing the cat, and avoid doing the cat if the hash is not good, so I cannot see how to do it without reading the file twice (once for the hash, and once for the cat).