How do I avoid breaking the HD? I have a bash script running on an ubuntu machine, with this meta code:
bash1.sh
while(true)
run bash2.sh
sleep 60 seconds
done
bash2.sh:
if(directory is empty): exit
process file
delete file
The directory is network shared, and the computer is not doing anything else. Once per day a new file arrives and is processed. (I do know that bash1.sh can be replaced by watch). My concern is that bash1.sh is reading bash2.sh everytime – that can presumably be avoided by only having one script!? and bash2.sh is reading the same directory everytime. Is the directory really read from the HD, or is ubuntu somehow caching the dir in ram? -so it is only read when something changes? is it a problem that it is the same place on the HD that is read every time, or does it not matter because the HD is already spinning? If the HD never sleeps, does it matter if I set the loop time down to only one second?
Maybe the directory could be a pure ram dir – how do I do that? -or is there some simple way to check if something has arrived over the network without reading the directory?
Reading a file or directory once every sixty seconds is not excessive use.
Seriously, don’t worry about it.
If it’s really worrying you, you can rethink your strategy for detecting the file.
For example, do you really need to know, within sixty seconds, that the file has arrived? Can it arrive any time during the day? Can some parts of the day be considered unlikely?
Using information like that, you can adjust the timing of checks to suit. If the file is supposed to be delivered after 4pm, don’t check for it at all before then.
Check for it every sixty seconds between 4pm and 5pm, then every ten minutes after that.
These are all business-related decisions that can be made but I would still suggest that it’s unnecessary. Provided you regularly back up your disks (and have standby hardware if you need to be back up in a hurry), you shouldn’t lose anything.
In fact, if you were really paranoid, you could dedicate an entire machine for this, whose sole purpose is to receive the file via FTP and, when it arrives, send it across to your real processing box.
Put nothing else on that machine and have a warm standby (exactly same software, IP address and so on but powered down) so that, if it fails, the standby can be activated in minutes.
The real processing machine is then only written to once a day – that’s unlikely to affect the disk lifetime.
That’s probably too paranoid for my liking but it shows that there are ways to mitigate almost any problem.