I have a Python script that checks on a pickup directory and processes any files that it finds, and then deletes them.
How can I make sure not to pick up a file that is still being written by the process that drops files in that directory?
My test case is pretty simple. I copy-paste 300MB of files into the pickup directory, and frequently the script will grab a file that’s still being written. It operates on only the partial file, then deletes it. This triggers a file operation error in the OS, because the file being written has disappeared.
I’ve tried acquiring a lock on the file (using the FileLock module) before I open/process/delete it. But that hasn’t helped.
I’ve considered checking the modification time on the file to avoid anything within X seconds of now. But that seems clunky.
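For reference, the modification-time idea could be sketched roughly as below. This is only a minimal illustration, not a robust solution: `is_settled`, `settled_files`, and the `quiet_seconds` threshold are hypothetical names chosen here, and the approach assumes the writer updates the file's mtime as it writes and never pauses for longer than the threshold.

```python
import os
import time


def is_settled(path, quiet_seconds=5.0, now=None):
    """Return True if `path` was last modified at least `quiet_seconds` ago.

    `now` can be passed in explicitly (seconds since the epoch) to make
    the check easy to test; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # The file vanished between listing and stat; treat it as not ready.
        return False
    return (now - mtime) >= quiet_seconds


def settled_files(directory, quiet_seconds=5.0):
    """Return the sorted paths of regular files in `directory` that look settled."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and is_settled(entry.path, quiet_seconds)
        )
```

The script would then process only what `settled_files(pickup_dir)` returns on each pass and leave everything else for the next pass. The threshold is a guess either way, which is why this feels clunky.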
My test is on OS X, but I’m trying to find a solution that will work across the major platforms.
I see a similar question here (How to check if a file is still being written?), but there was no clear solution.
Thank you
As a workaround, you could listen for file modified events (watchdog is cross-platform). The modified event (on OS X at least) isn’t fired for each write; it’s only fired on close. So when you detect a modified event, you can assume all writes are complete.
Of course, if the file is written in chunks and saved after each chunk, this won’t work.
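One way to soften that caveat is to debounce the events: remember when each path was last reported as modified, and only treat it as complete once no further event has arrived for a quiet period. The sketch below uses only the standard library. `QuietPeriodTracker`, `touch`, and `pop_ready` are hypothetical names and not part of watchdog; the assumption is that you would call `touch(event.src_path)` from a watchdog handler's `on_modified` method and poll `pop_ready()` from the processing loop.

```python
import time


class QuietPeriodTracker:
    """Track modified events and report paths that have gone quiet."""

    def __init__(self, quiet_seconds=2.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.clock = clock          # injectable so the logic can be tested
        self._last_event = {}       # path -> time of the most recent event

    def touch(self, path):
        """Record that `path` was just modified."""
        self._last_event[path] = self.clock()

    def pop_ready(self):
        """Return paths with no event for `quiet_seconds`, and forget them."""
        now = self.clock()
        ready = [
            path
            for path, last in self._last_event.items()
            if now - last >= self.quiet_seconds
        ]
        for path in ready:
            del self._last_event[path]
        return ready
```

This still relies on a timeout, so a writer that stalls for longer than `quiet_seconds` between chunks would defeat it.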