I’m designing a daemon that will continuously read lines from a single text file and process those lines. What is a good general-purpose way to keep track of the last line processed, independent of the file name, in case lines are written to the text file while the daemon isn’t running?
Every so often, the file is archived and a new blank file is created in its place. The daemon will be stopped for the archival to occur.
My first idea, which seems overcomplicated, is to compute and store a hash and the line number of the last successfully processed record. Then, when the daemon is started again, it reads up to that line number and calculates the hash of that line. If the hash matches, it continues processing from the next record. If the hash doesn’t match, it starts over from the beginning of the file, since a mismatch indicates that this is a new file.
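To make the idea concrete, here is a rough Python sketch of the hash-plus-line-number checkpoint described above. The function names, the JSON checkpoint format, the use of SHA-256, and the rule of skipping a trailing line that has no newline yet are all assumptions made for illustration, not an established technique from a particular tool.

```python
import hashlib
import json
import os


def line_hash(line):
    # Hash of the raw bytes of one line (SHA-256 is an arbitrary choice).
    return hashlib.sha256(line).hexdigest()


def load_checkpoint(path):
    # A missing or unreadable checkpoint means "start from the beginning".
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, ValueError):
        return {"line_no": 0, "hash": None}


def save_checkpoint(path, line_no, digest):
    # Write to a temporary file and rename it, so a crash mid-write
    # does not leave a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"line_no": line_no, "hash": digest}, f)
    os.replace(tmp, path)


def checkpoint_matches(data_path, cp):
    # True if the line at the saved line number still has the saved hash.
    if cp["line_no"] == 0:
        return True
    with open(data_path, "rb") as f:
        for line_no, line in enumerate(f, start=1):
            if line_no == cp["line_no"]:
                return line_hash(line) == cp["hash"]
    return False  # the file is shorter than the checkpoint: treat as new


def process_new_lines(data_path, checkpoint_path, handle):
    # Call handle(line) for every complete line after the checkpoint
    # and return how many lines were handled.
    cp = load_checkpoint(checkpoint_path)
    if not checkpoint_matches(data_path, cp):
        cp = {"line_no": 0, "hash": None}  # new file: start over
    processed = 0
    with open(data_path, "rb") as f:
        for line_no, line in enumerate(f, start=1):
            if line_no <= cp["line_no"]:
                continue  # already processed in an earlier run
            if not line.endswith(b"\n"):
                break  # last line is still being written
            handle(line)
            save_checkpoint(checkpoint_path, line_no, line_hash(line))
            processed += 1
    return processed
```

Saving the checkpoint after every line keeps the state correct even if the daemon dies unexpectedly, at the cost of one small write per record. One known weakness of this sketch: if the new file happens to contain an identical line at the same line number, it would be mistaken for the old file.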
I have a feeling there is a good general-purpose technique, used by log file analyzers or described in a textbook, that I haven’t been exposed to.
Assuming you have permission, enough disk space and assuming you kill the daemon safely…
Just write the last line processed to a file (upon shutdown of the daemon).
You could wrap each instance of the daemon inside a context manager if you want:
from contextlib import contextmanager
http://docs.python.org/library/contextlib.html
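A minimal sketch of what that could look like, assuming the daemon keeps the last processed line in a small dictionary; the `checkpoint` function name, the `state` dictionary, and the checkpoint file path are made up for illustration:

```python
import os
from contextlib import contextmanager


@contextmanager
def checkpoint(path):
    # Load the last processed line (if any) on entry.
    state = {"last_line": None}
    try:
        with open(path, encoding="utf-8") as f:
            state["last_line"] = f.read()
    except FileNotFoundError:
        pass  # first run: nothing processed yet
    try:
        yield state
    finally:
        # Runs on normal exit and when an exception propagates.
        if state["last_line"] is not None:
            tmp = path + ".tmp"
            with open(tmp, "w", encoding="utf-8") as f:
                f.write(state["last_line"])
            os.replace(tmp, path)  # rename so the file is never half-written


# Hypothetical usage inside the daemon:
#
# with checkpoint("daemon.checkpoint") as state:
#     for line in new_lines:
#         process(line)
#         state["last_line"] = line
```

The `finally` block only helps when the interpreter gets a chance to unwind, which is the "kill the daemon safely" assumption: it runs on an exception or Ctrl-C, but not on SIGKILL, and not on SIGTERM unless the daemon installs a handler that turns the signal into an exception.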