I have a log file that is written by a script running in real time, and I want to monitor the script's status from that log on a web page. I use JavaScript to update the page, and I wrote a CGI script that parses the log and outputs the event status as JSON for the JavaScript to read. The JavaScript periodically (every 2 seconds, for example) invokes the CGI to parse the log, reads the event status with getJSON, and then updates the page.
For example, at time T (seconds), the log file contains:
event 1 start …
doing event 1 …
event 1 pass …
event 1 end …
At time T+2 (seconds), the log file contains:
event 1 start …
doing event 1 …
event 1 pass …
event 1 end …
event 2 start …
doing event 2 …
event 2 fail …
event 2 end …
The CGI at time T (seconds) may output:
{"event":[["event 1", "pass"]]}
at some URI, which is read by the JavaScript's getJSON,
and at time T+2 (seconds) it may output:
{"event":[["event 1", "pass"],["event 2", "failed"]]}
So the CGI script I implemented parses the whole log every 2 seconds, which may consume a lot of system resources when the log is large, and it repeats work for events that are already done.
Does anyone have an idea how I could parse only the newly written output instead of the whole log, and how to store the status of the events that are already done?
If you don’t want to parse the entire log file every time, you should try to mimic the behavior of tail -f:
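After each run, save the file's inode number, the current read position, and the mtime obtained with stat.
When the next call arrives, compare the inode number with the one you saved:
If the inode is unchanged and the mtime has changed, seek to the old position with setpos and resume parsing. If the inode is different, the log has presumably been replaced (for example, rotated), so parse it from the beginning.
With this solution, you'll be able to parse the file chunk by chunk. Be careful: there is an edge case where only part of a line has been written at the moment you read, so process complete lines only and leave the partial one for the next call.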
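Here is a minimal sketch of that idea, written in Python rather than with stat and setpos directly (os.stat and file.seek play the same roles). It is only an illustration under assumptions: the name update, the state file layout, and the regular expression for result lines are hypothetical, and the line format is guessed from the sample log in the question. The state file also answers the second part of the question, since it stores the status of the events that are already done.

```python
import json
import os
import re

# Hypothetical line format, inferred from the sample log: "event 1 pass ..."
RESULT_RE = re.compile(r"^(event \d+) (pass|fail)\b")


def load_state(state_path):
    """Return the saved state, or a fresh one if no usable state file exists."""
    try:
        with open(state_path) as f:
            return json.load(f)
    except (FileNotFoundError, ValueError):
        return {"inode": None, "offset": 0, "mtime": None, "events": []}


def update(log_path, state_path):
    """Parse only the part of the log written since the previous call."""
    state = load_state(state_path)
    st = os.stat(log_path)

    # A different inode, or a file shorter than the saved offset, suggests
    # the log was replaced or truncated: start again from the beginning.
    if st.st_ino != state["inode"] or st.st_size < state["offset"]:
        state = {"inode": st.st_ino, "offset": 0, "mtime": None, "events": []}

    # The size check is an extra guard (an assumption of this sketch) for
    # file systems whose mtime resolution is too coarse to show a change.
    if st.st_mtime != state["mtime"] or st.st_size > state["offset"]:
        with open(log_path, "rb") as f:
            f.seek(state["offset"])
            chunk = f.read()

        # Consume complete lines only; a trailing partial line stays in the
        # file and is read again on the next call.
        end = chunk.rfind(b"\n") + 1
        text = chunk[:end].decode("utf-8", errors="replace")
        for line in text.splitlines():
            m = RESULT_RE.match(line)
            if m:
                state["events"].append([m.group(1), m.group(2)])

        state["offset"] += end
        state["mtime"] = st.st_mtime
        with open(state_path, "w") as f:
            json.dump(state, f)

    return {"event": state["events"]}
```

A CGI script could then print a Content-Type header followed by json.dumps(update(log_path, state_path)), with both paths chosen by you. The sketch does not guard against two requests updating the state file at the same time; if that can happen, some form of file locking would be needed.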
EDIT: @mob’s comment