I have a simple loop like:
# Core loop
chunk_size = 1000
while True:
    line_c = 0
    chunk_array = []
    while True:
        line = sys.stdin.readline()
        line_c += 1
        m = line_regex.match(line)
        if m:
            chunk_array.append(m.groupdict())
        if line_c >= chunk_size:
            # print(top_value(chunk_array, 'HTTP_HOST', 10))
            print(stats(chunk_array, 'HTTP_HAPROXY_TT'))
            break
The script is called as a Unix filter, for example:
tail -f /var/log/web/stackoverflow.log | python logFilter.py
Instead of printing every X lines, what would be a good way to refactor this loop to print every X seconds?
Reference:
Stats function:
def stats(l, value):
    '''stats of an integer field'''
    m = []
    for line in l:
        if line[value].isdigit():
            m.append(int(line[value]))
    return "Mean: %s Min: %s Max: %s StdDev: %s" % (mean(m), amin(m), amax(m), std(m))
The input is lines of a web log file; line_regex turns each line into field/value pairs (via groupdict). The output when using the stats function looks like:
tail -f /var/log/web/stackoverflow.log | python logFilter.py -f HTTP_HAPROXY_TR -t stats
Mean: 183.43919598 Min: 0 Max: 3437 StdDev: 321.673112066
Mean: 182.768304915 Min: 0 Max: 2256 StdDev: 255.039386654
Mean: 142.672064777 Min: 0 Max: 1919 StdDev: 208.870675922
So one of those stat lines is printed every time the script has received 1000 lines. Instead of doing it every X lines, I would like to change the loop so this happens every, say, 10 seconds.
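One possible shape for a time-based version is sketched here, as an illustration rather than a definitive answer. The helper name timed_chunks and its clock parameter are hypothetical and not part of the original script. The sketch assumes it is acceptable to check the clock only when a line arrives, so a chunk is emitted on the first line received after the interval has elapsed. If the log goes quiet, nothing is printed until the next line shows up.

```python
import time


def timed_chunks(lines, interval, clock=time.time):
    """Group items from `lines` into lists, one list per `interval` seconds.

    `timed_chunks` and `clock` are illustrative names, not part of the
    original script. The clock is only consulted when a new item arrives,
    so an idle input produces no output until the next item.
    """
    chunk = []
    deadline = clock() + interval
    for line in lines:
        chunk.append(line)
        if clock() >= deadline:
            yield chunk
            chunk = []
            deadline = clock() + interval
    if chunk:
        # Input ended (EOF): emit whatever is left over.
        yield chunk


# Hypothetical wiring into the loop from the question (line_regex and
# stats are the question's own names and are not defined in this sketch):
#
# for raw_lines in timed_chunks(iter(sys.stdin.readline, ''), 10):
#     matches = [line_regex.match(raw) for raw in raw_lines]
#     chunk_array = [m.groupdict() for m in matches if m]
#     print(stats(chunk_array, 'HTTP_HAPROXY_TT'))
```

The commented usage reads with iter(sys.stdin.readline, '') so that each line is handed over as soon as it arrives and the loop stops at EOF. Making the clock a parameter keeps the chunking logic testable without real waiting. If output is also needed while the input is idle, a different design (for example a reader thread plus a timer) would be required.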